Hugging Face’s SmolLM models bring powerful AI to your phone, no cloud required




Hugging Face today unveiled SmolLM, a new family of compact language models that outperform comparable offerings from Microsoft, Meta, and Alibaba’s Qwen. The models bring advanced AI capabilities to personal devices while preserving user privacy.

The SmolLM lineup features three sizes — 135 million, 360 million, and 1.7 billion parameters — designed to accommodate various computational resources. Despite their small footprint, these models have demonstrated superior results on benchmarks testing common sense reasoning and world knowledge.

Small but mighty: How SmolLM challenges AI industry giants

Loubna Ben Allal, lead ML engineer on SmolLM at Hugging Face, emphasized the efficacy of targeted, compact models in an interview with VentureBeat. “We don’t need big foundational models for every task, just like we don’t need a wrecking ball to drill a hole in a wall,” she said. “Small models designed for specific tasks can accomplish a lot.”

The smallest model, SmolLM-135M, outperforms Meta’s MobileLLM-125M despite being trained on fewer tokens. SmolLM-360M surpasses all models under 500 million parameters, including offerings from Meta and Qwen. The flagship SmolLM-1.7B beats Microsoft’s Phi-1.5, Meta’s MobileLLM-1.5B, and Qwen2-1.5B across multiple benchmarks.

A comparison of language model performance across various benchmarks. Hugging Face’s new SmolLM models, in bold, consistently outperform larger models from tech giants, demonstrating superior efficiency in tasks ranging from common sense reasoning to world knowledge. The table highlights the potential of compact AI models to rival or surpass their more resource-intensive counterparts. (Image Credit: Hugging Face)

Hugging Face distinguishes itself by making the entire development process open-source, from data curation to training steps. This transparency aligns with the company’s commitment to open-source values and reproducible research.

The secret sauce: High-quality data curation drives SmolLM’s success

The models owe their impressive performance to meticulously curated training data. SmolLM builds on the Cosmo-Corpus, which includes Cosmopedia v2 (synthetic textbooks and stories), Python-Edu (educational Python samples), and FineWeb-Edu (curated educational web content).
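
Readers who want to examine that training data can do so directly, since the corpora are published on the Hugging Face Hub. Below is a minimal sketch using the datasets library; the dataset ID HuggingFaceFW/fineweb-edu is an assumption, as the article names the corpora but not their Hub identifiers.

```python
# Minimal sketch: stream a few documents from FineWeb-Edu, one of the
# corpora behind SmolLM. The dataset ID is an assumption; the article
# names the corpus but not its Hub path.
from datasets import load_dataset

# Stream rather than download, since the full corpus is very large.
fineweb_edu = load_dataset(
    "HuggingFaceFW/fineweb-edu", split="train", streaming=True
)

for i, sample in enumerate(fineweb_edu):
    print(sample["text"][:200])  # preview the first 200 characters
    if i == 2:
        break
```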

“The performance we attained with SmolLM shows how crucial data quality is,” Ben Allal explained in an interview with VentureBeat. “We develop innovative approaches to meticulously curate high-quality data, using a mix of web and synthetic data, thus creating the best small models available.”

SmolLM’s release could significantly impact AI accessibility and privacy. These models can run on personal devices such as phones and laptops, eliminating the need for cloud computing and thereby reducing costs and privacy concerns.
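
As an illustration of what on-device use can look like, here is a minimal sketch that loads the smallest SmolLM checkpoint and generates text entirely on a laptop CPU with the transformers library; the Hub repository ID HuggingFaceTB/SmolLM-135M is an assumption, since the article names the model sizes but not their repository paths.

```python
# Minimal on-device sketch: run a SmolLM checkpoint locally, no cloud calls.
# The Hub repo ID is an assumption; the article names the model family
# (135M / 360M / 1.7B) but not its repository paths.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/SmolLM-135M"  # assumed ID; swap in 360M or 1.7B
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)  # loads on CPU by default

prompt = "Running language models locally means"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

On a typical laptop the 135M and 360M checkpoints run comfortably on CPU; the 1.7B model wants more memory but still requires no specialized hardware.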

Democratizing AI: SmolLM’s impact on accessibility and privacy

Ben Allal highlighted the accessibility aspect: “Being able to run small and performant models on phones and personal computers makes AI accessible to everyone. These models unlock new possibilities at no cost, with total privacy and a lower environmental footprint,” she told VentureBeat.

Leandro von Werra, Research Team Lead at Hugging Face, emphasized the practical implications of SmolLM in an interview with VentureBeat. “These compact models open up a world of possibilities for developers and end-users alike,” he said. “From personalized autocomplete features to parsing complex user requests, SmolLM enables custom AI applications without the need for expensive GPUs or cloud infrastructure. This is a significant step towards making AI more accessible and privacy-friendly for everyone.”

The development of powerful, efficient small-scale models like SmolLM represents a significant shift in AI. By making advanced AI capabilities more accessible and privacy-friendly, Hugging Face addresses growing concerns about AI’s environmental impact and data privacy.

With today’s release of SmolLM models, datasets, and training code, the global AI community and developers can now explore, improve, and build upon this innovative approach to language models. As Ben Allal said in her VentureBeat interview, “We hope others will improve this!”


