Large language models (LLMs) like OpenAI’s GPT-4 have gained acclaim for their versatility across various tasks, but they come with significant resource demands. In response, the AI industry is shifting focus towards smaller, task-specific models designed to be more efficient.
Microsoft, alongside other tech giants, is investing in these smaller models. Science often involves breaking complex systems down into their simplest forms to understand their behavior. This reductionist approach is now being applied to AI, with the goal of creating smaller models tailored for specific functions.
Sébastien Bubeck, Microsoft’s VP of generative AI, highlights this trend: “You have this miraculous object, but what exactly was needed for this miracle to happen; what are the basic ingredients that are necessary?”
In recent years, the proliferation of LLMs like ChatGPT, Gemini, and Claude has been remarkable. However, smaller language models (SLMs) are gaining traction as a more resource-efficient alternative. Despite their smaller size, SLMs promise substantial benefits to businesses.
Microsoft introduced Phi-1 in June 2023, a smaller model aimed at aiding Python coding. This was followed by Phi-2 and Phi-3, which, though larger than Phi-1, are still much smaller than leading LLMs. For comparison, Phi-3-medium has 14 billion parameters, while GPT-4 is estimated to have 1.76 trillion parameters, about 125 times more. Microsoft touts the Phi-3 models as “the most capable and cost-effective small language models available.”
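The size gap quoted above is simple division, and a quick check makes the scale concrete. The short sketch below uses only the two parameter counts cited in this article; the GPT-4 figure is an outside estimate rather than a confirmed number, and the function and constant names are illustrative.

```python
def parameter_ratio(large_params: float, small_params: float) -> float:
    """Return how many times larger one model is than another, by parameter count."""
    return large_params / small_params


# Figures quoted in the article; the GPT-4 count is an estimate, not an official number.
PHI3_MEDIUM_PARAMS = 14e9
GPT4_PARAMS_ESTIMATE = 1.76e12

ratio = parameter_ratio(GPT4_PARAMS_ESTIMATE, PHI3_MEDIUM_PARAMS)
print(f"GPT-4 (estimated) is roughly {ratio:.1f}x the size of Phi-3-medium")
```

Parameter count is only a rough proxy for capability and cost, so the ratio says how much bigger the model is, not how much better.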
Microsoft’s shift towards SLMs reflects a belief that the dominance of a few large models will give way to a more diverse ecosystem of smaller, specialized models. For instance, an SLM designed specifically for analyzing consumer behavior might be more effective for targeted advertising than a broad, general-purpose model trained on the entire internet.
SLMs excel in their focused training on specific domains. “The whole fine-tuning process … is highly specialized for specific use-cases,” explains Silvio Savarese, Chief Scientist at Salesforce, another company advancing SLMs.
To illustrate, using a specialized screwdriver for a home repair project is more practical than a multifunction tool that’s more expensive and less focused.
This trend towards SLMs reflects a broader shift in the AI industry from hype to practical application. As Brian Yamada of VML notes, “As we move into the operationalization phase of this AI era, small will be the new big.” Smaller, specialized models or combinations of models will address specific needs, saving time and resources.
Some voices express concern over the dominance of a few large models, with figures like Jack Dorsey advocating for a diverse marketplace of algorithms. Philippe Krakowsky of IPG also worries that relying on the same models might stifle creativity.
SLMs offer the advantage of lower costs, both in development and operation. Microsoft’s Bubeck emphasizes that SLMs are “several orders of magnitude cheaper” than larger models. Typically, SLMs operate with around three to four billion parameters, making them feasible for deployment on devices like smartphones.
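To see why a model of three to four billion parameters can plausibly fit on a phone, it helps to estimate how much memory the weights alone would occupy. The sketch below is a back-of-the-envelope calculation, not a measurement: the numeric precisions (32-bit, 16-bit, 8-bit, 4-bit) are assumptions that do not come from the article, and the estimate ignores activations, caches, and runtime overhead.

```python
# Assumed storage cost per parameter at common numeric precisions (illustrative only).
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Estimate memory for model weights only, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


# The article's "three to four billion parameters" range.
for num_params in (3e9, 4e9):
    for precision in BYTES_PER_PARAM:
        gb = weight_memory_gb(num_params, precision)
        print(f"{num_params / 1e9:.0f}B params @ {precision}: {gb:.1f} GB")
```

Under these assumptions, lower-precision weights shrink the footprint proportionally, which is one reason small models are discussed as candidates for on-device use.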
However, smaller models come with trade-offs. Fewer parameters mean reduced capabilities. “You have to find the right balance between the intelligence that you need versus the cost,” Bubeck acknowledges.
Salesforce’s Savarese views SLMs as a step towards a new form of AI, characterized by “agents” capable of performing specific tasks and executing plans autonomously. This vision of AI agents goes beyond today’s chatbots, which can generate travel itineraries but not take action on your behalf.
Salesforce recently introduced a 1-billion-parameter SLM that reportedly outperforms some LLMs on targeted tasks. Salesforce CEO Marc Benioff celebrated this advancement, proclaiming, “On-device agentic AI is here!”