Phi-3-Mini is the first in a family of small language models Microsoft plans to release over the coming weeks. Phi-3-Small and Phi-3-Medium are in the works. In contrast to large language models like OpenAI’s ChatGPT and Google’s Gemini, small language models are trained on much smaller datasets and are said to be much more affordable for users.

We are excited to introduce Phi-3, a family of open AI models developed by Microsoft. Phi-3 models are the most capable and cost-effective small language models (SLMs) available, outperforming models of the same size and next size up across a variety of language, reasoning, coding and math benchmarks.

Misha Bilenko, Corporate Vice President, Microsoft GenAI

What are they for? For one thing, the reduced size of this language model may make it suitable to run locally, for example as an app on a smartphone. Something the size of ChatGPT lives in the cloud and requires an internet connection for access.

While the model behind ChatGPT is said to have over a trillion parameters, Phi-3-Mini has only 3.8 billion. Sanjeev Bora, who works with genAI in the healthcare space, writes: “The number of parameters in a model usually dictates its size and complexity. Larger models with more parameters are generally more capable but come at the cost of increased computational requirements. The choice of size often depends on the specific problem being addressed.”
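To make the link between parameter count and size concrete, here is a back-of-the-envelope sketch in Python. It assumes the only cost is storing the weights and that each parameter takes a fixed number of bytes; the precision labels and the `weight_memory_gb` helper are illustrative assumptions, not anything Microsoft has published about how Phi-3-Mini is packaged.

```python
# Rough weight-storage estimate: parameters x bytes per parameter.
# Assumption: ignores activations, caches, and runtime overhead,
# so real memory use will be higher than these figures.

# Assumed storage cost per parameter at common numeric precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Return approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


PHI3_MINI_PARAMS = 3.8e9  # parameter count quoted in the article

for precision in BYTES_PER_PARAM:
    size = weight_memory_gb(PHI3_MINI_PARAMS, precision)
    print(f"{precision}: {size:.1f} GB")
```

Under these assumptions, 3.8 billion parameters come to roughly 7.6 GB at 16-bit precision and about 1.9 GB at 4-bit, which is why a model this size is a plausible candidate for a phone. A trillion-parameter model would need on the order of 2,000 GB at 16-bit precision by the same arithmetic.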

Phi-3-Mini was trained on a relatively small dataset of 3.3 trillion tokens, the words and word fragments a model reads, each represented as a number. But that’s still a lot of tokens.

Why we care. While it is generally reported, and confirmed by Microsoft, that these SLMs will be much more affordable than the big LLMs, it’s hard to find exact details on the pricing. Nevertheless, taking the promise at face value, one can imagine a democratization of genAI, making it available to very small businesses and sole proprietors.

We need to see what these models can do in practice, but it’s plausible that use cases like writing a marketing newsletter, coming up with email subject lines or drafting social media posts just don’t require the gigantic power of an LLM.

