Microsoft’s Small Language Model, Phi-4, is Now Available for Free


Published on January 8, 2025
In AI News

The company has made the small language model available on Hugging Face, and it supports ten Indian languages too!

Illustration by Nikhil Kumar

Microsoft has finally made its latest small language model, Phi-4, available on Hugging Face. The 14 billion-parameter model can now be downloaded, fine-tuned, and deployed for free.

Why does it matter? Microsoft’s Phi-4 is quite a small model, and yet it outperforms the Llama 3.3 70B (which is nearly five times bigger) and OpenAI’s GPT-4o Mini in several benchmarks. In math competition problems, Phi-4 outperformed Gemini 1.5 Pro and OpenAI’s GPT-4o.

Microsoft’s detailed technical paper discusses numerous techniques and the curation of some of the highest-quality datasets used to train the model. The model is said to excel at complex reasoning.

In an exclusive interview with AIM, Harkirat Behl, one of the creators of the model, said: “Big models are trained on all kinds of data and store information that may not be relevant.”

He added that with sufficient effort in curating high-quality data, it is possible to match the performance levels of these models – and perhaps even surpass them.

Interestingly, Microsoft has not experimented with inference optimisation for Phi-4; the focus has mainly been on synthetic data. Behl revealed that once the model architecture is released, developers will be able to optimise it further and quantise it to run locally on PCs and laptops.
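To see why quantisation matters for local use, here is a rough back-of-the-envelope sketch of how much memory the weights of a 14 billion-parameter model would need at different precisions. It is an illustration only: the function name and the chosen bit widths are assumptions, and the estimate counts weights alone, ignoring activations, caches, and runtime overhead.

```python
# Rough weight-memory estimate for a 14 billion-parameter model.
# Assumption: only the weights are counted; activations, caches and
# runtime overhead are ignored, so real usage will be higher.

PARAMS = 14_000_000_000  # parameter count quoted in the article


def weight_memory_gb(num_params: int, bits_per_param: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Illustrative precisions: 16-bit weights versus 8-bit and 4-bit quantisation.
for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(PARAMS, bits):.0f} GB")
```

By this estimate, going from 16-bit to 4-bit weights cuts the footprint to a quarter, which is the kind of reduction that brings a model of this size within reach of a well-equipped laptop.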

Alongside Meta, Microsoft is one of the big companies making significant strides in building open-weight models. Phi-4’s predecessor, Phi-3.5, was also made available for free on Hugging Face.

That said, neither Meta nor Microsoft stands at the top of the open-source model race; the China-based DeepSeek-V3 holds that position.

DeepSeek-V3 is a much larger model, with 671B parameters, and it outperformed Meta’s flagship Llama 3.1 405B model as well as many closed-source models. It is also three times faster than its predecessor, DeepSeek-V2.

Behl also said that Phi-4 supports ten Indian languages. “I personally made sure and worked hard to get Phi-4 to interpret ten most common Indian languages.”

Of course, the company is betting big on India. Yesterday, Microsoft CEO Satya Nadella was in Bangalore, India, for the company’s AI Tour. He announced Microsoft’s largest investment in India yet, a $3 billion commitment to expand Azure’s infrastructure in the country. Moreover, the company is set to train 10 million people in AI by 2030 as part of its ADVANTA(I)GE INDIA initiative.

Last week, Nadella also met Telangana Chief Minister A. Revanth Reddy in Hyderabad to discuss the state’s technology priorities, including AI, generative AI, and cloud development.

Supreeth Koundinya

Supreeth is an engineering graduate who is curious about the world of artificial intelligence and loves to write stories on how it is solving problems and shaping the future of humanity.
