Microsoft’s phi-4 is a Monstrous Small Model

  • Published on December 13, 2024
  • In AI News

It offers performance comparable to multiple leading large language models. 

Microsoft has launched its latest small model, phi-4, with 14 billion parameters. The model is said to ‘excel’ at complex reasoning. It is currently available on Azure AI Foundry and will be available on Hugging Face next week. Microsoft has also released a detailed technical report for phi-4.

phi-4 offers strong competition to leading small language models and also gives large frontier models a run for their money. Microsoft attributes its performance to the use of high-quality synthetic datasets and post-training innovations. In math competition problems, phi-4 outperformed Google’s Gemini 1.5 Pro and OpenAI’s GPT-4o.

Surprise #NeurIPS2024 drop for y'all: phi-4 available open weights and with amazing results!!!

Tl;dr: phi-4 is in Llama 3.3-70B category (win some lose some) with 5x fewer parameters, and notably outperforms on pure reasoning like GPQA (56%) and MATH (80%). pic.twitter.com/nGaOTmuKY3

— Sebastien Bubeck (@SebastienBubeck) December 13, 2024

“Despite minimal changes to the phi-3 architecture, phi-4 achieves strong performance relative to its size—especially on reasoning-focused benchmarks—due to improved data, training curriculum, and innovations in the post-training scheme,” read the technical report from Microsoft. 

Notably, phi-4 also performs in the same range as Meta’s newly released Llama 3.3 model. In fact, benchmarks show phi-4 ahead of Llama 3.3 in reasoning and math.

phi-4 is Microsoft’s successor to the phi-3.5 models that were released earlier this year. 

Microsoft’s announcement comes just days after Google launched its small model, Gemini 2.0 Flash. While Microsoft hasn’t officially compared phi-4 with Gemini 2.0 Flash, Google’s model scored 62.1% on the GPQA reasoning benchmark, compared with phi-4’s 56.1%.

Google is also going toe-to-toe with Microsoft with its latest Project Mariner, which not only rivals Microsoft’s Copilot Vision but goes a step further. Unlike Copilot Vision, Project Mariner can autonomously navigate a web browser tab.

phi-4 will also compete with Anthropic’s Claude 3.5 Haiku, which was made available via the web and mobile app for all users yesterday. phi-4 outperforms Claude 3.5 Haiku on several benchmarks.

Small models may finally deliver on their promise. It is about time we saw them on more devices, letting users run AI models locally.

Supreeth Koundinya

Supreeth is an engineering graduate who is curious about the world of artificial intelligence and loves to write stories on how it is solving problems and shaping the future of humanity.
