- Published on March 27, 2025
Alibaba's Qwen2.5-Omni model has shown strong performance across various tasks, including speech recognition, audio, video, and more.

On Wednesday, Alibaba added voice and video chat capabilities to Qwen Chat and released the model that makes them possible, Qwen2.5-Omni-7B, as open source under the Apache 2.0 licence.
In a blog post, the company described Qwen2.5-Omni as the new flagship end-to-end multimodal model in the Qwen series. It said the model is designed for multimodal perception and seamlessly processes text, images, audio, and video, delivering real-time streaming responses as both text and synthesised speech.
The model's key features include a ‘Thinker-Talker’ architecture, which allows it to respond in real time. The Thinker is a Transformer decoder that acts like the brain, while the Talker, designed as a dual-track autoregressive Transformer decoder, operates like the human mouth.
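To make the division of labour concrete, here is a toy Python sketch of a Thinker-Talker style pipeline. It is not Alibaba's implementation: the function names, the fake audio units, and the idea that the Talker simply consumes the Thinker's output one token at a time are all assumptions made for illustration. It only shows why splitting "brain" and "mouth" lets text and speech stream out together.

```python
from typing import Iterator, List, Tuple


def thinker(prompt: str) -> Iterator[str]:
    """Toy 'brain' (hypothetical): yields the text reply one token at a time."""
    for word in f"You said: {prompt}".split():
        yield word


def talker(tokens: Iterator[str]) -> Iterator[Tuple[str, str]]:
    """Toy 'mouth' (hypothetical): turns each token into a speech unit as it arrives."""
    for token in tokens:
        # A real Talker would decode audio tokens autoregressively;
        # this placeholder string only stands in for that output.
        audio_unit = f"<audio:{token.lower()}>"
        yield token, audio_unit


def respond(prompt: str) -> List[Tuple[str, str]]:
    """Chain the two stages so speech is produced without waiting for the full text."""
    return list(talker(thinker(prompt)))


if __name__ == "__main__":
    for text, audio in respond("hello world"):
        print(text, audio)
```

Because both stages are generators, each speech unit is produced as soon as its text token exists, which is the streaming behaviour the architecture is meant to enable.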
Alibaba’s Qwen2.5-Omni model has shown strong performance across various tasks, including speech recognition, translation, audio and video understanding, and speech generation, outperforming similar models at tasks that require multiple modalities.
Alibaba compared it with single-modality and closed-source models such as Qwen2.5-VL-7B, Qwen2-Audio, and Gemini-1.5-pro, and said it achieved state-of-the-art performance.
The paper and code for the new model can be found on GitHub, while the AI model is available on Hugging Face along with a demo.
Last month, Alibaba also launched QwQ-Max-Preview, a new AI reasoning model within the Qwen family that specialises in mathematics and coding tasks and features a “thinking” capability in the Qwen Chat application.
The model outperformed OpenAI’s models on the LiveCodeBench leaderboard. Alibaba is expected to open-source smaller variants for local device deployment and to release a dedicated mobile app.
There may be a lot more coming, considering Alibaba’s commitment to investing over $52 billion in AI over the next three years.
Ankush Das