Alibaba Releases Qwen2.5-Omni, Adds Voice and Video Modes to Qwen Chat

Published on March 27, 2025

Alibaba's Qwen2.5-Omni model has shown strong performance across tasks including speech recognition, audio and video understanding, and speech generation.

On Wednesday, Alibaba added voice and video chat capabilities to Qwen Chat and released Qwen2.5-Omni-7B, the new model that makes them possible. The model is open source and available under the Apache 2.0 licence.

In a blog post, the company described Qwen2.5-Omni as the new flagship end-to-end multimodal model in the Qwen series. The model is designed for multimodal perception: it processes text, images, audio, and video, and delivers real-time streaming responses as both text and synthesised speech.

A key feature of the model is its ‘Thinker-Talker’ architecture, which allows it to respond in real time. The Thinker, a Transformer decoder, acts like the brain, while the Talker, designed as a dual-track autoregressive Transformer decoder, operates like the human mouth.
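To make the division of labour concrete, here is a toy Python sketch of a Thinker-Talker style pipeline. It is an illustration only, not Alibaba's implementation: the names `ThinkerStep`, `thinker`, `talker`, and `respond` are hypothetical, the "hidden state" is a made-up number, and the assumption that the Talker consumes both the Thinker's text tokens and its hidden states is ours rather than something stated in this article. The point it shows is how speech output can begin streaming before the text reply is finished.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Tuple


@dataclass
class ThinkerStep:
    """One decoding step from the toy Thinker: a text token plus a fake hidden state."""
    text_token: str
    hidden: List[float]


def thinker(prompt: str) -> Iterator[ThinkerStep]:
    """Toy stand-in for the Thinker: yields a canned reply one token at a time.

    A real Thinker would be a Transformer decoder; here the 'hidden state'
    is just the token length, purely for illustration.
    """
    reply = f"You said: {prompt}"
    for word in reply.split():
        yield ThinkerStep(text_token=word, hidden=[float(len(word))])


def talker(steps: Iterable[ThinkerStep]) -> Iterator[str]:
    """Toy stand-in for the Talker: emits one placeholder speech token per step.

    It consumes steps lazily, so audio for the first word is produced
    before the Thinker has generated the last word.
    """
    for step in steps:
        yield f"<audio:{step.text_token.lower()}:{int(step.hidden[0])}>"


def respond(prompt: str) -> Tuple[str, List[str]]:
    """Run both stages together and collect the text and speech outputs."""
    text: List[str] = []
    speech: List[str] = []

    def shared_steps() -> Iterator[ThinkerStep]:
        # Record the text track while passing each step on to the Talker.
        for step in thinker(prompt):
            text.append(step.text_token)
            yield step

    for audio_token in talker(shared_steps()):
        speech.append(audio_token)
    return " ".join(text), speech


if __name__ == "__main__":
    reply_text, reply_speech = respond("hello there")
    print(reply_text)
    print(reply_speech)
```

Because both stages are generators, each speech token is produced as soon as the corresponding Thinker step exists, which is the streaming behaviour the architecture is meant to provide.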

Alibaba’s Qwen2.5-Omni model has shown strong performance across various tasks, including speech recognition, translation, audio and video understanding, and speech generation, outperforming similar models at tasks that require multiple modalities.

According to the company, the model was benchmarked against comparable single-modality models such as Qwen2.5-VL-7B and Qwen2-Audio, as well as closed-source models such as Gemini-1.5-pro, and achieved state-of-the-art performance.

The paper and code for the new model can be found on GitHub, while the AI model is available on Hugging Face along with a demo.

Last month, Alibaba also launched QwQ-Max-Preview, a new AI reasoning model within the Qwen family that specialises in mathematics and coding tasks and features a “thinking” capability in the Qwen Chat application.

That model outperformed OpenAI’s models on the LiveCodeBench leaderboard. Alibaba is expected to open-source smaller variants of it for local device deployment and to release a dedicated mobile app.

There may be a lot more coming, considering Alibaba’s commitment to investing over $52 billion in AI over the next three years.