Tencent Launches Hunyuan Large, Outperforms Llama 3.1 70B & 405B


Last updated November 5, 2024
In AI News

Hunyuan-Large is China’s competitor to Meta’s Llama. It outperforms Llama 3.1 70B and is on par with the flagship Llama 3.1 405B.

Chinese giant Tencent has just released a large open-source model called Hunyuan Large, with 389 billion total parameters, of which 52 billion are active for any given input. The model supports a context length of 256,000 tokens and is one of the largest open-source models in its category. In comparison, both the Llama 3.1 70B and 405B models support a context length of 128,000 tokens.

If you’re competing in the open-source arena, you’ve got to dethrone the king. Interestingly, Hunyuan-Large outperforms the Llama 3.1 70B model on several benchmarks in English and Chinese. The model’s performance was also comparable with Meta’s flagship Llama 3.1 405B model on tasks involving language understanding, coding, maths and logical reasoning.

Unlike Llama 3.1 405B, Hunyuan Large isn’t a ‘dense’ model, meaning it doesn’t use all of its parameters for each input. Instead, it is a Mixture of Experts (MoE) model, which activates only a subset of its parameters based on the input. This makes it more efficient, as it uses only part of the model’s capacity each time. Tencent also explored MoE scaling laws to guide an optimal balance between model size, data volume, and performance.
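To make the idea of activating only a subset of parameters concrete, here is a minimal sketch of top-k expert routing in plain Python. It is a toy illustration, not Tencent's router: the scalar "experts", the router scores, and the function names (`route_top_k`, `moe_forward`) are all hypothetical. The only figures taken from the article are the 52 billion active and 389 billion total parameters.

```python
import math

def softmax(scores):
    """Turn raw router scores into probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are renormalised so that the selected experts sum to 1.
    """
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, router_scores, k=1):
    """Combine the outputs of the selected experts only.

    Experts that are not selected are never called, which is where
    the compute saving of an MoE layer comes from.
    """
    out = 0.0
    for idx, weight in route_top_k(router_scores, k):
        out += weight * experts[idx](x)
    return out

# Hypothetical toy experts: each one just scales its input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]

# Hypothetical router scores for one input token.
scores = [0.1, 2.0, -1.0, 0.5]

selected = route_top_k(scores, k=2)
y = moe_forward(3.0, experts, scores, k=2)

# Share of parameters active per input, using the article's figures.
active_fraction = 52 / 389

print("selected experts:", [i for i, _ in selected])
print("output:", y)
print("active parameter share: {:.1%}".format(active_fraction))
```

In this sketch only two of the four experts run for the given input. By the article's numbers, Hunyuan Large similarly touches roughly an eighth of its parameters per input, although its actual routing scheme may differ from the one shown here.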

Hunyuan Large incorporates several ‘innovative’ techniques to outperform its competition. These include 1.5 trillion tokens of higher-quality synthetic data, which form part of the 7 trillion tokens the model was trained on. The model also incorporates various model structure enhancement techniques to reduce memory usage, increase performance, and balance token usage.

Tencent compared Hunyuan Large against leading open-source models in both the pre- and post-training stages. In most results, Hunyuan Large came out on top in comparison with other dense and MoE models of similar parameter sizes. The authors mentioned, “We also hope that the release of the largest and overall best performing MoE-based Hunyuan-Large could spark more ripple of debate about more promising techniques of LLMs among the community, in turn, to further improve our model from a more practical aspect and contribute to the more helpful AGI in the future.”

Tencent’s latest announcement comes after the news that China has adopted Meta’s open-source models for building a chatbot for military applications. This sparked a debate between Vinod Khosla and Yann LeCun, with the former criticising Meta for providing easy access to LLMs. LeCun countered that China is quite competitive with the United States in generative AI and wouldn’t depend entirely on Meta’s open-source models to develop any consequential technology. With the release of Hunyuan Large, Yann LeCun may just be right.

Interestingly, Meta has also announced that it is making Llama available to the US government and to private organisations working in the interests of national security.

Supreeth Koundinya

Supreeth is an engineering graduate who is curious about the world of artificial intelligence and loves to write stories on how it is solving problems and shaping the future of humanity.

8th Nov 2024