Bengaluru’s Sarvam AI Launches Mayura to Improve English-to-Indic Language Translation

  • Last updated September 7, 2024

Mayura is now available as an API, allowing developers and businesses to integrate it into their applications.

Bengaluru-based startup Sarvam AI has launched a new translation model called Mayura to address long-standing challenges in English-to-Indic language translation, particularly in everyday, colloquial communication.

The model focuses on real-world language patterns such as code-mixing and colloquial expressions, aiming to deliver translations that are more relatable and accurate for multilingual Indian speakers. It supports translations between 10 major Indian languages: Hindi, Tamil, Telugu, Malayalam, Punjabi, Odia, Gujarati, Marathi, Kannada, and Bengali.

Conventional translation models have struggled to capture the nuances of how Indians actually communicate, often mixing regional languages with English. Sarvam Translate takes a different approach: the model is trained on diverse real-world data so that regional dialects, slang, and even code-mixed phrases are preserved in translations.

The launch is expected to make digital content, social media, and e-commerce services more accessible to millions across India.

The model also tackles gender-specific language in Indic languages, a critical area where traditional models fall short. By including a “gender toggle” for first-person translations, Sarvam Translate provides appropriate gender representation in conversations, both for AI-powered voice chatbots and human agents.

The model is not limited to casual conversation. It is also designed for domain-specific translation of legal, scientific, and technical documents, rendering complex jargon in simple, accessible Indic language. Sarvam AI’s dual-stream architecture preserves formatting elements in technical content, making it especially useful for educational and government communications.

Mayura is now available as an API that developers and businesses can integrate into their applications. It joins the rest of Sarvam AI’s product suite, which includes Sarvam Agents, Sarvam 2B (an open-source language model), and Shuka 1.0 (an audio language model).
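
For developers wondering what such an integration might look like, here is a minimal Python sketch that builds a translation request with the gender toggle and a code-mixed option. The article does not give the endpoint URL, the header name, the JSON field names, or the language codes, so everything of that kind below is an assumption for illustration only and should be checked against Sarvam AI's API documentation before use.

```python
import json
import urllib.request

# ASSUMPTION: the endpoint, header name, and JSON field names are
# illustrative and not confirmed by the article.
API_URL = "https://api.sarvam.ai/translate"

# The ten languages named in the article, mapped to assumed language codes.
TARGET_LANGUAGE_CODES = {
    "Hindi": "hi-IN",
    "Tamil": "ta-IN",
    "Telugu": "te-IN",
    "Malayalam": "ml-IN",
    "Punjabi": "pa-IN",
    "Odia": "od-IN",
    "Gujarati": "gu-IN",
    "Marathi": "mr-IN",
    "Kannada": "kn-IN",
    "Bengali": "bn-IN",
}


def build_translation_request(text, target_language,
                              speaker_gender="Female", code_mixed=False):
    """Build the JSON payload for an English-to-Indic translation request."""
    if target_language not in TARGET_LANGUAGE_CODES:
        raise ValueError(f"Unsupported target language: {target_language}")
    if speaker_gender not in ("Male", "Female"):
        raise ValueError("speaker_gender must be 'Male' or 'Female'")
    return {
        "input": text,
        "source_language_code": "en-IN",
        "target_language_code": TARGET_LANGUAGE_CODES[target_language],
        # The "gender toggle" for first-person sentences (assumed field name).
        "speaker_gender": speaker_gender,
        # Colloquial, code-mixed output versus formal output (assumed values).
        "mode": "code-mixed" if code_mixed else "formal",
    }


def translate(payload, api_key):
    """POST the payload and return the decoded JSON response as a dict."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "api-subscription-key": api_key,  # assumed header name
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

A caller would first build the payload, for example build_translation_request("I am going to the market.", "Hindi", speaker_gender="Male", code_mixed=True), and then pass it to translate() along with an API key. Keeping payload construction separate from the network call makes the request easy to inspect and test offline.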

Siddharth Jindal

Siddharth is a media graduate who loves to explore tech through journalism and to put forward ideas worth pondering in the era of artificial intelligence.
