- Last updated November 26, 2024
- In AI News
"For instance, Fugatto can make a trumpet bark or a saxophone meow. Whatever users can describe, the model can create."

NVIDIA has announced a new text-to-audio model called Fugatto, which can create ‘any combination of music, voices and sounds’. Fugatto, which stands for ‘Foundational Generative Audio Transformer Opus,’ generates audio that mixes music, voices, and sounds as described by the prompt.
“For example, it can create a music snippet based on a text prompt, remove or add instruments from an existing song, change the accent or emotion in a voice — even let people produce sounds never heard before,” said NVIDIA, in the announcement.
NVIDIA also states that Fugatto is the first foundational generative AI model to show ‘emergent properties,’ meaning capabilities that arise from the interaction of its various trained abilities.
“During inference, the model uses a technique called ComposableART to combine instructions that were only seen separately during training. For example, a combination of prompts could ask for text spoken with a sad feeling in a French accent,” said NVIDIA in the announcement.
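To make the idea of combining instructions at inference time more concrete, here is a minimal sketch of composing several instruction-conditioned predictions in the style of classifier-free guidance. It is an illustration only, not NVIDIA's implementation: the function name `compose_guidance`, the weights, and the toy numbers are all hypothetical, and real model outputs would be large tensors rather than short lists.

```python
from typing import List, Sequence


def compose_guidance(
    unconditional: Sequence[float],
    conditionals: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> List[float]:
    """Combine several instruction-conditioned predictions into one.

    Each conditional prediction contributes its offset from the
    unconditional prediction, scaled by its own weight (hypothetical scheme).
    """
    if len(conditionals) != len(weights):
        raise ValueError("need exactly one weight per conditional prediction")

    combined = list(unconditional)
    for prediction, weight in zip(conditionals, weights):
        if len(prediction) != len(unconditional):
            raise ValueError("all predictions must have the same length")
        for i, (cond, uncond) in enumerate(zip(prediction, unconditional)):
            combined[i] += weight * (cond - uncond)
    return combined


# Toy values standing in for model outputs (assumed, for illustration only).
no_instruction = [0.0, 0.0]
sad_feeling = [1.0, 0.0]    # prediction conditioned on "sad feeling"
french_accent = [0.0, 2.0]  # prediction conditioned on "French accent"

mixed = compose_guidance(no_instruction, [sad_feeling, french_accent], [1.0, 0.5])
print(mixed)
```

In this sketch, raising or lowering a weight changes how strongly the corresponding instruction steers the result, which is one way two instructions seen separately could be blended in a single output.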
NVIDIA is positioning the tool for audio producers, video game developers, language learning tools, and ad agencies.
Fugatto has 2.5 billion parameters and was trained on NVIDIA DGX systems with 32 NVIDIA H100 GPUs. While NVIDIA hasn’t released the model for use, it has released a demo on GitHub and a paper.
The GitHub demo dives deeper into the tool’s capabilities. It showcases examples of sound mixtures like ‘violin melody and baby laugh’, a synthesis of ‘a cello shouting with anger and a cello screaming’, and other amusing examples, like making a ‘trumpet bark’ or a ‘saxophone meow’.
The paper released by NVIDIA also lists the open-source vocal and non-vocal datasets it draws on, including a few from the BBC and Epidemic Sound.
“We work towards a future where unsupervised multitask learning in audio synthesis and transformation emerges from data and model scale. Our proposed framework ComposableART Fugatto establishes our first step towards this direction,” mentioned the authors in the paper.
When Fugatto is eventually released, it will compete with ElevenLabs and SunoAI. That said, none of the existing tools can generate sounds outside their training data.
Supreeth Koundinya
Supreeth is an engineering graduate who is curious about the world of artificial intelligence and loves to write stories on how it is solving problems and shaping the future of humanity.
© Analytics India Magazine Pvt Ltd & AIM Media House LLC 2024