Google DeepMind Unveils Inference Time Scaling for Diffusion Models

  • Published on January 17, 2025
  • In AI News

‘Inference-time scaling for LLMs drastically improves the model’s ability in many respects, but what about diffusion models?’

Google DeepMind, the AI research arm of Google, in collaboration with the Massachusetts Institute of Technology (MIT) and New York University (NYU), has published a new study that introduces inference time scaling for diffusion models. 

The research titled ‘Inference-Time Scaling for Diffusion Models Beyond Scaling Denoising Steps’ explores the impact of providing additional computing resources to image generation models while they generate results. 

Diffusion models begin the generation process from ‘pure noise’ and require multiple denoising steps to produce a clean output conditioned on the input. “In this work, we explore the inference-time scaling behaviour of diffusion models beyond increasing denoising steps and investigate how the generation performance can further improve with increased computation,” the authors said. 
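The classic inference-time knob for diffusion models is simply the number of denoising steps. A minimal toy sketch of that idea (the `sample` function and its 20%-per-step noise reduction are hypothetical stand-ins, not the paper's model):

```python
import math
import random

def sample(num_steps, dim=16, seed=0):
    """Toy 'diffusion' sampler: start from pure Gaussian noise and apply
    num_steps denoising updates. Each update here is a hypothetical
    stand-in that shrinks the remaining noise by a fixed fraction;
    a real model would predict and subtract the noise at each step."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for _ in range(num_steps):
        x = [0.8 * xi for xi in x]  # pretend each pass removes 20% of the noise
    return x

def residual_noise(x):
    """L2 norm of the sample: a crude proxy for how much noise remains."""
    return math.sqrt(sum(xi * xi for xi in x))

# More steps -> less residual noise, but with diminishing returns,
# which is why the paper looks for scaling axes beyond step count.
print(residual_noise(sample(4)), residual_noise(sample(16)))
```

The diminishing returns from adding steps are precisely what motivates the paper's search over other axes of inference-time compute.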

The research found that increasing inference-time compute leads to ‘substantial improvements’ in the quality of the generated samples. The detailed technical report covers the components and techniques used.

One of the researchers, Nanye Ma, said the improvements appeared when searching for better starting noise. “This suggests pushing the inference-time scaling limit by investing compute in searching for better noises,” he said on X.

“Our search framework consists of two components: verifiers to provide feedback and algorithms to find better noise candidates,” he added. 
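The simplest instance of such a framework is a best-of-N search: sample several candidate starting noises, denoise each, and keep the one the verifier rates highest. A hedged sketch of that idea (the `generate` and `verifier` functions below are hypothetical stand-ins; the paper plugs in real diffusion models and learned verifiers such as CLIP-based scorers):

```python
import math
import random

def generate(noise):
    # Stand-in for a diffusion model's full denoising loop (hypothetical):
    # here we just squash the noise; a real model runs reverse diffusion.
    return [math.tanh(x) for x in noise]

def verifier(sample):
    # Stand-in scorer (hypothetical). The framework described in the paper
    # plugs in real verifiers that give feedback on each generated sample.
    mean = sum(sample) / len(sample)
    return -abs(mean)

def best_of_n_search(n_candidates, dim, rng):
    """Best-of-N over starting noises: sample N candidate noises, denoise
    each, and keep the one the verifier scores highest."""
    best_noise, best_score = None, float("-inf")
    for _ in range(n_candidates):
        noise = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        score = verifier(generate(noise))
        if score > best_score:
            best_noise, best_score = noise, score
    return best_noise, best_score

rng = random.Random(0)
noise, score = best_of_n_search(n_candidates=16, dim=64, rng=rng)
print(f"best verifier score over 16 candidates: {score:.4f}")
```

Raising `n_candidates` is exactly the inference-time compute dial: more candidates cost more forward passes but can only improve the verifier score of the chosen sample.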

The research compared the effectiveness of inference-time search methods across different models and showed that small models with search can outperform larger ones without search.

“These results indicate that substantial training costs can be partially offset by modest inference-time compute, enabling higher-quality samples more efficiently,” said Ma. 

Inference-time compute scaling is a technique widely used in large language models, most notably in OpenAI’s o1 reasoning model.

“By allocating more compute during inference, often through sophisticated search processes, these works show that LLMs can produce higher-quality and more contextually appropriate responses,” said the authors of the paper, indicating their motivation to apply these techniques to diffusion models. 

As demonstrated by Google DeepMind and others, this seems to hold true for diffusion models as well. Saining Xie, one of the authors, said that he was blown away by diffusion models’ natural ability to scale during inference. “You train them with fixed flops, but during test time, you can ramp it up by [around] 1,000 times,” he said on X.

While the research focuses mostly on image generation tasks and evaluates them on text-to-image benchmarks, it will be hard for OpenAI to beat Google if these techniques extend to video generation as well. Google’s Veo 2 model already outperforms OpenAI’s Sora in both quality and prompt adherence. 

Supreeth Koundinya

Supreeth is an engineering graduate who is curious about the world of artificial intelligence and loves to write stories on how it is solving problems and shaping the future of humanity.
