After OpenAI’s Sora, Google’s Veo is a world builder.

Illustration by Nalini Nirad
Just last week, with the launch of Sora, OpenAI CEO Sam Altman declared that video is key to achieving AGI. Now, Google has struck back with Veo 2 and Imagen 3, its latest generative AI models. Although the models are not publicly available yet, product lead Logan Kilpatrick revealed that they will be available through the API by early next year.
Veo 2 handles complex elements like reflections and shadows, producing clearer, sharper footage. It also includes SynthID watermarking for safety.
Google’s internal testing indicates that Veo outperforms competitors (such as China’s Kling, Meta’s Movie Gen, and OpenAI’s Sora) in both quality and prompt adherence.
Justine Moore, a partner at a16z, was among the early testers and noted that the model excelled at creating nature and animal-related clips, as well as capturing detailed movement.
Veo 2 builds on its predecessor, which was first showcased at Google I/O in May and has since been integrated into YouTube and Google Cloud.
Google’s Veo 2 is presented as being more advanced in terms of cinematic understanding. “Veo 2 delivers lifelike visuals with enhanced realism, reducing artifacts and improving detail. Its motion simulation accurately replicates simple and complex movements using physics,” said Google DeepMind’s Tom Hume on X.
“It’s not perfect, but it’s a significant improvement over current state-of-the-art models, as our benchmarks demonstrate,” said Shlomi Fruchter, Veo co-lead at Google DeepMind, hinting that the model still struggles with complex physics.
Comparing Veo 2 with other models, Wharton’s Ethan Mollick said, “Sora offers a lot more control options and longer clips, so it is hard to compare, but I will say that I think the dominance of the Chinese models is over.” Interestingly, according to the blog, Google claims Kling is its biggest competitor.
Does Veo Pass the Physics Test?
Google’s access to YouTube gives it a clear advantage over OpenAI in training these models to respect the laws of physics.
Veo 2’s true test lies in generating a gymnast’s routine, showcasing its improved grasp of human movement while accurately modelling complex motions. In a viral tweet shared by VC Deedy Das, Sora failed on this prompt.
Veo 2 supports 4K resolution and can produce videos longer than two minutes, although it’s currently restricted to 720p and eight seconds on its experimental platform. Notably, it outperforms Sora with four times the resolution and six times the video duration.
This release follows another significant development by DeepMind in the GenAI space: the launch of Genie 2, a foundation world model capable of generating interactive 3D environments from simple text prompts.
World models like Genie 2 provide a vast and diverse set of environments that are critical for the training of embodied AI agents. These environments act as test beds for agents, enabling them to generalise across various domains and prepare them for more complex, real-world tasks. This research accelerates the pace of DeepMind’s AGI vision.
This solidifies 2025 as the year of advanced world models, with Google at its helm.
Road to AGI
Google’s 2014 acquisition of DeepMind for $400-650 million is touted as one of the smartest business decisions in history. To this, Tesla chief Elon Musk quips: “You have that backwards. DeepMind acquired Google,” highlighting how AI is essential to Google’s relevance today, especially in the race to AGI.
In a previous interview with AIM, AI sceptic Gary Marcus said that DeepMind is likely on a better path towards AGI compared to its competitors.
Google’s announcements this month have played a key role in challenging OpenAI during its 12-day ‘shipmas.’ However, as more companies introduce comparable capabilities, often at lower price points, OpenAI’s $200-a-month pricing might face increasing scrutiny.
There is no stopping Google: the pace at which the tech giant is shipping is reminiscent of its early days as a startup. Recent releases include Gemini 2, Willow, and GenCast, along with updates to NotebookLM, to name a few.
Aditi Suresh
Aditi is a political science graduate, and is interested in technology, AI, social media, and online culture.