Google Launches Gemini 2.0, Making the Age of AI Agents a Reality

  • Published on December 11, 2024
  • In AI News

“2025 will be the year of AI agents and Gemini 2.0 will be the generation of models that underpin our agent-based work.”

As predicted by AIM, Google has finally launched Gemini 2.0, its next-generation AI model, built to redefine multimodal capabilities and introduce agentic functionalities. 

“Today we’re excited to launch our next era of models built for this new agentic era: introducing Gemini 2.0, our most capable model yet. With new advances in multimodality — like native image and audio output — and native tool use, it will enable us to build new AI agents that bring us closer to our vision of a universal assistant,” Google said in a blog post.

“This is really just the beginning. 2025 will be the year of AI agents and Gemini 2.0 will be the generation of models that underpin our agent-based work,” said Google DeepMind chief Demis Hassabis.

Gemini 2.0 Flash supports multimodal inputs, including images, video, and audio, as well as multimodal outputs such as natively generated images combined with text and steerable text-to-speech (TTS) multilingual audio. It can also natively call tools like Google Search, execute code, and integrate third-party user-defined functions.

The Gemini 2.0 Flash model offers faster response times and outperforms its predecessors on major benchmarks. Developers can access Gemini 2.0 Flash through Google AI Studio and Vertex AI, with general availability expected by January 2025. 
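For developers, access follows the same pattern as earlier Gemini releases. Below is a minimal sketch of a text call, assuming the `google-generativeai` Python SDK (`pip install google-generativeai`) and an API key from Google AI Studio; the model identifier `gemini-2.0-flash-exp` is the experimental name used at launch and may change once the model reaches general availability.

```python
# Minimal sketch: calling Gemini 2.0 Flash via the google-generativeai SDK.
# Assumes an API key from Google AI Studio in the GOOGLE_API_KEY env var.
import os


def build_request(prompt: str, model_name: str = "gemini-2.0-flash-exp") -> dict:
    # Assemble the payload shape the API expects: one user turn with a text part.
    return {
        "model": model_name,
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
    }


# The live call runs only when a key is configured, so the sketch is inert otherwise.
if os.environ.get("GOOGLE_API_KEY"):
    import google.generativeai as genai

    genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
    model = genai.GenerativeModel("gemini-2.0-flash-exp")
    print(model.generate_content("Summarise Gemini 2.0 in one sentence.").text)
```

The same model is reachable through Vertex AI for teams already on Google Cloud; only the authentication and client setup differ.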

Google has also launched the Multimodal Live API, bringing real-time audio and video input capabilities that allow developers to create dynamic, interactive applications.

Project Astra and AI Agents 

Introduced at Google I/O 2024, Project Astra, Google’s universal AI assistant, has received several updates. It now supports multilingual and mixed-language conversations, with an improved understanding of accents and uncommon words. 

Powered by Gemini 2.0, Project Astra can also utilise Google Search, Lens, and Maps, making it a more practical assistant for daily tasks. Its memory has been enhanced, allowing up to 10 minutes of in-session recall and better personalisation through past interactions. Additionally, improved streaming and native audio processing reduce latency, enabling near-human conversational speeds.

Google has also announced an early-stage research prototype, Project Mariner, which can understand and reason about information on the screen as a user navigates the web in a browser. 

Google says the agent uses the information it sees on screen, via a Google Chrome extension, to complete related tasks. It can read on-screen content such as text, code, images and forms, and can even follow voice-based instructions. 

“‘Book a flight from SF to Berlin, departing on March 5 and returning on the 12.’ The era of being able to give a computer a fairly complex high-level task and have it go off and do a lot of the work for you is becoming a reality,” said Jeff Dean, chief scientist at Google DeepMind. 

Google has also introduced Jules, a developer-focused agent that integrates with GitHub workflows to assist with coding tasks under supervision.

Google DeepMind is working on AI agents that improve video games and navigate 3D worlds. It has partnered with game developers like Supercell to explore the future of AI-powered gaming companions. Gemini 2.0’s spatial reasoning is also being tested in robotics for practical real-world use. 

Notably, it recently launched Genie 2, a large-scale foundation world model capable of generating a wide variety of playable 3D environments. 


Siddharth Jindal

Siddharth is a media graduate who loves to explore tech through journalism and putting forward ideas worth pondering about in the era of artificial intelligence.
