- Last updated September 5, 2024
- In AI News
Developers and businesses can use LLaVA v1.5 7B in Preview Mode on GroqCloud.

Illustration by Raghavendra Rao
Groq has introduced LLaVA v1.5 7B, a new visual model now available on its Developer Console. This launch makes GroqCloud multimodal and broadens its support to include image, audio, and text modalities.
🚨 New *multi-modal* model dropped on @Groqinc! Llava v1.5 7b is a visual model that can take images as input.
⚡️Try it now in API or console as “llava-v1.5-7b-4096-preview”!
Developers can now build applications on Groq with all three modalities: image, audio, and text!
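The model is served through Groq's OpenAI-compatible chat completions API under the ID `llava-v1.5-7b-4096-preview`. A minimal sketch of a vision request, assuming the `groq` Python SDK is installed and a `GROQ_API_KEY` environment variable is set (the prompt and image URL are placeholders):

```python
import os

MODEL_ID = "llava-v1.5-7b-4096-preview"  # preview model ID from Groq's announcement

def build_vision_message(prompt: str, image_url: str) -> dict:
    """Build a single user message mixing text and an image, using the
    OpenAI-compatible content-parts format that Groq's API accepts."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }

if __name__ == "__main__":
    # Assumes the `groq` SDK is installed and GROQ_API_KEY is set.
    from groq import Groq

    client = Groq(api_key=os.environ["GROQ_API_KEY"])
    completion = client.chat.completions.create(
        model=MODEL_ID,
        messages=[
            build_vision_message(
                "Describe this image in one sentence.",
                "https://example.com/photo.jpg",  # placeholder image URL
            )
        ],
    )
    print(completion.choices[0].message.content)
```

The same request works over raw HTTPS against Groq's OpenAI-compatible endpoint, so existing OpenAI client code can typically be pointed at GroqCloud by swapping the base URL and model ID.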
LLaVA, short for Large Language and Vision Assistant, combines language and vision capabilities. It builds on OpenAI’s CLIP and Meta’s Llama 2 7B model, utilising visual instruction tuning to enhance image-based natural instruction following and visual reasoning.
This enables LLaVA to excel in tasks such as visual question answering, caption generation, optical character recognition, and multimodal dialogue.
Artificial Analysis noted that Groq's hosted LLaVA-v1.5-7B "supports vision/image inputs" and that in its initial benchmarking, "response times were >4X faster than GPT-4o on OpenAI."
The new model unlocks numerous practical applications. Retailers can use it for inventory tracking, social media platforms can improve accessibility with image descriptions, and customer service chatbots can handle text and image-based interactions.
Additionally, it can help automate workflows in industries such as manufacturing, finance, retail, and education.
Groq recently partnered with Meta, making the latest Llama 3.1 models—including 405B Instruct, 70B Instruct, and 8B Instruct—available to the community at Groq speed.
Former OpenAI researcher Andrej Karpathy praised Groq’s inference speed, saying, “This is so cool. It feels like AGI—you just talk to your computer and it does stuff instantly. Speed really makes AI so much more pleasing.”
Founded in 2016 by Jonathan Ross, Groq distinguishes itself by eschewing GPUs in favour of its proprietary hardware, the Language Processing Unit (LPU).
Siddharth Jindal
Siddharth is a media graduate who loves to explore tech through journalism and putting forward ideas worth pondering about in the era of artificial intelligence.