ByteDance Unveils Goku to Take on Google’s Luma and OpenAI’s Sora

2 months ago 26
  • Published on February 12, 2025
  • In AI News

While the models may be less powerful than the anime character Goku, they do look pretty impressive.

ByteDance, the parent company of TikTok, has dropped a family of joint image-video generation models called Goku. The models seem to be named after the popular anime character ‘Goku’ from the Dragon Ball series. 

This comes right after the company teased a video AI model that generates videos from images, dubbed OmniHuman-1.

Researchers claim that the Goku models help create product videos featuring AI-generated influencers, marketing avatars, landscape demos, visualising Chinese poetry, portrait video demos, and more.

The research paper attributes the model’s ability to generate high-quality videos to several key factors. One is the implementation of a rectified flow (RF) formulation for joint image and video generation and the employment of a 3D joint image-video VAE to compress inputs into a shared latent space. 

Moreover, the architecture features a Transformer network with full attention, enhanced with techniques like FlashAttention, sequence parallelism, Patch n’ Pack, 3D RoPE position embedding, and Q-K normalisation.

The paper also states that the Goku models demonstrate superior performance in both qualitative and quantitative evaluations, setting new benchmarks when compared to competitors like Luma, Open-Sora, Mira, and Pika.

Goku achieved 0.76 on GenEval, 83.65 on DPG-Bench for text-to-image generation, and 84.85 on VBench for text-to-video tasks. You can see the benchmark results below. 

(Credit: GitHub page)

“We believe that this work provides valuable insights and practical advancements for the research community in developing joint image-and-video generation models,” the researchers said.

The model’s ability to generate high-quality product videos featuring AI-generated influencers and other realistic visuals could hugely benefit content creators, influencers, marketers, and others.

Picture of Ankush Das

Ankush Das

I am a tech aficionado and a computer science graduate with a keen interest in AI, Open Source, and Cybersecurity.

Association of Data Scientists

GenAI Corporate Training Programs

India's Biggest Women in Tech Summit

Mar 20 and 21, 2025 | 📍 J N Tata Auditorium, Bengaluru

Download the easiest way to
stay informed

Subscribe to The Belamy: Our Weekly Newsletter

Biggest AI stories, delivered to your inbox every week.

Rising 2025 | DE&I in Tech & AI

Mar 20 and 21, 2025 | 📍 J N Tata Auditorium, Bengaluru

AI Startups Conference.
April 25, 2025 | 📍 Hotel Radisson Blue, Bangalore, India

Data Engineering Summit 2025

15-16 May, 2025 | 📍 Taj Yeshwantpur, Bengaluru, India

MachineCon GCC Summit 2025

19-20th June 2025 | 📍 ITC Grand, Goa

17-19 September, 2025 | 📍KTPO, Whitefield, Bangalore, India

India's Biggest Developers Summit Nimhans Convention Center, Bangalore

discord icon

Our Discord Community for AI Ecosystem.

Read Entire Article