TikTok’s Parent Teases Video AI Model Rivaling OpenAI’s Sora, Turns Photos into Videos

Published on February 7, 2025
In AI News

ByteDance dominates the short-video segment with TikTok. Will it be a leading GenAI company as well?

With DeepSeek shooting to the top of the app charts in no time, ByteDance, the company behind TikTok, has now released a research paper on its new video generation AI model, OmniHuman-1.

The OmniHuman-1 model can generate realistic human videos by employing a mixed data training strategy with multi-modality motion conditioning.

In the research paper, the authors mention, “We propose OmniHuman, an end-to-end multimodality-conditioned human video generation framework that generates human videos based on a single image and motion signals (e.g., audio, video, or both).” The researchers who worked on it include Gaojie Lin, Jianwen Jiang, Jiaqi Yang, Zerong Zheng, and Chao Liang.

The model relies on omni-conditions training, which avoids wasting data by letting stronger-conditioned tasks draw on the data of weaker-conditioned tasks.

ByteDance’s creation joins a race that includes Google’s Lumiere, OpenAI’s Sora, and other text-to-video generation models. Fundamentally, these models differ from one another, but OmniHuman-1 could take the internet by storm, just as Sora did. No studies comparing the popular models have been published yet.

In short, the model can generate a video from a single image. While that is exciting, it is also alarming, given that deepfakes are already being used to extort money from senior citizens.

Anshuman Jha, an AI consultant at AON, took to LinkedIn to highlight the potential for abuse of such a model. “From entertainment to advertising, the applications are limitless. Imagine personalised ads where celebrities endorse products in real-time or deceased artists perform new songs. The potential for misuse is glaring,” he said. At the same time, Jha described the model as a “marvel”.

At the moment, the model is not available to the public. However, according to the results shared on the official website, the model works with any kind of image.

Participants in a Reddit discussion on OmniHuman-1 agree that it could be a game-changer in AI-based video generation. The model has also created a buzz on social media platforms, where many users seem surprised by the accuracy of the results.

Much as DeepSeek has dominated recent headlines, OmniHuman-1 could be the next talk of the town in video generation AI.