OpenAI Unveils Sora Text-to-Video Generative AI Model

On Feb 16, 2024 at 9:09 am UTC by Mercy Tukiya Mutanya · 3 mins read

While Sora already surpasses most video-generation AI models in video length and quality, OpenAI acknowledges that it has room to improve.

OpenAI has previewed a new generative AI model with text-to-video capabilities. In an announcement on Thursday, the ChatGPT creator unveiled Sora, a model that can generate high-definition clips of up to 60 seconds from a text prompt.

With this reveal, OpenAI joins major tech companies such as Google and Meta, and startups such as Runway, which already have video-generating AI models.

Introducing Sora, our text-to-video model.

Sora can create videos of up to 60 seconds featuring highly detailed scenes, complex camera motion, and multiple characters with vibrant emotions. https://t.co/7j2JN27M3W

Prompt: “Beautiful, snowy… pic.twitter.com/ruTEWn87vf

— OpenAI (@OpenAI) February 15, 2024

Sora can generate videos featuring multiple characters, varied types of motion, and varied backgrounds, as the sample videos on the OpenAI website show. In addition to generating videos from still images, Sora can extend existing videos and fill in missing frames.

“Sora has a deep understanding of language, enabling it to accurately interpret prompts and generate compelling characters that express vibrant emotions,” OpenAI writes. “The model understands not only what the user has asked for in the prompt, but also how those things exist in the physical world.”

While Sora already surpasses most video-generation AI models in video length and quality, OpenAI acknowledges that it could be better. It notes that the model may struggle “with accurately simulating the physics of a complex scene, and may not understand specific instances of cause and effect”.

To illustrate the cause-and-effect challenge, the post describes a scenario in which a person in a generated video takes a bite out of a cookie, yet the cookie shows no bite mark afterward. The model may also struggle with spatial details and with reliably interpreting precise descriptions “of events that take place over time”.

In the hours following the announcement, concerns were raised on social media about the technology’s potential misuse for spreading misinformation or creating harmful content. Sora is currently a research preview and is not available to the general public. OpenAI states that because there are numerous ways that bad actors could exploit the technology for harm, it is still working on safeguards. The firm says it has engaged experts to test the technology against exploits and to develop tools that detect whether a video was generated using Sora.

“We’ll be taking several important safety steps ahead of making Sora available in OpenAI’s products. We are working with red teamers – domain experts in areas like misinformation, hateful content, and bias – who are adversarially testing the model,” the company writes.

OpenAI adds that it will work with policymakers, educators and artists to hear their concerns and identify positive use cases for Sora. It points out, however, that even with extensive research and testing, it is not possible to predict all the ways, good or bad, that people could use the technology. For that reason, it says, real-world use will be vital to developing “increasingly safe AI systems over time”.
