Sora: OpenAI's New AI Model for Creating Videos from Text

Gábor Bíró February 16, 2024

OpenAI has unveiled Sora, a new AI model that creates videos of up to one minute from text instructions, a significant step forward for AI-driven content generation.

Sora is a generative AI model that creates videos from text prompts. It is a diffusion model built on a transformer architecture, the same kind of architecture that underlies the GPT models, and it generates both realistic and imaginative scenes. It can handle complex scenes with multiple characters, specific types of motion, and accurate details of subjects and backgrounds. Beyond text prompts, the model can animate still images, extend existing videos, and fill in missing frames. It produces videos up to one minute long in a range of styles, including photorealistic, animated, and black-and-white.
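To make the term "diffusion model" concrete, here is a toy sketch of the general idea: start from random noise and remove it step by step until a clean sample remains. This is not Sora's implementation, which OpenAI has not released. The function names (`make_alpha_bars`, `oracle_noise`, `sample`) and the noise schedule are assumptions chosen for illustration, and the "network" is replaced by an oracle that already knows the target, where a real system would use a trained transformer. The update rule is a deterministic DDIM-style step.

```python
import math
import random


def make_alpha_bars(steps, beta_start=1e-4, beta_end=0.2):
    """Fraction of signal left after each noising step (illustrative schedule).

    alpha_bars[0] == 1.0 means "no noise"; values shrink toward 0 as t grows.
    """
    alpha_bars = [1.0]
    for i in range(steps):
        beta = beta_start + (beta_end - beta_start) * i / max(steps - 1, 1)
        alpha_bars.append(alpha_bars[-1] * (1.0 - beta))
    return alpha_bars


def oracle_noise(x, t, target, alpha_bars):
    """Hypothetical stand-in for the learned noise predictor.

    A real model would estimate the noise from x, t, and the text prompt.
    Here the target is known, so the noise can be computed exactly.
    """
    a = alpha_bars[t]
    return [(xi - math.sqrt(a) * ti) / math.sqrt(1.0 - a)
            for xi, ti in zip(x, target)]


def sample(target, steps=50, seed=0):
    """Run the reverse (denoising) process from pure Gaussian noise."""
    rng = random.Random(seed)
    alpha_bars = make_alpha_bars(steps)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # start from pure noise
    for t in range(steps, 0, -1):
        eps = oracle_noise(x, t, target, alpha_bars)
        a_t, a_prev = alpha_bars[t], alpha_bars[t - 1]
        # Estimate the clean sample implied by the predicted noise.
        x0_pred = [(xi - math.sqrt(1.0 - a_t) * e) / math.sqrt(a_t)
                   for xi, e in zip(x, eps)]
        # Step to the slightly less noisy level t - 1.
        x = [math.sqrt(a_prev) * p + math.sqrt(1.0 - a_prev) * e
             for p, e in zip(x0_pred, eps)]
    return x
```

Calling `sample([0.2, -0.5, 0.9])` returns values that match the target up to floating-point error, but only because the oracle predictor is exact. A trained network would approximate the noise, and in a video model the list of three numbers would be replaced by a much larger representation of the frames.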

Despite its impressive capabilities, Sora currently has limitations. It struggles with accurately simulating the physics of complex scenes, understanding cause-and-effect relationships, and maintaining precise spatial details over time. For example, a character might bite a cookie, but the cookie may not show a bite mark afterward, or the model might confuse left and right directions within a scene.

OpenAI is exercising caution before making Sora widely available. The company is working with red teamers (experts who test systems for flaws) to assess potential harms and risks, such as the generation of misinformation, hateful content, or bias. It is also building tools to detect misleading content, including a classifier that can tell when a video was generated by Sora, and it plans to include C2PA metadata in the future to help establish the provenance of Sora-generated videos.

Currently, Sora is available to a limited group of red teamers and to a select number of visual artists, designers, and filmmakers, who are giving feedback on how to make the model most helpful for creative professionals. OpenAI is also engaging with policymakers, educators, and artists around the world to understand their concerns and identify positive use cases for the technology. The company emphasizes that learning from real-world use is crucial for creating and releasing increasingly safe AI systems over time.

Sora's introduction follows OpenAI's earlier generative AI tools, including ChatGPT for text and DALL-E 3 for images. It marks a significant advance in AI video generation and is likely to intensify competition and innovation in this fast-moving field.
