OpenAI releases its first video generation model, Sora

CGTNBizTalk

Official account of the CGTN program BizTalk, 2024.02.16 20:55

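On February 16, OpenAI announced its text-to-video model Sora on its official website. According to the company, the model can generate videos up to one minute long while maintaining visual quality and adhering to the user's prompt.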

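The company said Sora can generate complex scenes that include not only multiple characters but also specific types of motion, along with accurate details of the subject and background. Sora can also animate a still image.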

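Sora is still in development. The company acknowledged that the model may confuse spatial details of a prompt, for example mixing up left and right, and may struggle with precise descriptions of events that unfold over time, such as following a specific camera trajectory.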

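Sora's sudden arrival left some internet users marveling at how far AI technology has come, while others worried that it could eventually replace certain professions. "This is a bit much. I might lose my job," one wrote.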

Footage generated by Sora: a stylish woman walks down a Tokyo street filled with warm glowing neon and animated city signage. /OpenAI

Microsoft-backed OpenAI is developing an AI model capable of generating minute-long videos based on text prompts, the company announced on Thursday.

The model, named "Sora" after the Japanese word for "sky," is currently available to red teamers, who help identify flaws in the AI system. Access has also been given to visual artists, designers, and filmmakers so that they can provide feedback on the model, the company stated.

"Sora is able to generate complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background," the statement said, adding that it can create multiple shots within a single video.

In addition to generating videos from text prompts, Sora can also animate a still image, as mentioned in a blog post by the company.

Animated scene generated by Sora: a close-up of a short fluffy monster kneeling beside a melting red candle. /OpenAI

The video generation software follows OpenAI's ChatGPT chatbot, which was released in late 2022 and created a buzz around generative AI with its ability to compose emails and write code and poems.

Social media giant Meta Platforms beefed up its image generation model Emu last year to add two AI-based features that can edit and generate videos from text prompts. The Facebook-parent company is also looking to compete with Microsoft, Alphabet's Google and Amazon in the rapidly transforming generative AI universe.

Sora is still a work in progress, with the company acknowledging that the model may sometimes struggle with spatial details in a prompt and have difficulty following a specific camera trajectory.

OpenAI also said that it is developing tools to determine whether a video was generated by Sora.

Footage generated by Sora: drone view of waves crashing against the rugged cliffs along Big Sur's garay point beach. /OpenAI

The new tool is not yet publicly available, and OpenAI has disclosed limited information about its development process. The company, which has faced lawsuits from some authors and The New York Times over its use of copyrighted works to train ChatGPT, has not revealed the imagery and video sources used to train Sora.

OpenAI mentioned in a blog post that it is consulting with artists, policymakers and other stakeholders before releasing the new tool to the public.

"We are working with red teamers – domain experts in areas like misinformation, hateful content, and bias – who will be adversarially testing the model," the company said. "We're also building tools to help detect misleading content such as a detection classifier that can tell when a video was generated by Sora."