Since the beginning of this year, OpenAI has been teasing the community with AI-generated videos made using Sora, sparking curiosity among creators and tech enthusiasts worldwide. Many have been eager to know how to use Sora and when it would become available. Finally, during OpenAI's "Shipmas" product launch event, Sora made its debut. The new "Sora Turbo" model builds on the DALL·E and GPT series, expanding generative AI from images and text to video. It can create videos from prompts, blend different clips, and even lay out individual frames on a storyboard. Sora is now available to ChatGPT subscribers in the US and most other countries.
It's worth noting that some users find the $200 monthly fee for ChatGPT Pro a bit steep. However, if you're just looking to "test the waters," the $20 monthly ChatGPT Plus plan also offers access to Sora, albeit with limited features (video resolution capped at 720p and a maximum length of 5 seconds). At least you can get a taste of Sora's capabilities before deciding to upgrade.
Is Sora OpenAI's Christmas gift to everyone? (Source: OpenAI)
Sora is a video generation tool based on a "diffusion model." The model starts from a video that looks like random noise and removes that noise step by step until a clear result emerges. Because it considers many frames at once, subjects and scenes stay consistent even when the camera moves or an object temporarily leaves the frame. This addresses problems such as distorted characters or objects and discontinuous scenes, which earlier video models often ran into.
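To make the denoising idea concrete, here is a minimal Python sketch. It is a toy illustration only, not Sora's implementation: the "clip" is a tiny grid of numbers, and `predict_clean` is a hypothetical stand-in for the trained neural network that simply returns a known target clip. What the loop does show is the overall shape of the process: start from random noise and refine every frame of the clip together over several steps.

```python
import random


def make_target(num_frames, num_pixels):
    """Toy 'clean' clip: one bright dot that moves one pixel per frame."""
    return [[1.0 if p == f % num_pixels else 0.0 for p in range(num_pixels)]
            for f in range(num_frames)]


def make_noise(num_frames, num_pixels, rng):
    """Pure Gaussian noise for every pixel of every frame."""
    return [[rng.gauss(0.0, 1.0) for _ in range(num_pixels)]
            for _ in range(num_frames)]


def predict_clean(noisy_clip, target_clip):
    # ASSUMPTION: a real diffusion model uses a trained network here.
    # This stand-in just returns the known target so the loop is runnable.
    return target_clip


def mean_abs_error(clip, target_clip):
    """Average absolute difference between a clip and the target."""
    total = sum(abs(x - t)
                for frame, target_frame in zip(clip, target_clip)
                for x, t in zip(frame, target_frame))
    count = sum(len(frame) for frame in clip)
    return total / count


def denoise(noisy_clip, target_clip, steps):
    """Refine all frames together; return the final clip and per-step errors."""
    clip = [frame[:] for frame in noisy_clip]
    errors = []
    for step in range(steps):
        estimate = predict_clean(clip, target_clip)
        remaining = steps - step
        # Move every pixel a fraction of the way toward the current estimate.
        clip = [[x + (e - x) / remaining for x, e in zip(frame, est_frame)]
                for frame, est_frame in zip(clip, estimate)]
        errors.append(mean_abs_error(clip, target_clip))
    return clip, errors


if __name__ == "__main__":
    rng = random.Random(0)
    target = make_target(num_frames=4, num_pixels=6)
    noisy = make_noise(4, 6, rng)
    _, errors = denoise(noisy, target, steps=10)
    for step, err in enumerate(errors, start=1):
        print(f"step {step:2d}: mean error {err:.4f}")
```

Running it prints an error that shrinks with every step, which is the "noisy to clear" progression described above in miniature.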
Sora draws on the success of the DALL·E and GPT models, using a technique called "recaptioning" to generate highly descriptive captions for its training material, which helps the model understand and follow user prompts more closely. With this technology, users can generate videos simply by describing scenes, actions, and styles in text. Sora can also animate static images, extend existing videos, and fill in missing frames. OpenAI describes these capabilities as an important step on its path toward Artificial General Intelligence (AGI).
Many are curious about the sources of Sora's training data, and OpenAI has provided some clarity. Sora's training data comes from diverse sources, including publicly available video and image datasets, proprietary materials from partners like Shutterstock and Pond5, and custom datasets commissioned and created by OpenAI. They claim all content is obtained under legal authorization and strict usage terms.
OpenAI relies not only on professional reviewers but also on a dedicated "red team" for rigorous testing. Red team members challenge the model from an "attacker" or "skeptic" perspective, posing tricky or potentially inappropriate prompts to identify weaknesses and provide feedback. This two-layer scrutiny by reviewers and the red team continually improves Sora's generation quality and safety, with the aim of stable, reliable, and ethical performance in real-world use.
However, in a previous interview, OpenAI's former CTO Mira Murati, appearing a bit tense, did not give a clear answer to the sensitive question of whether the training data includes videos from platforms like YouTube and Instagram Reels, emphasizing only that Sora is based on publicly available materials. Separately, renowned tech reviewer MKBHD noted in his latest Sora review video that when he entered keywords related to himself, Sora generated videos featuring elements similar to the "fake plant" seen in his channel's videos. This has led to speculation that any publicly released video could be part of the model's training data.
Let's find the Easter eggs together. (Source: MKBHD)
Sora not only generates videos from text prompts but also offers various editing options. Here are the main features:
The Remix feature lets you gradually change elements, styles, and scenes in a video through a series of text prompts. You don't need to start from scratch; you can keep tweaking or transforming the video on top of what already exists. OpenAI's official examples show this in action.
Similar to ChatGPT's previous image editing tweaks, but for videos. (Source: OpenAI)
The Re-cut feature allows you to select specific frames in the generated video and extend or focus on these segments.
The Storyboard feature is arguably the most exciting of the new features. It lets you arrange video content frame by frame on a timeline, setting specific scenes, styles, or actions for each point in time. This way, you can get a clear view of the overall narrative structure before generating the video, which makes it ideal for ads, short films, and similar work.
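Because a storyboard is essentially a list of timed instructions, it can help to sketch one out before spending credits. The Python below is only a planning aid, not a Sora API: Sora's storyboard is built in its editor, and the `Card` type and `check_storyboard` function are hypothetical names invented for this example. The length caps passed in are the ones quoted in this article (5 seconds on Plus, 20 seconds on Pro).

```python
from collections import namedtuple

# Hypothetical planning structure: when a card starts and what it describes.
Card = namedtuple("Card", ["start_s", "prompt"])


def check_storyboard(cards, clip_length_s, plan_max_length_s):
    """Return a list of problems with the plan; an empty list means it fits."""
    problems = []
    if clip_length_s > plan_max_length_s:
        problems.append(
            f"clip length {clip_length_s}s exceeds the plan limit of {plan_max_length_s}s"
        )
    previous_start = None
    for index, card in enumerate(cards):
        if not card.prompt.strip():
            problems.append(f"card {index} has an empty prompt")
        if not 0 <= card.start_s < clip_length_s:
            problems.append(f"card {index} starts at {card.start_s}s, outside the clip")
        if previous_start is not None and card.start_s <= previous_start:
            problems.append(f"card {index} does not come after the previous card")
        previous_start = card.start_s
    return problems


if __name__ == "__main__":
    cards = [
        Card(0.0, "Wide shot of a forest at dawn, mist between the trees"),
        Card(2.0, "Camera glides toward a campfire"),
        Card(4.0, "Close-up of the flames, warm light"),
    ]
    # A 5-second clip on a plan capped at 5 seconds.
    print(check_storyboard(cards, clip_length_s=5, plan_max_length_s=5))  # prints []
```

Writing the cards down this way makes it easy to spot a scene that falls outside the clip, or a clip that is longer than your plan allows, before you submit anything.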
The Loop feature creates seamless looping videos of swaying tree shadows, burning campfires, or ocean waves, perfect for background animations or GIF-style material.
The Blend feature merges two videos into a smooth segment. If you have a clip shot in a forest and another on a coastline, Blend can transition from the forest to the coast naturally, making it look seamless.
Style presets bundle complex prompts into simple one-click options. If you want a realistic, vintage look, choose the "Archival" style; for a mysterious black-and-white suspense atmosphere, go for the "Film Noir" style.
There's also this cardboard art style. (Source: OpenAI)
Sora's interface is quite intuitive, making it easy for beginners to get started.
In the input box at the bottom of the Sora Video Editor, describe the desired video content, scene, or atmosphere in text, or upload legally authorized images or video materials. Whether it's a wild creative idea or ready-made materials, they can all be starting points for video production.
Before submitting for generation, decide on the video's aspect ratio, resolution, length, and how many different versions (variations) you want to produce. These settings will affect the consumption of credits, so consider your needs and costs before confirming.
After clicking submit, the system usually generates the video within tens of seconds to a minute. Once completed, if you chose to generate multiple versions simultaneously, you can browse all versions in the "Library" and select the one closest to your ideal product.
Each generated video can be downloaded in MP4 format or shared via a link. If you have the higher-tier ChatGPT Pro subscription, the downloaded files will be watermark-free, allowing for broader use.
Sora's "Explore" page works like a short-video platform where you can browse other people's public creations and see the prompts and features they used.
Videos generated by Sora are public on the Explore page by default. If you don't want your creations to be exposed right away, you can disable the "Publish to explore" option in the settings.
Be sure to use legally authorized materials and avoid uploading other people's portraits. Content that breaks these rules can lead to legal trouble and may result in inappropriate deepfake videos.
Don't use Sora for malicious purposes! Sora automatically adds digital watermarks and C2PA-compliant metadata so that videos can be traced. The system strictly prohibits creating illegal, harmful, or misleading content, as well as any sexually exploitative or inappropriate content involving children.
Currently, ChatGPT Free, Enterprise, and Edu accounts cannot use Sora. Here's a comparison of ChatGPT Plus and Pro:
Subscription Plan | Price (Monthly) | Video Quota and Credits | Resolution and Length Limit | Watermark on Downloads |
---|---|---|---|---|
ChatGPT Plus | $20 | Up to 50 high-priority videos per month (1,000 credits) | Up to 720p, 5 seconds | Yes |
ChatGPT Pro | $200 | Up to 500 high-priority videos per month (10,000 credits) | Up to 1080p, 20 seconds; up to 5 videos generated simultaneously | No |
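
The quota figures in the table can be turned into a rough per-video cost with a little arithmetic. The Python sketch below uses only the numbers from the table; the `PLANS` dictionary and the function names are made up for this example. It assumes the "up to" video counts are actually reached, so treat the results as best-case figures: resolution, length, and the number of variations all affect how many credits a video consumes.

```python
# Figures copied from the comparison table; names are illustrative only.
PLANS = {
    "ChatGPT Plus": {"price_usd": 20, "credits": 1_000, "max_videos": 50},
    "ChatGPT Pro": {"price_usd": 200, "credits": 10_000, "max_videos": 500},
}


def credits_per_video(plan_name):
    """Credits used per video if the full monthly video quota is reached."""
    plan = PLANS[plan_name]
    return plan["credits"] / plan["max_videos"]


def price_per_video(plan_name):
    """Monthly price divided by the maximum number of videos (best case)."""
    plan = PLANS[plan_name]
    return plan["price_usd"] / plan["max_videos"]


if __name__ == "__main__":
    for name in PLANS:
        print(f"{name}: {credits_per_video(name):.0f} credits and "
              f"${price_per_video(name):.2f} per video at full quota")
```

Under these assumptions both plans work out to 20 credits and about $0.40 per video, so the case for Pro rests on higher volume, 1080p output, longer clips, and watermark-free downloads rather than on a lower per-video price.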