For several days now, the Internet has been busy dissecting OpenAI's new and revolutionary prodigy: Sora, an AI tool that generates videos from text "prompts".
The new invention from the company behind ChatGPT is an AI model capable of creating videos up to a minute long from text prompts alone, without any base image.
To demonstrate the phenomenal virtues of Sora, Sam Altman, CEO of OpenAI, took a handful of user suggestions on X (formerly Twitter) last Thursday and quickly turned them into videos that were as realistic as they were rich in detail. It took Altman at least 20 minutes to transform the "prompts" suggested by X users into videos.
It is worth keeping in mind, however, that Sora is an experimental AI model and that its current speed will not necessarily match its speed once it is available to the general public.
Sora is a system capable of simulating aspects of the physical world in motion thanks to its architecture, which combines diffusion technology with a transformer-based engine.
When generating moving images, Sora takes videos in their original resolution and divides them into smaller sections, or visual patches. To make this possible, the videos are first simplified in the so-called latent space, where the clips Sora uses as raw material are compressed both temporally and spatially.
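The idea of cutting a compressed clip into patches can be illustrated with a small sketch. This is not OpenAI's code: the function name `patchify`, the toy array standing in for a latent clip, and the patch sizes are all assumptions chosen for illustration. The sketch simply shows how a block of frames can be split into small space-time chunks, each flattened into one row that a transformer could treat as a token.

```python
import numpy as np

def patchify(video, pt, ph, pw):
    """Split a (T, H, W, C) array into flattened space-time patches.

    pt, ph, pw are hypothetical patch sizes along time, height and width.
    """
    T, H, W, C = video.shape
    if T % pt or H % ph or W % pw:
        raise ValueError("video dimensions must be divisible by the patch size")
    # Split each axis into (number of blocks, block size).
    x = video.reshape(T // pt, pt, H // ph, ph, W // pw, pw, C)
    # Move the block indices to the front and the block contents to the back.
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)
    # One row per patch.
    return x.reshape(-1, pt * ph * pw * C)

# Toy stand-in for a compressed (latent) clip: 8 frames of 16x16 with 4 channels.
latent = np.arange(8 * 16 * 16 * 4, dtype=np.float32).reshape(8, 16, 16, 4)
patches = patchify(latent, pt=2, ph=4, pw=4)
print(patches.shape)  # (64, 128)
```

Here the clip yields 4 x 4 x 4 = 64 patches, each covering 2 frames of a 4x4 region. A real system would learn the compression step and choose its own patch sizes; only the splitting step is shown.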