Runway, the AI company known for its popular generative video tool, has unveiled its latest iteration, Runway Gen-3. The new model, which is still in alpha and not publicly available, was showcased through a series of sample videos that appeared to show a significant leap forward in coherence, realism, and prompt adherence when compared to the currently available Gen-2.
The generated videos, particularly those featuring human faces, are highly realistic, so much so that members of the AI art community quickly compared them favorably to OpenAI's yet-to-be-released but highly anticipated Sora.
“Even if these are cherry-picked, they already look better than Sora,” one Reddit user wrote in the top-voted comment in the Runway Gen-3 discussion thread. “Sora has a stylized look and feel to it,” another user replied. “These people look actually real, the best I've seen so far.”
“If you showed those generated people to me I'd have assumed it was real,” read another comment on the 66,000-member AI Video subreddit.
“These Runway GEN-3 clips really hold a visual appeal to me—they look cinematic,” tweeted pseudonymous AI filmmaker PZF, who also lists himself as a creative partner of Runway. “Smooth, understated (in a good, naturalistic way), believable.”
Alongside the Gen-3 video generator, Runway is also introducing a suite of fine-tuning tools, including more flexible image and camera controls. "Runway’s multi-modal platform already supports text-to-image generation – trained jointly on videos and images. Gen-3 Alpha will power Runway's Text to Video, Image to Video and Text to Image tools," Anastasis Germanidis, cofounder and CTO of Runway, told Decrypt.
Runway claims that Gen-3 is a significant step toward realizing its ambitious goal of creating "General World Models." These models would enable an AI system to build an internal representation of an environment and use it to simulate future events within that environment. This approach would set Runway apart from conventional techniques that focus on predicting the next likely frame in a specific timeline.
While Runway has not revealed a specific release date for Gen-3, Germanidis said Gen-3 Alpha will be released very soon. "Gen-3 Alpha will be available in the coming days, available to paid Runway subscribers, our Creative Partners Program, and Enterprise users," he told Decrypt.
Runway's journey in the AI space began in 2021 when it collaborated with researchers at the University of Munich to build the first version of Stable Diffusion. Stability AI later stepped in to offset the project’s computing costs and turned it into a global phenomenon.
Since then, Runway has been a significant player in the AI video generation space, alongside competitors like Pika Labs. However, the landscape shifted with OpenAI's announcement of Sora, which surpassed the capabilities of existing models. Hollywood actor Ashton Kutcher recently caused a stir when he said tools like Sora could massively disrupt TV and film production.
As the world waits for Sora's public release, however, new competitors have emerged, such as Kuaishou's Kling and Luma AI's Dream Machine.
Kling, a Chinese video generator, can produce videos up to two minutes long in 1080p resolution at 30 frames per second, a substantial improvement over existing models. This Chinese model is already available, but users need to provide a Chinese phone number. Kuaishou said it will release a global version.
Dream Machine, on the other hand, is a free-to-use platform that converts written text into dynamic videos and also provides results that easily beat Runway Gen-2 in terms of quality, coherence, and prompt adherence. It requires a basic Google account, but it has been so popular that generations take extremely long to appear—if they appear at all.
In the open-source realm, Stable Video Diffusion, while not capable of producing comparable results, offers a solid foundation for improvement and development. Vidu, another Chinese AI video generator developed by ShengShu Technology and Tsinghua University, uses a proprietary visual transformation model architecture called the Universal Vision Transformer (U-ViT) to generate 16-second videos in 1080p resolution with a single click.
As for Pika Labs, it has not released a major update, leaving its capabilities comparable to Runway Gen-2.
Edited by Ryan Ozawa.