In brief

  • Google introduced Gemini Omni at I/O 2026 as a multimodal AI model designed to generate video and other media from nearly any input.
  • DeepMind CEO Demis Hassabis said Gemini Omni combines Gemini with media-generation models including Veo, Nano Banana, and Genie.
  • Gemini Omni Flash is launching first through Flow and Flow Music for Google AI subscribers.

Google on Tuesday introduced Gemini Omni, a new multimodal AI model that combines the company’s Gemini AI models with its media-generation tools, including Veo, Nano Banana, and Genie.

The announcement came during Google I/O 2026, where DeepMind CEO Demis Hassabis described Gemini Omni as “our new model that can create anything from any input.”

“It combines Gemini's intelligence with the best of our generative media models for a new level of world understanding, multimodality, and editing,” Hassabis said.

Google said the first release, Gemini Omni Flash, will launch through Flow, the company’s AI filmmaking platform, and Flow Music, which focuses on AI-assisted music creation.

Calling Omni a “step towards artificial general intelligence,” Hassabis said Google has spent the past year extending Gemini into “a world model AI that can understand and simulate the world.”

Google’s Omni rollout builds on the popularity of Nano Banana, the company’s earlier AI image-editing model that helped push Gemini to the top of Apple’s App Store last September. Nano Banana became widely used for meme generation and conversational image editing, briefly helping Gemini overtake ChatGPT in app downloads and Google search interest for the first time since OpenAI’s chatbot launched in 2022.

In Decrypt’s comparison earlier this month, Nano Banana 2 outperformed OpenAI’s GPT Image 2 in anime illustration and spatial composition tests, while OpenAI’s model performed better with photorealism and text rendering. Google now appears to be extending many of those editing features into video through Gemini Omni.

During the presentation, Google demonstrated Omni generating a claymation-style educational video explaining protein folding. The company also showed conversational editing tools that modified a selfie video by adding new visual elements and changing the surrounding environment.

Google says Omni can keep characters, backgrounds, and movement consistent even after users make changes to a video, something many AI video models struggle with. The company also says Omni uses Gemini’s reasoning abilities to understand broader instructions, so users can describe the kind of scene they want without manually explaining every detail.

The company also introduced Flow Agent, an AI assistant integrated into Google Flow that can brainstorm scenes, organize assets, recommend plot changes, and batch-edit projects.

Additional updates include Flow Tools, which allows users to create custom editing workflows using natural-language prompts without coding experience.

Hassabis said Google is starting with video generation, but plans to expand access to Omni, describing it as the long-term vision behind Gemini’s multimodal design.

“This was always our goal with Gemini, and why we built it to be multimodal from the very start,” he said.

Google did not immediately respond to Decrypt’s request for comment.
