What happened
Google unveiled Gemini Omni at I/O 2026 on May 19. Google describes it as a model that can create anything from any input, starting with video. Omni accepts a mix of images, audio, video, and text as input and produces high-quality video as output.
Google says Omni combines Gemini's knowledge of history, science, and culture with a stronger understanding of physics. That includes concepts like gravity, kinetic energy, and fluid dynamics, which helps generated video look more realistic.
Why it matters
Video generation has become one of the most closely watched areas of AI. Weak physics understanding is a key gap in many tools, where water, cloth, and motion often look wrong. A model that handles these better is closer to being usable in professional work.
The move also puts Google directly against tools like Runway, Kling, and its own Veo line, at a time when OpenAI has stepped back from its Sora app.
MintedBrain take
For creators, the input flexibility matters most. Mixing image, audio, video, and text in one prompt opens new ways to edit and remix. The physics focus is the right bet, since realism is where most AI video still breaks down.