Google’s Gemini Omni turns images, audio, and text into video — and that’s just the start

TechCrunch AI · Rebecca Bellan · May 19, 2026

AI Summary: plain English for professionals

# Google's New AI Can Turn Your Photos and Voice Into Videos

Google has released a new AI tool called Gemini Omni that creates and edits videos through conversation: you can describe what you want, show it pictures, or play audio clips, and it generates a video from those inputs. Think of it as a video editor that understands whatever you give it (images, sound, text) and turns it into a finished product. This is the first version, called Omni Flash, with more capabilities expected later.

Google's Gemini Omni is a new multimodal model that reasons across text, images, audio, and video to generate and edit videos through simple conversation — starting with Omni Flash.
