May 2026
3 scenes, 6 transformations
An hour with Gemini Omni.
I shot three scenes on my phone. A pan across the inside of my car. Me in the driver's seat, turning the camera. A finger touching a mirror. Then I spent forty-five minutes in Google's Flow turning them into other things. Here is what came back, and the bit that actually felt new.
The headline is not the image quality. Generating from zero, Omni feels on par with Veo, maybe a touch better. The real thing here is the control: precise edits to footage I already had. I could take a scene I shot and ask the model to keep the camera move and the geometry while it swapped out the world. That feeling is what I want to walk through.
I tried it on Flow. The Higgsfields of the world have so many sliders my eyes bleed. Flow is just a brief, a clip, a button. That's what I want from an interface in 2026.
What it actually feels like.
The phrase I keep landing on is Nano Banana for video. Google have said something close to this themselves. It means you bring a real thing and you ask the model for a precise edit on top of it. Not "make me a clip of a car in a jungle" but "take this car, in this pan, and put it in a jungle." The model holds the source. The change rides on top.
That's where it lands for anyone working in motion graphics or 3D. The boring middle of the pipeline (set extension, sky replacement, the in-between fixes) starts to collapse into one prompt and a clip. You stay in the director's chair. The execution moves.
Generating from zero is fine. The control is the story.
Where this is going.
Imagine the round-trip from prompt to output is near zero. You're watching a clip in real time. You say "put a hellscape in" and it appears. You say "remove the guy" and he's gone. That isn't editing anymore. That's directing. Live. As the footage is in front of you.
Three years feels about right.
Caveats, said quickly.
Forty-five minutes is not a review. I didn't push the audio. I didn't test long-form consistency. The image fidelity from scratch I barely touched. If you want a proper deep dive into capability and edge cases, Atomic Gains has the best survey I've seen: plenty of detail, use cases, tests. Worth your time.
I'll come back to this when I've put a real brief through it. For now: impressed. The bit that landed wasn't the pixels. It was the feeling of holding a clip and editing the world inside it without losing the shot.