Sora shut down on April 26 with about $2.1 million in lifetime in-app revenue against inference costs that reportedly touched $1–15 million per day. Active users peaked under 500,000 and were falling when OpenAI pulled the plug. Six days later, screenshots of Google's unannounced Gemini Omni - a native video model wired directly into Gemini's reasoning stack - surfaced ahead of I/O.
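Taking the reported figures at face value (both are press numbers, not audited financials), the imbalance is easy to make concrete:

```python
# Back-of-envelope check on the reported Sora figures.
# All inputs are press-reported estimates, not audited numbers.
lifetime_revenue = 2.1e6                      # total in-app revenue, USD
daily_cost_low, daily_cost_high = 1e6, 15e6   # reported daily inference cost range, USD

# How long does one stretch of inference spend take to exceed
# every dollar the product ever collected?
days_to_burn_low = lifetime_revenue / daily_cost_low    # low-end cost estimate
hours_to_burn_high = lifetime_revenue / daily_cost_high * 24  # high-end estimate

print(f"{days_to_burn_low:.1f} days at the low-end cost estimate")
print(f"{hours_to_burn_high:.1f} hours at the high-end cost estimate")
```

At the low end of the reported range, a little over two days of compute consumed the product's entire lifetime revenue; at the high end, under four hours.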
That's not two stories. That's one story told twice.
For the last eighteen months, the implicit thesis behind every generative-AI product pitch has been "the best model wins." Better samples, better fidelity, better demos. Build the standalone model, charge for the output, defend the moat with research velocity.
That thesis just got buried by its own arithmetic.
The Sora obituary nobody is reading correctly
The temptation is to write Sora off as "ahead of its time" or "compute-bound." Both framings are wrong, and they're wrong in the same way: they treat the failure as a temporary tech problem instead of a structural product problem.
Sora's economics did not break because video inference is expensive. They broke because nobody could figure out why they were generating any specific video. A standalone video model is a product that asks the user to bring their own intent - to know what they want, to know how to prompt it, and to come back often enough for unit economics to work. Generative novelty is not a retention loop. Peak users under 500K, and falling, is the empirical proof.
The reason Sora's revenue totaled $2.1 million across its entire life is not that the model was bad. The model was excellent. The reason is that "generate me a video from nothing" turns out to be a need almost nobody has more than a few times a year.
Why the Gemini Omni leak is a different category of product
Look closely at the leaked demo everyone keeps quoting. The tell is not the anime stylization or the watermark removal. It's the professor at the blackboard, correctly deriving a mathematical formula.
Sora-class video models produced text that "looked like words at first glance but on closer inspection was gibberish." That is not a fidelity problem. That is a reasoning problem. Pixels do not know whether ∂L/∂w follows from the previous line. A model that generates a correct derivation is a model that is consulting a reasoner during generation.
That is what Omni is. The leaked spec describes an "all-modality" system - text, image, audio, video in and out - with explicit "deeper integration with Gemini's reasoning capabilities compared to Google's separate Veo model." That phrase is the whole product strategy. Veo is the Sora-shaped product: standalone, prompt-and-pray. Omni is the same capability collapsed into the reasoning stack, where intent already lives.
When a user is mid-conversation with Gemini and says "show me what this would look like," the model does not need to ask what to generate. It already knows. The video is an output modality of the reasoning context, not a separate product the user has to brief from scratch.
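The leak shows behavior, not internals, so none of what follows is Omni's actual architecture. But the product difference between the two shapes is easy to sketch. Every name below - the classes, fields, and methods - is illustrative, not any vendor's real API:

```python
from dataclasses import dataclass, field

# Illustrative sketch only: these interfaces are hypothetical,
# not OpenAI's or Google's actual APIs.

@dataclass
class StandaloneVideoModel:
    """Sora-shaped: the user must supply all intent in one prompt."""

    def generate(self, prompt: str) -> str:
        # The model sees nothing but the prompt string itself.
        return f"<video from: {prompt!r}>"


@dataclass
class ReasoningStackModel:
    """Omni-shaped: generation reads the accumulated conversation context."""

    conversation: list[str] = field(default_factory=list)

    def chat(self, message: str) -> None:
        self.conversation.append(message)

    def generate(self, request: str) -> str:
        # Intent is the whole reasoning context, not just this request.
        context = " | ".join(self.conversation)
        return f"<video from: {request!r} grounded in: {context!r}>"


# The user never re-briefs the integrated model from scratch:
m = ReasoningStackModel()
m.chat("We derived the gradient ∂L/∂w on the board.")
m.chat("The last step used the chain rule.")
print(m.generate("show me what this would look like"))
```

The asymmetry is the point: the standalone model's `generate` starts from zero on every call, while the integrated model's `generate` inherits everything the conversation has already established.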
The structural lesson for anyone building on model APIs
If you build software on top of foundation models, the Sora–Omni sequence is the most important thing you'll read this quarter. Three implications:
1. "Best model wins" is over. "Best integration wins" replaces it. The defensible value is no longer in the weights - it is in the reasoning context above the weights. A model is a feature of a reasoning system now. If your product is a vertical wrapper around a single-modality API, you are running Sora's playbook in slow motion.
2. Standalone gen-AI as a product category is shrinking. Image generators, video generators, voice generators - anything that asks the user to bring their own intent - has the same retention problem Sora had. The category survives only inside something else: inside a doc editor, inside a CRM, inside a coding agent, inside a CAD tool. The thing that wraps it owns the user, the thing inside it is interchangeable.
3. The moat is the context you can carry into generation. If your product knows the user's documents, customers, codebase, design system, or pipeline, then generation becomes coherent. If it knows none of that, you are competing on demo quality against companies whose distribution dwarfs yours.
The honest counter-argument is that Sora may have been a unit-economics casualty, not a category obituary. Inference costs are falling. Maybe a 2027 standalone video product survives where the 2025 version couldn't. Possible. But costs have to fall by an order of magnitude, and even then the user still has to want to generate videos from nothing - and the data says they don't, often enough, to support a business.
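A quick sanity check on that order-of-magnitude claim, using the same reported figures. The efficiency multiples are hypothetical scenarios, not projections:

```python
# How far do costs have to fall before the reported revenue even registers?
# Inputs are the press-reported Sora figures; the cost drops are hypothetical.
lifetime_revenue = 2.1e6   # USD, total in-app revenue
daily_cost_low = 1e6       # USD/day, low end of the reported cost range

for cost_drop in (1, 10, 100):   # hypothetical inference-efficiency multiples
    daily_cost = daily_cost_low / cost_drop
    days_covered = lifetime_revenue / daily_cost
    print(f"{cost_drop:>3}x cheaper inference: "
          f"revenue covers {days_covered:,.0f} days of compute")
```

Even a hundredfold cost drop leaves the lifetime revenue covering roughly seven months of low-end compute. The cost side can improve all it likes; without a demand-side fix, the arithmetic never closes.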
What this means in practice
Two reader branches.
If you are choosing a generative tool to deploy inside your company: stop evaluating standalone video, image, or voice products on demo quality. Evaluate on which reasoning stack they ride. The vendor that's wired to the tool your team already uses will eat the vendor with the better samples within eighteen months. The standalone winner of the 2024 bake-off is the cautionary tale of the 2026 procurement cycle.
If you are building one: the question is not "which model do I use" - that is a vendor decision and it should rotate. The question is "what context do I own that the model cannot get without me?" If you cannot answer that in one sentence, you are building a thinner version of Sora.
The falsifiable version
By the end of 2026, no standalone generative-video product will rank in the top five of its category by revenue. The top five will all be features inside reasoning-stack products - Gemini, ChatGPT, Claude, Kimi, ByteDance's products - plus perhaps one vertical incumbent that integrates a third-party model deeply enough to make the model invisible.
If we are wrong, it will be because inference unit economics fell faster than product architecture evolved, and "bring your own intent" turned into a viable consumer behavior. I do not think it will. The Sora numbers are the canary, and the canary is already dead.
The Omni leak is just Google telling everyone where the next layer of value is. Whether you are buying or building, the layer is not the model.


