# Sora Died and Gemini Omni Leaked in the Same Week. That's the Whole Story.

> Sora shut down on April 26 with $2.1 million in lifetime revenue against inference costs of $1–15 million per day. Six days later, Gemini Omni leaked - a native video model wired directly into Google's reasoning stack. That sequence is not coincidence. It's the standalone gen-AI business model dying in public, and the integration layer being crowned as the next site of defensible value.

**Author:** LAXIMA Team  
**Published:** 2026-05-12  
**Updated:** 2026-05-26  
**Reading time:** 8 min  
**Category:** ai automation  
**Tags:** Generative AI, AI product strategy, Sora, Gemini Omni, Foundation models, AI economics, SaaS strategy  
**Canonical URL:** https://laxima.tech/blog/sora-died-and-gemini-omni-leaked-in-the-same-week-thats-the-whole-story

---
Sora shut down on April 26 with about $2.1 million in lifetime in-app revenue against inference costs that reportedly touched $1–15 million per day. Peak active users dropped under 500,000 before OpenAI pulled the plug. Six days later, and seventeen days before I/O, screenshots of Google's then-unannounced Gemini Omni - a native video model wired directly into Gemini's reasoning stack - surfaced online.

At I/O on May 19, the leak became the keynote. Demis Hassabis took the stage and announced Gemini Omni as a video-generation model "built on world models," with Gemini Omni Flash shipping this summer. The framing matters more than the model: Google didn't pitch Omni as a Veo successor or a Sora competitor. Hassabis pitched it as "any output from any input" - generation as a modality of reasoning, not as a destination product. The same keynote disclosed that the Gemini app now has 900 million monthly users and processes 3.2 quadrillion tokens per month, up 7x year-over-year. That is the distribution surface the standalone model has to compete against, and it answers the question Sora never had an answer for: why is the user here right now?

That's not two stories. That's one story told twice.

For the last eighteen months, the implicit thesis behind every generative-AI product pitch has been "the best model wins." Better samples, better fidelity, better demos. Build the standalone model, charge for the output, defend the moat with research velocity.

That thesis just got buried by its own arithmetic.

## The Sora obituary nobody is reading correctly

The temptation is to write Sora off as "ahead of its time" or "compute-bound." Both framings are wrong, and they're wrong in the same way: they treat the failure as a temporary tech problem instead of a structural product problem.

Sora's economics did not break because video inference is expensive. They broke because nobody could figure out why they were generating any specific video. A standalone video model is a product that asks the user to bring their own intent - to know what they want, to know how to prompt it, and to come back often enough for unit economics to work. Generative novelty is not a retention loop. Peak users under 500K, and falling, is the empirical proof.

The reason Sora's revenue totaled $2.1 million across its entire life is not that the model was bad. The model was excellent. The reason is that "generate me a video from nothing" turns out to be a need almost nobody has more than a few times a year.
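To make the scale of that mismatch concrete, here is a minimal sketch of the arithmetic, using only the figures reported above ($2.1 million lifetime revenue, $1–15 million per day in inference). The function name and the framing as "days of inference covered" are ours, added for illustration.

```python
# Figures as reported in the article; the calculation itself is illustrative.
LIFETIME_REVENUE_USD = 2_100_000
DAILY_COST_LOW_USD = 1_000_000    # low end of the reported daily inference cost
DAILY_COST_HIGH_USD = 15_000_000  # high end of the reported daily inference cost


def days_covered(revenue_usd: float, daily_cost_usd: float) -> float:
    """Days of inference that a given revenue total would pay for."""
    return revenue_usd / daily_cost_usd


best_case_days = days_covered(LIFETIME_REVENUE_USD, DAILY_COST_LOW_USD)
worst_case_days = days_covered(LIFETIME_REVENUE_USD, DAILY_COST_HIGH_USD)

print(f"Low-cost estimate:  {best_case_days:.1f} days of inference")
print(f"High-cost estimate: {worst_case_days * 24:.1f} hours of inference")
```

On the reported numbers, everything Sora ever earned pays for roughly two days of inference at the low end of the cost range and a few hours at the high end.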

## Why the Gemini Omni leak is a different category of product

Look closely at the leaked demo everyone keeps quoting. It is not the anime stylization or the watermark removal. It's the professor at the blackboard, deriving a mathematical formula correctly.

Sora-class video models produced text that "looked like words at first glance but on closer inspection was gibberish." That is not a fidelity problem. That is a reasoning problem. Pixels do not know whether ∂L/∂w follows from the previous line. A model that generates a correct derivation is a model that is consulting a reasoner during generation.

That is what Omni is. The leaked spec describes an "all-modality" system - text, image, audio, video in and out - with explicit "deeper integration with Gemini's reasoning capabilities compared to Google's separate Veo model." That phrase is the whole product strategy. Veo is the Sora-shaped product: standalone, prompt-and-pray. Omni is the same capability collapsed into the reasoning stack, where intent already lives.

When a user is mid-conversation with Gemini and says "show me what this would look like," the model does not need to ask what to generate. It already knows. The video is an output modality of the reasoning context, not a separate product the user has to brief from scratch.

The I/O demo made this concrete in a way the leak couldn't. Omni supports conversational editing: change a character, swap a background, or adjust the lighting, all by talking to it the same way you'd talk to Gemini about a Doc or a calendar invite. There is no prompt box. There is no "generate" button as a distinct verb. The video is a turn in the conversation, and the next turn revises it. Once generation collapses into dialogue, the standalone product loses its reason to exist: you don't open a separate app to revise a sentence in a chat, and within eighteen months you won't open one to revise a shot either.

## The structural lesson for anyone building on model APIs

If you build software on top of foundation models, the Sora–Omni sequence is the most important thing you'll read this quarter. Three implications:

1. **"Best model wins" is over. "Best integration wins" replaces it.** The defensible value is no longer in the weights - it is in the reasoning context above the weights. A model is a feature of a reasoning system now. If your product is a vertical wrapper around a single-modality API, you are running Sora's playbook in slow motion.

2. **Standalone gen-AI as a product category is shrinking.** Image generators, video generators, voice generators - anything that asks the user to bring their own intent - has the same retention problem Sora had. The category survives only inside something else: inside a doc editor, inside a CRM, inside a coding agent, inside a CAD tool. The thing that wraps it owns the user; the thing inside it is interchangeable.

3. **The moat is the context you can carry into generation.** If your product knows the user's documents, customers, codebase, design system, or pipeline, then generation becomes coherent. If it knows none of that, you are competing on demo quality against companies whose distribution dwarfs yours.

Two of Google's other I/O announcements ratify this directly. Docs Live brings voice-driven document creation and editing into Google Docs for Pro and Ultra subscribers this summer — generation living inside the surface where the user's intent already lives. "Ask YouTube" puts Gemini-powered search inside the video the user is already watching, where the question is grounded in the frame on screen. Neither of these is a standalone product. Neither could be. Both are bets that the right place to put generative capability is wherever the user's context already is — and the corollary, unstated but obvious, is that the wrong place is a blank prompt box on a separate URL.

The honest counter-argument is that Sora may have been a unit-economics casualty, not a category obituary. Inference costs are falling. Maybe a 2027 standalone video product survives where the 2025 version couldn't. Possible. But inference costs have to fall by an order of magnitude, and even then the user still has to want to generate videos from nothing - and the data says users don't want that often enough to support a business.
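A rough sensitivity check on that counter-argument, as a sketch: take the reported $1–15 million daily cost range, cut it by a factor of ten, and ask how long Sora's reported $2.1 million in lifetime revenue would last. The tenfold factor is the "order of magnitude" from the paragraph above; the helper names are ours and purely illustrative.

```python
# Reported figures from the article; the reduction factor is an assumption
# standing in for "an order of magnitude".
LIFETIME_REVENUE_USD = 2_100_000
REPORTED_DAILY_COST_USD = (1_000_000, 15_000_000)  # (low, high)
COST_REDUCTION_FACTOR = 10


def reduced_cost_range(cost_range, factor):
    """Scale a (low, high) daily cost range down by a constant factor."""
    low, high = cost_range
    return (low / factor, high / factor)


def days_covered_range(revenue_usd, cost_range):
    """Return (fewest, most) days of inference the revenue would pay for."""
    low, high = cost_range
    # The highest daily cost yields the fewest days, the lowest the most.
    return (revenue_usd / high, revenue_usd / low)


reduced = reduced_cost_range(REPORTED_DAILY_COST_USD, COST_REDUCTION_FACTOR)
min_days, max_days = days_covered_range(LIFETIME_REVENUE_USD, reduced)

print(f"After a {COST_REDUCTION_FACTOR}x cost drop, lifetime revenue covers "
      f"{min_days:.1f} to {max_days:.1f} days of inference")
```

Under these assumptions, a tenfold cost drop stretches Sora's entire lifetime revenue to somewhere between a day and a half and three weeks of inference. Cheaper compute narrows the gap; it does not supply the demand.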

## What this means in practice

Two reader branches.

If you are **_choosing_** a generative tool to deploy inside your company: stop evaluating standalone video, image, or voice products on demo quality. Evaluate on which reasoning stack they ride. The vendor that's wired to the tool your team already uses will eat the vendor with the better samples within eighteen months. The standalone winner of the 2024 bake-off is the cautionary tale of the 2026 procurement cycle.

If you are **_building_** one: the question is not "which model do I use" - that is a vendor decision and it should rotate. The question is "what context do I own that the model cannot get without me?" If you cannot answer that in one sentence, you are building a thinner version of Sora.
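One way to picture the builder's question is a sketch in which the model vendor sits behind a thin interface and the product's own context does the real work. Everything here is hypothetical: `VideoModel`, `OwnedContext`, `VendorA`, `VendorB`, and the sample facts are illustrative names and data, not any real vendor's SDK.

```python
from dataclasses import dataclass, field
from typing import Protocol


class VideoModel(Protocol):
    """Hypothetical vendor interface; no real vendor SDK is implied."""

    def generate(self, prompt: str) -> str: ...


@dataclass
class OwnedContext:
    """Context the product holds and a bare model does not (sample fields)."""

    brand_style: str
    product_facts: list[str] = field(default_factory=list)

    def to_prompt(self, request: str) -> str:
        # Fold the owned context into the request before it reaches any model.
        facts = "; ".join(self.product_facts)
        return f"Style: {self.brand_style}. Facts: {facts}. Request: {request}"


class VendorA:
    """Stub standing in for one interchangeable model provider."""

    def generate(self, prompt: str) -> str:
        return f"[vendor-a video] {prompt}"


class VendorB:
    """Stub standing in for a second interchangeable model provider."""

    def generate(self, prompt: str) -> str:
        return f"[vendor-b video] {prompt}"


def render(model: VideoModel, context: OwnedContext, request: str) -> str:
    """Generate with whichever vendor is plugged in, using the same context."""
    return model.generate(context.to_prompt(request))


context = OwnedContext(
    brand_style="flat illustration, muted palette",
    product_facts=["launch date is in Q3", "pricing has two tiers"],
)
request = "30-second launch teaser"

out_a = render(VendorA(), context, request)
out_b = render(VendorB(), context, request)
print(out_a)
print(out_b)
```

Swapping `VendorA` for `VendorB` changes one argument. The context object is the part that would be costly to replace, which is the point of the one-sentence test above.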

## The falsifiable version

By the end of 2026, no standalone generative-video product will rank in the top five of its category by revenue. The top five will all be features inside reasoning-stack products (Gemini, ChatGPT, Claude, Kimi, ByteDance's products) or inside a vertical incumbent that integrates a third-party model deeply enough to make the model invisible.

The I/O keynote tightens this prediction rather than loosening it. Omni Flash ships this summer, inside Gemini, on top of an installed base of 900M MAU. By the time a hypothetical 2026 standalone video competitor finishes its Series B, every Gemini user will already have a video model one sentence away from the conversation they're already having. The standalone product doesn't lose to a better model. It loses to a context window it can't access.

If we are wrong, it will be because inference unit economics fell faster than product architecture evolved, and "bring your own intent" turned into a viable consumer behavior. We do not think either will happen. The Sora numbers are the canary, and the canary is already dead.

The Omni leak is just Google telling everyone where the next layer of value is. Whether you are buying or building, the layer is not the model.
