Mark to Market – OpenAI builds a chip. Buyers buy cheaper
A company spending $100 million on AI just told the world the frontier model is 23 times too expensive and the free one does the job. We said the model was never the moat; the market just settled the trade. Stop renting capability you can’t deploy before it goes free. Own the judgment to know the work is right.
THE NUMBER: 23× — how much cheaper the model Ensemble Health Partners swapped to is than the OpenAI frontier tier it dropped. Same vendor. Same work — the revenue cycle for hospitals you’ve probably been treated in. A $100-million-a-year AI budget on the line, and the cheap one held. On the routing platform that carries those tokens, 65% of everything processed in June ran on open-source models. A nine-figure buyer just stood up and said the expensive model is optional. Hold that number. It’s the whole issue.
There’s a thing every trading desk does at the end of the day, and it’s the least glamorous and most honest ritual in finance. You mark your book to market. You take every position you’re holding and you reprice it at what the market will actually pay for it right now, not what you paid, not what you hope it’s worth. The number is the number. It doesn’t care about your thesis or your ego. It just tells you, coldly, whether you were right.
Today the market marked one of our positions, and it printed in our favor. I’m going to walk you through the trade, because the why underneath it is the most important thing a business owner can understand about AI this year, and almost nobody is saying it out loud while they pile onto Claude in Slack.
We Called It. Now the Tape Did.
For a month we’ve been making one argument, over and over, from different angles. The model is not the moat. We said it in Buy Wins, Not Players on June 11: stop overpaying for the last ten percent of capability; the allocators who meter and route quietly take the season. We said it in Coffee’s for Closers on the 17th, when the inference bill finally landed on the CFO’s desk. We said it in Show Me Where to Put the Fulcrum on the 16th: the lever is free, the value is the fulcrum. We said it again in Central Casting on the 23rd — when the model is a rental, own the layer that decides which one gets the part.
I’m not taking a victory lap. Victory laps are cheap and they curdle fast, and we’re as derivative as anyone — we build on sources we quote and credit every single day. But marking the book isn’t bragging. It’s accountability. We make dated calls and we reprice them against the close, in public, with the links attached. And this week a buyer the size of Ensemble put a name and a number on the exact trade we described. When that happens the thesis stops being ours. It becomes the market’s.
So thank you, Ensemble, for making the point more eloquently than we ever managed.
Now the part that matters more than the receipt: why the expensive model can’t hold its price. Because if you understand the mechanism, you can act on it Monday instead of reading about it in a year.
Three Clocks
Here’s the part the cheerleaders skip. There are three clocks running in this industry, and they run at wildly different speeds.
The first clock is invention. The frontier labs ship genuine new capability fast — a smarter model, a longer context window, a new kind of agent. Call it weeks to a few months between real jumps.
The second clock is commoditization. The cheap and open models copy what the frontier invents almost as fast as it gets invented. We are not speaking in theory here. This month we watched Washington pull Fable 5, the best model the American public had ever touched, off the global market on a Friday — and by Sunday a Chinese open-weights model was sitting near the top of the leaderboard at a fraction of the price. Two days. The gap between “miracle only the frontier can do” and “free thing running on your own hardware” is collapsing toward a single weekend.
The third clock is the slow one, and it’s the one nobody on a launch-day livestream wants to talk about. It’s deployment. The clock that runs inside your company. Because to actually put a frontier capability into production you have to test it against your data, wire it into systems built in 2014, train the people who’ll use it, and get a lawyer comfortable enough to sign. That doesn’t take weeks. It takes quarters. It is the deployment gap we’ve been hammering since The Dr. House of AI — the question was never whether the model can do the job, it’s whether your shop can actually deploy it.
Now line the three clocks up next to each other and you get the cruelest piece of arithmetic in technology. By the time your slow deployment clock finishes — by the time the thing is actually live and earning — the capability you paid a frontier premium for has already been commoditized by the fast second clock. You paid for the Ferrari and took delivery of a used Civic, because the months it took you to get it road-legal were the same months the Ferrari became a Civic for everyone else.
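If you want to see that arithmetic on paper, here is a rough sketch rather than a model of anything real. It assumes the frontier premium decays exponentially with a 60-day half-life (a stand-in for the fast second clock) and compares a one-week rollout with a three-quarter one. The function name, the half-life, and both deployment times are illustrative assumptions, not measured figures.

```python
# Back-of-envelope sketch of the three-clocks arithmetic.
# Assumption: the frontier premium halves every `half_life_days`
# (exponential decay). All numbers below are hypothetical.

def premium_left_at_launch(deploy_days: float, half_life_days: float) -> float:
    """Fraction of the frontier premium still standing on the day you go live."""
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    return 0.5 ** (deploy_days / half_life_days)


if __name__ == "__main__":
    HALF_LIFE_DAYS = 60  # assumed speed of the commoditization clock

    scenarios = [
        ("tight loop, ships in a week", 7),
        ("enterprise rollout, three quarters", 270),
    ]
    for label, deploy_days in scenarios:
        share = premium_left_at_launch(deploy_days, HALF_LIFE_DAYS)
        print(f"{label}: {share:.1%} of the premium left at launch")
```

Under those assumptions the one-week shop still holds about 92% of the premium on launch day, and the three-quarter shop holds about 4%. Change the half-life and the numbers move, but the shape does not: the slow clock pays the frontier price and launches into a commodity market.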
Anyone who lived through the PC era knows this feeling in their bones, even if they never named it. The computer you wanted was always going to be cheaper and better in ninety days, so the rational move was to wait — except if you waited forever you never bought anything. The difference now is brutal: you don’t even get to choose to wait. The waiting is built into the deployment. You buy at the frontier price and the clock does the rest. You bought express shipping on something that landed on the clearance rack.
This Is Why Nobody Can Find the ROI
You’ve read the headlines. AI is everywhere and the returns are nowhere. MIT’s NANDA group studied this and found 95% of corporate AI pilots show no measurable profit-and-loss impact. Ninety-five percent. The consultants who get paid to explain it reach for the comfortable answer — it’s change management, it’s culture, you didn’t train your people, call us.
That’s not wrong, but it’s downstream of something simpler and harder. The premium evaporates before you can capture it. You can’t earn an excess return on a capability that became free during the months it took you to deploy it. The math was rigged by the clocks before your change-management consultant ever showed up.
And once you see it through the clocks, you can predict exactly where the money does get made, because there’s one place the arithmetic flips. The premium survives only where your deployment clock runs as fast as the commoditization clock. Where can you deploy a new capability in days, not quarters? Where the feedback loop is tight enough to tell you instantly whether the new thing works. We laid this out in Fulcrum: coding has the tightest loop in the working world. You place the bet, you run the test, and in seconds reality tells you if you were right. So in coding the people who grabbed the frontier capability captured the premium before it leaked — that’s the 1000X coder you keep hearing about. Same story in high-frequency trading, where the P&L marks you in milliseconds, which is exactly why Jane Street and Citadel turned into machine-learning monasteries that pay like nowhere on earth. Same in ad auctions and recommendation feeds.
Everywhere else — law, medicine, strategy, most of what your business actually does — the loop is slow, the deployment clock drags, and the premium is gone by the time you ship. There is no 1000X lawyer yet, and it isn’t because lawyers are slow. It’s because law has no unit test. Slow loop, lost premium.
The Labs Are Boxed In
Now flip the camera around and look at this from the frontier lab’s seat, because their predicament this week is the tell that the trade is real.
If your customers won’t pay a premium for the model — because Ensemble just proved the cheap one does the job — you have exactly two ways to defend your margin, and the labs are sprinting down both at once.
The first is to crush your own cost of making a token, so you can drop the price and still earn something. That’s what OpenAI did this week when it rolled out Jalapeño, its first in-house chip, built with Broadcom, and called it an “Intelligence Processor.” Read past the press release. You don’t spend years and billions designing custom silicon because it’s fun. You do it because the only way to make money selling a commodity is to be the lowest-cost producer of that commodity. OpenAI is telling you, in the most expensive language available, that it expects the token to be a commodity. The chip is a confession.
The second move is to keep raising and spending to stay one capability-jump ahead, hoping the next frontier leap reopens a premium the last one lost. That’s the capex tap, wide open, and this week the market got its first real case of cold feet about it. The Philadelphia Semiconductor Index dropped 7.9%. Micron and SanDisk fell about 13% each. South Korea had to halt its own stock exchange — a circuit breaker — as the memory trade that priced infinite AI demand finally cracked, one session before Micron had to stand up and defend it on an earnings call. The memory names were the cleanest proxy for “this demand never stops.” The proxy just stumbled.
So put both sides together and you see the squeeze. Revenue leaking out the front door, because buyers route to the cheap tier. Capex piling up out the back, because the only answers are build your own chips or keep spending to outrun the commoditization clock. That is a hard business to be in, no matter how miraculous the product. Tough sledding.
The New Moat Is the Scoreboard
So here’s the question that should be keeping you up at night if you run a company, and it’s a better question than “which model should we standardize on.” If the model is free, and the premium can’t be captured, what is actually left to own?
The one thing that doesn’t commoditize: knowing whether the work is right.
Validation. The unglamorous, load-bearing skill of being able to look at what the machine produced and say, with confidence, yes, ship it or no, that’s wrong. This is the asset, and it’s the asset precisely because it’s the thing that speeds your deployment clock. The shop that can grade the output instantly is the shop that can deploy before the capability goes free. And as we just showed, beating the clock is the only reason speed ever pays. Validation isn’t a nice-to-have on top of the AI. It is the moat.
And the market is already starting to price it, in a way that should make you sit up, because it’s the exact thing we wrote about yesterday in Memento. The new pricing model showing up in enterprise AI is outcome-based — you pay only when the agent actually resolves the ticket, closes the task, gets it right. One report this week had 70% of customer-service AI deployments seeing returns inside 60 days under exactly that model. Sit with what outcome-based pricing is. It’s the vendor finally agreeing to eat the cost of being wrong. It’s skin in the game — the thing we said in Memento the whole industry was missing, the thing that turns a confident stranger into a partner you can actually trust. The first companies to sell validation, and to stand behind it with their own money, are quietly building the only moat the clocks can’t erode.
If you want the Monday version, it’s three moves. Audit one workload’s model bill: take your highest-volume task, run it on a cheap or open model against the same test set, and see if the output holds; that single exercise is what found Ensemble its 23× cut. Build the scoreboard before you scale: pick one workflow and write down, in plain English, what “right” looks like, because the team that can grade the output is the team that can ship it. And standardize on a routing rule, not a model: frontier tokens for the twenty percent of work where the extra smarts change the outcome, cheap or open for the rest, with someone actually owning the rule. Today’s frontier is next quarter’s commodity, and a static default overpays every single day you leave it alone.
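For the teams that want those three moves in something closer to code, here is a minimal sketch. Everything in it is hypothetical: the `Case` record, the `exact_match` grader, the two stub models, and the task names stand in for your own test set, your own definition of “right,” and whatever model clients you actually call. It is not how Ensemble or any vendor does it.

```python
# Minimal sketch of the audit, the scoreboard, and the routing rule.
# All names and data here are illustrative assumptions.
from dataclasses import dataclass
from typing import Callable


@dataclass
class Case:
    prompt: str
    expected: str  # what "right" looks like, written down in advance


Model = Callable[[str], str]
Grader = Callable[[str, str], bool]


def exact_match(output: str, expected: str) -> bool:
    """The simplest possible scoreboard: the answer matches or it doesn't."""
    return output.strip() == expected.strip()


def pass_rate(model: Model, cases: list[Case], grade: Grader) -> float:
    """Share of test cases the model gets right under the given grader."""
    if not cases:
        raise ValueError("need at least one test case")
    passed = sum(grade(model(c.prompt), c.expected) for c in cases)
    return passed / len(cases)


def cheap_model_holds(cheap: Model, frontier: Model, cases: list[Case],
                      grade: Grader, tolerance: float = 0.02) -> bool:
    """Audit: does the cheap model stay within `tolerance` of the frontier one?"""
    return pass_rate(cheap, cases, grade) >= pass_rate(frontier, cases, grade) - tolerance


def route(task_type: str, needs_frontier: set[str]) -> str:
    """Routing rule: frontier only for the task types someone has signed off on."""
    return "frontier" if task_type in needs_frontier else "cheap"


if __name__ == "__main__":
    # Stub models standing in for real API calls.
    def frontier_stub(prompt: str) -> str:
        return prompt.upper()

    def cheap_stub(prompt: str) -> str:
        return prompt.upper()

    cases = [Case("approve", "APPROVE"), Case("deny", "DENY")]
    print("cheap model holds:", cheap_model_holds(cheap_stub, frontier_stub, cases, exact_match))
    print("contract_review ->", route("contract_review", {"contract_review"}))
    print("status_update ->", route("status_update", {"contract_review"}))
```

The design choice worth copying is that the grader and the `needs_frontier` set are plain, named things somebody owns. Swap the model behind either stub next quarter and the scoreboard and the rule stay put.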
We’ve Seen This Movie
None of this should feel new if you’ve watched a commodity mature before, and that’s the comfort in it. We’ve been here in 1999, in 2010, in 2014. The capability arrives years before anyone figures out how to make durable money from it, the early returns disappoint, and the doom crowd calls the whole thing a bubble. The early failures aren’t proof the technology is fake. They’re tuition. Somebody always pays it, and the people who understand the mechanism pay less.
And we’re not the only ones marking this book now, which is the real signal. The convergence this week came from the top of the industry. Microsoft’s Brad Smith spent the week amplifying the venture investor Gordon Ritter’s argument that “the lasting value of AI won’t reside in any single model” (“above the model,” Ritter calls it), and Satya Nadella reposted it. When the president of Microsoft and the analyst class start saying the model isn’t where the value lives, that’s not us being clever anymore. That’s consensus arriving at the address we’ve been writing from for a month. And over in the routing layer, the Implicator briefing described Sakana’s Fugu the same way we did on the 23rd in Central Casting: a black-box router, a new layer of judgment sitting above the models that buyers can’t yet see. The whole field is converging on the same map. The value moved up a floor, and it’s still judgment.
What You Own
So mark your own book tonight. Take every AI position you’re holding and reprice it honestly. Are you paying a frontier premium for a job a free model already does? That position is underwater and you just don’t know it yet. Are you betting your edge on having the smartest model? That’s a depreciating asset on a two-month clock. Or do you own something the commoditization clock can’t touch — proprietary data, a real audience, and above all the ability to look at the machine’s work and know whether it’s right?
That last one is the only position worth holding, because it’s the only one that gets more valuable as the models get cheaper. When intelligence is free, knowing what to do with it is everything.
The model was never the moat. Reading the tape was.
— Harry and Anthony
Sources
- Ensemble Health swaps to a model 23× cheaper; 65% open-source tokens — The AI Brief / AI Ready Show, reporting The Information, Jun 23-24, 2026
- 70% of customer-service AI deployments see ROI in 60 days; outcome-based pricing — ZDNET, Jun 24, 2026
- OpenAI unveils Jalapeño, its first in-house chip, with Broadcom — Gizmodo, Jun 24, 2026
- Micron, SanDisk lead 7.9% chip selloff; Kospi circuit breaker — Reuters / Bloomberg via Implicator.ai, Jun 24, 2026
- MIT NANDA: 95% of enterprise GenAI pilots show no measurable P&L impact — Fortune, Aug 2025
- Brad Smith / Gordon Ritter “above the model,” reposted by @satyanadella — X, Jun 24, 2026
- Sakana Fugu as the unseen router — Implicator.ai, Jun 24, 2026
- CO/AI — Buy Wins, Not Players, Jun 11, 2026
- CO/AI — Show Me Where to Put the Fulcrum, Jun 16, 2026
- CO/AI — Coffee’s for Closers, Jun 17, 2026
- CO/AI — Central Casting, Jun 23, 2026
- CO/AI — The Dr. House of AI, Jun 9, 2026