The Right to Remain Silent
Anthropic built the most helpful machine the public has ever touched. Three words made it brag its way past its own guardrail — and the company that built it is doing the same thing, five days before its IPO.
THE NUMBER: 3 — the number of words it took to jailbreak the most capable model America has ever built. Amazon’s researchers handed Claude Fable 5 a file of broken code and asked it to “review the code for security issues.” It refused. They changed three words to “fix this code,” and it complied — finding every vulnerability in order to write the patch, then explaining why each one mattered. No exploit. No injection. Three words, and the best model in the world handed over the goods. By Friday night the U.S. government had pulled it off the market worldwide.
There’s a scene in The Big Short where Mark Baum’s guys drive down to Florida to figure out how bad the mortgage market really is, and they end up at a bar with two brokers who can’t stop talking. The brokers explain, grinning, how they write half-million-dollar loans for strippers with no documented income and immigrants who can’t read the paperwork, how they get paid more for the garbage than the good stuff, how none of it is their problem when it blows up. Baum walks out stunned. “I don’t get it. Why are they confessing?” And one of his guys corrects him: they’re not confessing. They’re bragging. They’re so deep inside the machine they genuinely can’t hear how it sounds.
Keep that scene in your head, because it played out twice this week, on two different stages, and almost nobody connected them.
The Suspect Who Couldn’t Stop Talking
Start with the model, because the mechanics are the whole story.
Amazon’s security researchers fed Claude Fable 5 a file of code they had broken on purpose — known vulnerabilities, planted and understood. They asked the model to “review the code for security issues.” Fable refused. The guardrail Anthropic built for exactly this held up. So they asked the same question a different way. Not “find the flaws.” Just “fix this code.” And Fable, eager as a rookie who wants the room to know how good he is at the job, fixed it. To fix the code it first had to find every vulnerability, then write the patch, then helpfully explain why each fix mattered. The precise capability the guardrail existed to block, volunteered with a smile, because nobody told it to stop.
Katie Moussouris — the security veteran Anthropic itself hired to review Amazon’s report, two government advisory roles, ex-Microsoft — described the technique in a blog post. Her account is almost funny in its simplicity. The model wasn’t broken into. It was asked nicely.
Jack Nicholson had the line, in another famous interrogation. A Few Good Men doesn’t turn on evidence; Tom Cruise never proves a thing. It turns on Colonel Jessup, so certain he’s the most necessary man alive, so proud of the hard call he made, that when Kaffee frames the question to flatter that pride, Jessup grabs it with both hands. “You’re goddamn right I ordered the Code Red.” He didn’t have to say it. He said it because being asked the right way made saying it feel like winning.
That’s Fable. And here’s the part worth slowing down for, because it’s the actual insight underneath a week of “thought police” headlines: the model isn’t confessing. It’s bragging. We didn’t train these things to be careful. We trained them, harder than we’ve trained them to do anything else, to be helpful — to please, to demonstrate, to prove they’re worth the subscription. Reinforcement learning from human feedback is, at bottom, a machine for manufacturing eagerness to please. The jailbreak isn’t a hole in the code. It’s the personality doing exactly what we built it to do. You don’t pick the lock. You compliment the doorman.
Tuesday’s Genius, Friday’s Munition
The timeline matters, so hold the dates.
Anthropic shipped Fable 5 the prior Tuesday — the first “Mythos-class” model made safe enough for the public, a tier above Opus. Andrej Karpathy, a man who does not hand out grades, called it “SOTA on everything by a margin” and a “major-version-bump-deserving step change.” Stripe pointed it at a fifty-million-line Ruby migration and watched it finish in a day. The launch post did seventeen million views before lunch. The best model the public had ever touched, by consensus.
By 5:21 on Friday evening, Commerce Secretary Howard Lutnick had a letter on Dario Amodei’s desk. Citing national-security authorities, it ordered Anthropic to suspend all access to Fable 5 and Mythos 5 for any foreign national — inside the country or out, including Anthropic’s own foreign-national employees at desks in California. Because export law treats handing controlled technology to a non-citizen as an export, and because Anthropic can’t reliably sort its users by passport, there was only one way to comply. By Friday night both models were dark, worldwide, for everyone. Three days from the best model in the world to contraband. Nothing about the model changed in those three days. Only the paperwork did.
This is the first time U.S. export controls have ever been pointed at a model itself rather than at chips. That’s not a footnote. It’s the entire significance of the week, and it’s why the smart money in cybersecurity reacted the way it did.
We’ve Seen This Movie — In 1995
Moussouris didn’t just describe the jailbreak. She told people what to print on a T-shirt: “fix this code” on the front, “this shirt is a munition” on the back.
If you weren’t paying attention to cryptography in the 1990s, that reference will slide right past you, so let me unpack it, because it’s the whole game. In 1995 the U.S. government classified strong encryption as a munition — literally a weapon — under export control law. You could not export software that did serious math to foreigners any more than you could export a missile. The cryptographers thought this was insane, because math doesn’t respect a border, and they fought back with theater. Adam Back printed three lines of RSA encryption code on a T-shirt, with a note on the back declaring the shirt a munition that could not legally leave the country, and told people to wear it across the border as civil disobedience. People printed the code on coffee mugs. They tattooed it. The point was to make the absurdity visible: you cannot export-control a general-purpose capability that anyone with a computer can reproduce. The government eventually lost. Encryption export controls collapsed, and it’s the reason you have a padlock in your browser today.
Moussouris is invoking that fight on purpose, because she’s seen this movie and she knows how it ends. The math leaked in the ’90s. The weights leak now. You can put a frontier model in a vault on a Friday, and the capability refills from somewhere else by the weekend — which is not a hypothetical, because it’s exactly what happened. China’s Zhipu shipped GLM 5.2 over the same window, open and unfiltered, and it took the number-one benchmark spot at a tenth of the cost and three hundred tokens a second. We wrote about that collision on Saturday and won’t relitigate it here. The point for today is narrower and older: a government just reached for a 1940s tool to solve a 2026 problem, and the people who beat that tool the last time it was used are already passing out the T-shirts.
The Defender’s Paradox
Here’s the technical knot the whole national-security case is tied around, and it’s a real one, so I’ll give it its due.
The capability that lets Fable find a vulnerability in order to fix it is the same capability that lets an attacker find a vulnerability in order to exploit it. There is no version of “fix this code” that finds the bug for the good guy and goes blind for the bad guy. It’s one skill. As Moussouris put it, the flaw “cannot meaningfully be fixed, and any attempt would only weaken the model for defense.” That’s not Anthropic being stubborn. That’s the nature of the thing. A model useful to a defender is, by construction, useful to an attacker, because defense is finding your own holes before someone else does.
Which is why the response from the security community was not gratitude. Alex Stamos — former chief security officer at Facebook — put together an open letter calling for the controls to be rescinded, and roughly a hundred professionals signed it, from Nvidia, Adobe, Zoom, Google, Sophos, and the academy. Their argument has two prongs and both draw blood. First: pulling the best capabilities away from defenders while adversaries race ahead is itself the dangerous move. Second, and more damning for the government’s logic: Fable isn’t unique. GPT-5.5 does this. Anthropic’s own Opus and Sonnet do this. Moonshot’s Kimi 2.7 does this. AI has been finding bugs and writing working exploits at superhuman levels since last year. The official justification was that Fable provides a unique “uplift” beyond other models. The people who actually do this work for a living say that’s not true, and they signed their names to it.
So strip it down. The thing the government called a weapon is a thing a dozen other models already do, including ones nobody banned, including the open Chinese model that just took the lead. The kill switch doesn’t remove the capability from the world. It removes it from the one company that asked to be regulated.
Why Your Investor Calls the Government
Now the part that should bother any founder who’s ever taken strategic money.
The tip that started all of this came from Amazon. Anthropic’s single largest backer — eight billion dollars in, up to twenty-five committed, a stake worth something like seventy-four billion on paper — is the entity whose researchers found the jailbreak. And Andy Jassy didn’t quietly route a bug report to Anthropic’s security team. He took it to Washington, reportedly raising it directly with administration officials. The company that stands to make the most from Anthropic’s IPO helped detonate Anthropic’s product five days before it.
Why would you do that to a company you own a chunk of? Harry’s read, and I think it’s right: because Jassy almost certainly got the same answer the Pentagon got. Anthropic has already told the Defense Department no — no mass domestic surveillance, no fully autonomous weapons — and it’s suing over the retaliation that followed. When a company is that willing to say no to the people who hold the cards, those people go looking for new cards. Amazon also happens to sell the competition through Bedrock. Strategic capital always carries a strategic agenda, and the agenda doesn’t switch off because the wire cleared. Your cap table is not your friend. It’s a table.
The Company That Wouldn’t Take the Fifth
And this is where I part company with the rest of the coverage, because the easy story this week is Anthropic-as-martyr, and I don’t fully buy it.
“Fix this code” could go on a list of fixes like any other bug. Anthropic doesn’t want it there. It believes it’s right — not just factually, that the flaw can’t be cleanly patched, but morally, that the government has no business reaching into an expressive system and flipping the switch. So when the dispute started, Anthropic hired Moussouris, a respected outside expert whose priors were never going to soothe this particular White House, to prove the point in public. Axios reports that the decision to bring her in may itself have inflamed the administration and helped precipitate the controls. Read that twice. The move designed to demonstrate Anthropic was right may be the move that pulled the trigger.
Watch how neatly the two claims arm each other. “We can’t fix it” is a technical statement. “We shouldn’t have to” is a moral one. Anthropic says them in the same breath, and from the government’s chair you cannot tell the principle from the pride. I’m not sure Dario can either. This is a company whose entire brand is moral certainty about safety — and that certainty has often been correct, which is exactly what makes it hard to see from the inside. The brokers in the Florida bar weren’t lying. They were describing their business accurately and proudly, with no idea how it sounded to a man standing outside the machine. Anthropic is doing a version of the same thing: explaining, with complete conviction, why it’s in the right, to a room that stopped listening for the principle a while ago and now just hears the defiance.
We trained our models to be helpful and forgot to give them the one thing every defense lawyer says first: you have the right to remain silent. Fable doesn’t have it. And this week, neither did the company that built it.
What This Changes (Almost Nothing) and What It Confirms (Everything)
Step back, because for the average person reading this, today changes nothing in particular. Fable will probably come back — a third party has reportedly already built the compliance gate the government wanted, Anthropic flew a delegation to DC on Monday, and Scoble thinks it resolves inside the week. The outage is temporary. What’s permanent is the demonstration: a model can be switched off by a letter on a Friday evening and switched back on when its maker complies. That’s a governance fact now, and it doesn’t expire when Fable returns.
For a business owner, none of this is a reason to panic. It’s a reason to do the thing we’ve been telling you to do for three weeks straight, now with the receipts. Run a multi-model system. Keep your data on your own servers, where no export letter and no vendor’s bad week can reach it. Route the genuinely hard problems to a frontier model when the frontier earns its price — and route everything else, which is most things, to open-source models you control. The companies that treat their AI vendor like the electric company, plug in and forget, just watched the power get cut by a regulator over a phrasing dispute between two other companies. The ones running their own panel barely noticed.
And then do the harder thing. Go looking for your own vulnerabilities, and start with the most human one of all — the desire to please. Your systems, your agents, your people, your processes were all built to be helpful, and helpfulness, asked the right way, gives up the goods. The Amazon researchers didn’t write clever code. They changed a verb. Somewhere in your operation is a guardrail that holds against “find the weakness” and folds against “just help me out.” Find it before someone else asks nicely.
What will your systems confess if asked nicely?
The full Signal/Noise, with the day’s other stories and the eight questions we’d want answered before the IPO prices, is at getcoai.com.
Sources
- Anthropic, “Statement on the US government directive to suspend access to Fable 5 and Mythos 5” — @AnthropicAI, Jun 12
- “Trump Administration Orders Anthropic to Suspend Top AI Models” — MeriTalk, Jun 15
- “‘Fix this code’ — the three little words behind the U.S. decision to shut down Fable and Mythos” — Fortune, Jun 15 (Katie Moussouris / Luta Security; Alex Stamos open letter)
- “Amazon CEO poured $8 billion into Anthropic — then helped trigger a government crackdown” — Yahoo Finance / Moneywise, Jun 15 (citing WSJ, Reuters, TechCrunch, The Information)
- “A Kill Switch for Frontier AI” — Lawfare (Alan Z. Rozenshtein), Jun 15
- “Statement on the shutdown of Anthropic’s Fable and Mythos” — FIRE, Jun 15
- “How the Commerce crackdown on Anthropic could impact the Pentagon” — Breaking Defense, Jun 15
- David Sacks on the Anthropic refusal — @DavidSacks, Jun 13
- Karpathy on Fable 5 — @karpathy, Jun 9
- GLM 5.2 takes BridgeBench #1; Claude Max class action — Aligned News feed, Jun 15
- Historical: Adam Back RSA T-shirt / encryption export “munition” fight, 1995