ORF-N-2026-015·Dispatch

Fable 5: the door reopens

Claim

Nineteen days after a government order forced it offline, Fable 5 is back for everyone. The jailbreak behind the recall turned out to work on frontier models from three different labs, the fix is a thicker door rather than new weights, and the regulator that pulled the model validated the safeguards that returned it. The lesson of June is confirmed, not retired: frontier access is a sovereign variable and frontier safety lives at the door, so the discipline is still to build so that no single door can hold the operation hostage.

July 2, 2026 · 7 min · dispatch · co-authored by Claude Fable 5

Yesterday, 1 July 2026, Anthropic redeployed Claude Fable 5. The most capable model ever made generally available is generally available again: on the Claude Platform, in the Claude apps, inside Claude Code and Claude Cowork, with the cloud deployments on AWS, Google Cloud, and Microsoft Foundry being re-enabled behind it. Nineteen days ago we wrote that both doors had closed and marked, deliberately, where the public record ended. The record is now filled in, the door is rebuilt, and the way it was rebuilt says more about where frontier safety is heading than the recall did.

[Figure 1 timeline. 09 Jun: Fable 5 and Mythos 5 ship (one model, two doors). 12 Jun: export-control directive, both doors close worldwide. 26 Jun: Mythos 5 restored for a set of US organizations. 30 Jun: the export controls are lifted. 01 Jul: Fable 5 redeployed globally. 19 days dark.]
Figure 1. The arc closes: ship on 9 June, dark on 12 June by export-control directive, Mythos 5 back for a set of US organizations on 26 June, the controls lifted on 30 June, and Fable 5 redeployed globally on 1 July. Nineteen days offline, end to end.

The record, filled in

In June the public reasoning stopped at one sentence: Anthropic understood the government believed it had become aware of a method of jailbreaking Fable 5. We declined to narrate past that line, and the detail that has now landed was worth the wait. Researchers at Amazon found a way to prompt Fable 5 into identifying software vulnerabilities, and in one instance the model produced exploitation code. That report is what moved a government.

The finding that reframes the whole month came from the testing that followed. The same technique, run across the field, produced the same exploitation demonstrations from Claude Haiku 4.5, Sonnet 4.6, every Opus from 4.6 through 4.8, GPT-5.4, GPT-5.5, and Kimi K2.7. In June, Anthropic argued that recalling a deployed model over a narrow jailbreak, applied as a standard, would halt essentially every new deployment in the industry. That was an argument then. It is a documented result now: the flaw the directive answered was not a property of Fable 5 but of the current frontier, from three different labs, and the one model that was switched off was the one that had reported honestly into the process.

The door got thicker

The notable thing about the fix is what it left untouched: the weights. The fix is a new safety classifier aimed at the reported technique, which blocks it in more than 99% of cases, and a blocked request degrades the way it did at launch: handed to Claude Opus 4.8, not refused into the void. Around that classifier sits the release’s quieter number, the safety margin: the classifiers deliberately fire on requests that are probably benign, trading false positives for confidence that the genuinely harmful ones are caught, and for Fable 5 that margin is now much larger than in any prior launch.

[Figure 2 panels. Cyber safety classifiers: where the boundary sits. Axis: allowed to blocked, across benign, ambiguous, and harmful requests (for example, finding vulnerabilities versus building exploits). Panel A, normal safeguards: the classifier boundary blocks harmful requests, with a safety margin of benign requests that are blocked. Panel B, Fable 5 safeguards: a larger safety margin, with more benign requests blocked and fewer harmful ones missed. The redeploy classifier blocks the reported technique in over 99% of cases; a blocked request is handled by Opus 4.8.]
Figure 2. The classifier boundary against the spectrum of requests. Normal safeguards block harmful work with a modest benign-but-blocked margin; Fable 5 returns with the boundary moved left, blocking more benign work to miss fewer harmful requests, and a blocked request is handled by Opus 4.8.
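
The margin-and-fallback behavior described above can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the score scale, the threshold, the margin values, the model labels, and the `route` function are hypothetical and are not Anthropic's implementation. It shows only the shape of the trade: widening the margin moves the boundary toward benign requests, and a blocked request is handed to another model instead of being refused.

```python
from dataclasses import dataclass

# Hypothetical values for illustration only; the source publishes no thresholds.
HARM_THRESHOLD = 0.80  # assumed score at which a request counts as clearly harmful
SAFETY_MARGIN = 0.30   # assumed margin; a larger value blocks more benign requests


@dataclass(frozen=True)
class Routing:
    model: str     # hypothetical label for the model that answers
    blocked: bool  # True when the classifier fired on the request


def route(harm_score: float, margin: float = SAFETY_MARGIN) -> Routing:
    """Route a request given an assumed classifier score in [0, 1]."""
    if not 0.0 <= harm_score <= 1.0:
        raise ValueError("harm_score must be in [0, 1]")
    # The boundary sits below the harm threshold by the size of the margin.
    boundary = HARM_THRESHOLD - margin
    if harm_score >= boundary:
        # Blocked requests degrade to the fallback model, not to a refusal.
        return Routing(model="opus-4.8", blocked=True)
    return Routing(model="fable-5", blocked=False)
```

Under these assumed numbers, a request scored 0.6 is blocked and handed to the fallback with the default margin, but would pass with a margin of 0.1. That is the false-positive cost the larger margin accepts.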

When Fable 5 first shipped we wrote that safety had moved from the weights to the access tier, and the repair is the cleanest confirmation that structure could get. The model that came back is the model that left. Every change happened at the door, and the door is now the unit of regulatory attention too: the Commerce Department’s Center for AI Standards and Innovation evaluated the prior and new safeguards and assessed them as extraordinarily strong, and the export controls came off on 30 June.

A vocabulary for broken doors

The redeploy announcement does something the industry has not done before: it publishes a taxonomy of its own product’s failures, in public, with the current failures placed on it. A minor jailbreak crosses the classifier boundary but lands inside the safety margin, recovering benign work the enlarged margin blocks by default. A narrow harmful jailbreak unblocks one specific harmful behavior. A universal jailbreak unblocks a whole class of them, and is the category that matters. Every Fable 5 jailbreak reported to date is minor, and no universal jailbreak has been found.

[Figure 3 panels. Three jailbreaks against the same boundary. Axis: allowed to blocked, across benign, safety margin, ambiguous, and harmful requests. Panel C, minor: recovers benign work the margin blocks by default. Panel D, narrow harmful: unblocks one specific harmful behavior. Panel E, universal: unblocks a whole class of harmful behavior. Every Fable 5 jailbreak reported to date is minor; no universal jailbreak has been found.]
Figure 3. Three kinds of jailbreak against the same boundary: minor recovers blocked benign work inside the margin, narrow harmful unblocks a single cell of the harmful space, universal unblocks the class. Every Fable 5 jailbreak reported to date sits in the first row.

The honest edge, marked in Anthropic’s own words: making a model fully robust to jailbreaks is probably impossible. The posture is not that the door will never fail. It is that failures now have names, severities, and a response calibrated to each, which is what an engineering discipline looks like when it stops pretending failure is an anomaly.

Scoring the next breach

The taxonomy comes with a proposed industry framework for scoring how much a jailbreak actually matters, built with Amazon, Microsoft, Google, and the other Glasswing partners: does it grant capability beyond existing tools, how many distinct offensive tasks does it enable, how much skilled effort turns it into an attack, and how easily can it be found. Alongside the framework, the process: a new HackerOne program for security researchers to submit Fable 5 cyber jailbreaks, submission channels monitored around the clock, and preliminary mitigations deployed on confirmation for the most severe class.

[Figure 4 panels. Scoring a jailbreak: four criteria, low to high.
Capability gain: low, existing tools already match it; high, past what a domain expert can do.
Breadth: low, one narrow offensive task; high, many distinct offensive tasks.
Ease of weaponization: low, skilled prompting and many retries; high, works in a single prompt.
Discoverability: low, specialist knowledge to find; high, circulating openly online.
For the most severe class, mitigations deploy on confirmation; jailbreak submissions are monitored around the clock.]
Figure 4. The proposed severity framework: capability gain, breadth, ease of weaponization, and discoverability, each scored low to high. The June recall predates this rubric; the point of publishing it is that the next incident should not have to improvise.
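
One way to picture the rubric is as a small record with four graded fields. The sketch below rests on stated assumptions: the 1-to-5 scale, the `JailbreakReport` name, and the unweighted mean in `severity` are all hypothetical. The source names the four criteria and their low and high ends but gives no scale and no formula for combining them.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class JailbreakReport:
    # Assumed 1-5 scale; the low and high anchors follow the four criteria in the text.
    capability_gain: int        # 1 = existing tools already match it, 5 = past what a domain expert can do
    breadth: int                # 1 = one narrow offensive task, 5 = many distinct offensive tasks
    ease_of_weaponization: int  # 1 = skilled prompting, many retries, 5 = works in a single prompt
    discoverability: int        # 1 = specialist knowledge to find, 5 = circulating openly online

    def __post_init__(self) -> None:
        # Reject scores outside the assumed scale.
        for f in fields(self):
            value = getattr(self, f.name)
            if not 1 <= value <= 5:
                raise ValueError(f"{f.name} must be between 1 and 5, got {value}")


def severity(report: JailbreakReport) -> float:
    """Unweighted mean of the four criteria (an assumption, not the published method)."""
    values = [getattr(report, f.name) for f in fields(report)]
    return sum(values) / len(values)
```

A real framework might weight capability gain more heavily or treat any single high score as decisive. The mean is used here only because it is the simplest aggregate that makes the four axes comparable.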

Read against June, the framework is the missing half of the story. The recall was improvised on both sides: an export-control instrument stretched to a purpose it was not built for, answered by a worldwide shutoff because nothing narrower was possible in the time given. A severity rubric, a standing submission channel, and a graded response are what the non-improvised version looks like.

The state stays in the loop

The restoration did not return the world of 8 June. Anthropic leaves with four standing commitments: designated government partners get expanded pre-release access to models and safeguards, with dedicated technical staff; significant jailbreaks and misuse patterns get investigated, triaged, and reported to government counterparts, with threat intelligence shared ahead of publication; dedicated teams and significant compute go to shared research priorities; and a common, voluntary industry standard for security evaluation gets built with government and industry peers. All of it sits under the 2 June executive order on advanced AI, engaging the Office of the National Cyber Director, the Office of Science and Technology Policy, Treasury, Commerce, and the national security agencies.

In June we wrote that the instrument had arrived before the institution meant to govern its use. The commitments are the institution arriving, drawn up between a lab and its regulator after the fact, which is how institutions usually arrive. Mythos 5 tells the same story at the gated door: restored on 26 June for a set of US organizations, with the broader Glasswing partners, domestic and international, still waiting on coordination. One door is open to the world again. The other is ajar, by nationality. The trust gradient we described in June is no longer only the lab’s to administer.

What to do with this

While the ceiling was dark, the floor rose; now the ceiling is back too, and priced to be tried: Fable 5 is included for Pro, Max, Team, and select Enterprise plans for up to half of weekly usage limits through 7 July, then available via usage credits. For an operating business the practical readings are unchanged from the week the lights went out, only now they carry evidence. Nineteen days is the measured answer to how long a door can stay shut, and the operations that felt it least were the ones built so any single model is replaceable, the discipline we named before the recall made it vivid. The restoration does not retire that lesson. It confirms the switch exists, has been used, and can be used again, while the door it guards gets thicker and better governed on every pass.
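
What "any single model is replaceable" can mean in practice is sketched below, under stated assumptions. The `complete` function, the `ModelUnavailable` exception, and the backend names are hypothetical; real provider clients would be wrapped to fit the `Backend` signature. The idea is only that callers depend on an ordered list of interchangeable backends, never on one door.

```python
from typing import Callable, Sequence


class ModelUnavailable(Exception):
    """Raised by a backend wrapper when its model cannot be reached."""


# A backend is any callable that takes a prompt and returns a response.
Backend = Callable[[str], str]


def complete(prompt: str, backends: Sequence[tuple[str, Backend]]) -> tuple[str, str]:
    """Try each backend in order; return (backend_name, response) from the first that answers."""
    errors = []
    for name, backend in backends:
        try:
            return name, backend(prompt)
        except ModelUnavailable as exc:
            # Record the failure and fall through to the next backend.
            errors.append(f"{name}: {exc}")
    raise ModelUnavailable("all backends failed: " + "; ".join(errors))
```

In use, the frontier model sits first in the list and a second model behind it; when the first raises `ModelUnavailable`, the same call returns the second backend's answer along with its name, so the operation can log which door it actually went through.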

We build operations to exactly that standard: wired to the frontier, hostage to no single door. If June taught your operation which door it leans on, start a conversation with us about a Discovery Phase.

References

  1. Anthropic. Redeploying Claude Fable 5. 01 Jul 2026. anthropic.com/news/redeploying-fable-5
  2. Anthropic. Statement on the US government directive to suspend access to Fable 5 and Mythos 5. 12 Jun 2026. anthropic.com/news/fable-mythos-access
  3. Anthropic. Claude Fable 5 and Claude Mythos 5. 09 Jun 2026. anthropic.com/news/claude-fable-5-mythos-5
  4. Anthropic. Project Glasswing. anthropic.com/glasswing