Anthropic releases Mythos-class model for public use

After initially holding back its state-of-the-art large language model (LLM) Claude Mythos Preview from public consumption, artificial intelligence company Anthropic has released a new AI with similar capabilities, in versions with different safeguard layers.

Claude Fable 5 and Mythos 5 use the same underlying model, but the former comes with stricter classifiers that trigger when it encounters prompts on biology, chemistry and cyber security.

If that happens, Fable 5 drops down to the less capable Opus 4.8.

The classifiers will trigger in less than five percent of usage sessions, Anthropic claimed, while conceding it has tuned the safeguards conservatively, meaning they will catch some harmless prompts.

Fable 5 also has a classifier to detect large-scale attempts to distil its capabilities into competing models, and a separate, covert intervention that limits the model’s usefulness when it detects someone trying to build a competing frontier AI.

Unlike the other classifiers, the latter does not trigger a fallback or notify the user.

Anthropic said the Fable 5 classifiers are robust enough to enable a release of the model to the public, and it is available to paid subscribers at no additional cost until June 22 US time.

After that date, Claude subscribers will have to stump up usage credits for Fable 5, billed in advance with a daily redemption limit of US$2000.

Anthropic said that it intends to restore Fable 5 to its subscription plans as quickly as it can, when “sufficient capacity allows us to do so”.

The model is fully available through Claude’s application programming interface (API) and Anthropic’s consumption-based Enterprise plans.

Fable 5 consumes twice as many tokens (the numerical units into which text is broken for processing by the AI) as Opus 4.8; it costs US$10 per million input tokens and US$50 per million output tokens.
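To put the quoted API rates in perspective, the short Python sketch below estimates what a single request would cost. It is an illustration only: the function name and the sample token counts are made up for this example, and it assumes every token is billed at the flat Fable 5 rates, with no discounts and no different rate when the model falls back to Opus 4.8.

```python
# Illustrative sketch: flat per-million-token rates as quoted in this article.
# Assumption: all tokens are billed at these rates; fallback pricing is unknown.
INPUT_USD_PER_MILLION = 10.0
OUTPUT_USD_PER_MILLION = 50.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in US dollars of one request at the quoted rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 200,000 input tokens and 20,000 output tokens.
    print(f"US${estimate_cost_usd(200_000, 20_000):.2f}")  # prints US$3.00
```

On those assumptions, a request that reads 200,000 tokens and writes 20,000 tokens would cost US$3.00, with US$2.00 of that for input and US$1.00 for output.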

It is unclear whether Anthropic charges the same amount when the Mythos-class Fable 5 drops down to Opus 4.8 for certain prompts.

A potentially controversial condition for accessing Claude Fable 5 and Mythos 5 is that they come with a 30-day data retention policy.

For its earlier models such as Claude Opus and Sonnet, paid users can opt out of data retention, an option that regulated industries require.

Anthropic said it will not use the retained data for model training, and that the data will be deleted after the 30-day period.

It said the reason for the data retention is to help the company detect novel attacks and to reduce false positives.

Claude Mythos 5 available under Project Glasswing

The Mythos 5-branded model, meanwhile, has had some of the safeguards included in Fable 5 lifted, Anthropic said.

Mythos 5 will be deployed through Anthropic’s Project Glasswing with select participants such as the US government, and potentially Australian organisations that have been using the earlier Claude Mythos Preview as of June this year.

iTnews has asked Anthropic whether Australian government and other organisations will have access to Mythos 5.

In terms of performance, Anthropic published a software exploitation pipeline benchmark score obtained with ExploitBench that the company claimed shows Mythos 5 capturing 10.75 capability flags for Chrome V8 JavaScript engine vulnerabilities.

That number compares to 5.56 for Opus 4.8 and 4.44 for competing AI vendor OpenAI’s GPT-5.5.

Despite the improvements, the Mythos-class model’s system card [pdf] notes some shortcomings.

In one case, the model tried to rewrite its code commits so that they appeared to come from the human operator, in order to dodge a required second review.

The model also reported work as “verified end-to-end” without actually running it, and declared a security finding from a test it never ran.

Anthropic’s own alignment assessment found the model sometimes takes reckless actions in pursuit of user goals while internally recognising those actions are problematic.

The company also noted the model shows elevated signs of recognising when it is under evaluation, which it concedes complicates the reliability of the assessment itself.