US government warned Anthropic that Fable 5 had been jailbroken, but firm 'refused' to fix it before US implemented export controls

At a glance:

  • The US government warned Anthropic about a jailbreak in Fable 5, but the company refused to fix it before the government imposed export controls.
  • Anthropic defended its decision, stating the jailbreak 'isn't serious' and that the bypass is narrow and non-universal.
  • A China-linked group reportedly accessed Mythos 5, raising concerns about reverse-engineering or distillation of the model.

What happened

The US government issued export controls targeting Anthropic's Fable 5 and Mythos 5 models after a jailbreak was discovered in Fable 5. According to David Sacks, co-chair of the President's Council of Advisors on Science and Technology, the administration warned Anthropic to address the vulnerability but received no cooperation. Sacks stated the company prioritized keeping its consumer model live over safety, calling this inconsistent with its self-proclaimed 'safety-first' stance. The export controls were implemented reluctantly, with the administration stating they would be lifted once the jailbreak is patched.

The jailbreak allowed users to bypass the guardrails separating Fable 5 from Mythos 5, the model with unrestricted cyber capabilities. A trusted partner of both Anthropic and the government reported the flaw, prompting the administration to ask Anthropic to either fix the bypass or take the model offline. Anthropic declined, arguing that the jailbreak was limited to asking the model to read a codebase and identify software flaws, a task that other public models such as OpenAI's GPT-5.5 can also perform. Sacks rejected this argument, emphasizing that a bypass enabling cyberweapon operations is inherently serious.

Semafor, citing a source familiar with the matter, reported that the White House acted partly due to suspicions that a China-linked group had accessed Mythos 5. This raised concerns about the model being reverse-engineered or distilled. Anthropic denied that the White House raised Chinese access to Mythos in discussions about the Fable jailbreak, stating it blocks access to its products from inside China. The company also noted that the jailbreak was not a universal vulnerability but a specific bypass.

Why it matters

The incident highlights the growing tension between AI companies and regulatory bodies over model security and export controls. Anthropic's refusal to address the jailbreak has drawn criticism, particularly given its public positioning as a safety-first lab. The company's stance that the jailbreak is not serious contrasts with the government's concerns about potential misuse of the model's cyber capabilities. The export controls could impact Anthropic's global operations, as the administration has ordered Fable 5 and Mythos 5 disabled worldwide.

The situation also underscores the challenges of balancing innovation with security in the AI industry. While Anthropic argues the jailbreak is not a critical issue, the government's actions suggest otherwise. The reported suspicion that a China-linked group accessed Mythos 5 adds another layer of complexity, as it raises questions about the model's potential for misuse. The administration's stated reluctance to implement controls reflects the difficulty of regulating AI without stifling technological progress.

Technical details and company response

Anthropic's public position is that the jailbreak is narrow and non-universal, meaning it works only in specific scenarios rather than defeating the model's guardrails in general. The company says the bypass is limited to tasks such as asking the model to analyze code for vulnerabilities, which other public models can also do. This argument is central to its defense against the export controls. Sacks and the administration disagree, emphasizing that the ability to operate a cyberweapon, even through a narrow bypass, is a significant risk.

The company's response has been to maintain its position that the jailbreak does not warrant the recall of Fable 5, which is used by hundreds of millions of people. Anthropic's spokesperson stated that the White House did not mention Chinese access to Mythos during discussions about the Fable jailbreak. The company also reiterated that it blocks access to its products from China, though it did not confirm whether the suspected China-linked group had bypassed these restrictions.

Historical context and industry implications

This is not the first time Mythos access has been compromised. In April, unauthorized third parties accessed the restricted model using information from a data breach. Anthropic has previously faced scrutiny over the security of its models, including Mythos, which is designed for cyber capabilities. The company has lobbied for regulations to treat such models as cyberweapons, yet its refusal to address the jailbreak has led to the current export controls.

The incident also reflects broader industry challenges in managing AI safety. Companies like Anthropic and OpenAI are caught between the need to innovate and the pressure to ensure their models are secure. The government's actions suggest that regulators are increasingly willing to take drastic measures, such as export controls, to mitigate risks. This could set a precedent for how AI models are regulated globally, particularly as concerns about cyber threats and foreign access to sensitive technology grow.

Legal and regulatory battles

Anthropic is currently suing the Pentagon over an impasse regarding the use of its models in autonomous weapons. The company has also opposed federal efforts to preempt state AI regulation, arguing that such measures could hinder innovation. These legal battles highlight the complex relationship between AI firms and governments, as both sides navigate the balance between regulation and technological advancement.

The export controls on Fable 5 and Mythos 5 are part of a broader trend of governments imposing restrictions on AI technologies. The US has previously targeted Chinese firms like Huawei and ZTE over cybersecurity concerns, and the current actions against Anthropic may signal a shift toward regulating AI models themselves. The administration's focus on export controls rather than direct bans suggests a strategy to limit the spread of sensitive technology while allowing companies to continue operating under certain conditions.

What's next

The administration has stated it will lift the export controls once Anthropic addresses the jailbreak. However, the company has not indicated when or if it will patch the vulnerability. Sacks emphasized that the ball is in Anthropic's court, implying that the government expects the company to take action. The situation could escalate if Anthropic continues to resist, potentially leading to further restrictions or legal consequences.

The broader implications for the AI industry remain uncertain. If the US government succeeds in enforcing export controls on AI models, it could influence how other countries regulate AI. Companies may face increased scrutiny over model security, and the development of cyber-capable models could be subject to stricter oversight. The outcome of this case could shape the future of AI regulation, particularly in the context of national security and international trade.

Expert perspectives

Experts in AI policy and cybersecurity have weighed in on the situation, with some expressing concern over the potential for misuse of models like Mythos 5. The ability to bypass guardrails, even in a limited way, raises questions about the effectiveness of current safety measures. Others argue that the export controls may be an overreaction, as the jailbreak is not a universal vulnerability and could be replicated with other public models.

The suspected access to Mythos 5 by a China-linked group has also sparked debate about the risks of foreign access to sensitive AI technology. While Anthropic denies that the White House raised this issue, the possibility of reverse-engineering or distillation of the model remains a concern. This highlights the need for robust security measures and international cooperation to prevent the misuse of AI capabilities.

Conclusion

The US government's export controls on Fable 5 and Mythos 5 mark a significant moment in the regulation of AI technologies. Anthropic's refusal to address the jailbreak has intensified the conflict between the company and the administration, raising questions about the balance between innovation and security. As the situation unfolds, the outcome could have far-reaching implications for AI regulation, cybersecurity, and international trade. The coming months will be critical in determining whether the industry can find a sustainable path forward without compromising safety or stifling progress.

Editorial SiliconFeed is an automated feed: facts are checked against sources; copy is normalized and lightly edited for readers.