An Anthropic spokesperson confirmed that the new model represents "a step change" in AI performance and is "the most capable we’ve built to date." Currently being trialed by "early access customers," the next-generation model is poised to significantly advance capabilities in reasoning, coding, and, notably, cybersecurity. The leak itself, however, the result of "human error" in the configuration of a content management system (CMS), has cast a shadow over the company’s otherwise careful approach to AI safety and responsible deployment.
Detailed descriptions of the model, along with other sensitive internal documents, were inadvertently stored in a publicly accessible data cache. These materials, including what appeared to be a draft blog post, were reviewed by Fortune and subsequently analyzed by cybersecurity researchers Roy Paz of LayerX Security and Alexandre Pauwels of the University of Cambridge. Their analysis revealed close to 3,000 unpublished assets linked to Anthropic’s blog that were nonetheless publicly discoverable. Upon being notified by Fortune on Thursday, Anthropic promptly rectified the security lapse, removing public access to the data store.
The leaked draft blog post, available in an unsecured and publicly searchable data store prior to its removal, explicitly named the new model "Claude Mythos" and described a new tier of AI models to be branded "Capybara." The document stated, "‘Capybara’ is a new name for a new tier of model: larger and more intelligent than our Opus models—which were, until now, our most powerful." It appears that "Capybara" and "Mythos" refer to the same underlying model, signaling Anthropic’s intention to introduce a significantly more potent offering than its current flagship, Claude Opus.
Anthropic, founded by former OpenAI researchers who prioritized AI safety, currently organizes its models into three tiers: Opus (largest, most capable), Sonnet (faster, cheaper, less capable), and Haiku (smallest, cheapest, fastest). The leaked information indicates that the Capybara tier will sit above Opus, representing a new pinnacle of performance but also a higher operational cost. The draft blog highlighted Capybara’s superior performance, stating, "Compared to our previous best model, Claude Opus 4.6, Capybara gets dramatically higher scores on tests of software coding, academic reasoning, and cybersecurity, among others." This suggests a substantial leap in cognitive and practical capabilities, pushing the boundaries of what large language models can achieve.
In response to inquiries about the leaked draft, Anthropic acknowledged the development and testing of this advanced model. A spokesperson confirmed, "We’re developing a general purpose model with meaningful advances in reasoning, coding, and cybersecurity. Given the strength of its capabilities, we’re being deliberate about how we release it. As is standard practice across the industry, we’re working with a small group of early access customers to test the model. We consider this model a step change and the most capable we’ve built to date." The company’s emphasis on a "deliberate" release strategy underscores its awareness of the model’s power and potential implications. The leaked document itself outlined a cautious rollout, beginning with a select group of early-access users, noting the model’s high running costs and its unsuitability for general release at this stage.
Significant New Cybersecurity Risks: A Double-Edged Sword
Perhaps the most striking revelation from the leaked documents concerns the new AI model’s unprecedented cybersecurity capabilities and the corresponding risks it poses. The draft blog post explicitly stated, "In preparing to release Claude Capybara, we want to act with extra caution and understand the risks it poses—even beyond what we learn in our own testing. In particular, we want to understand the model’s potential near-term risks in the realm of cybersecurity—and share the results to help cyber defenders prepare." This statement highlights Anthropic’s deep concern about the dual-use nature of such advanced AI.
The company’s internal assessment is stark: the system is "currently far ahead of any other AI model in cyber capabilities" and "it presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders." This translates into a profound worry that malicious actors could leverage Capybara/Mythos to execute large-scale, sophisticated cyberattacks, potentially overwhelming existing defensive mechanisms. The irony of a company built on AI safety principles grappling with a model that could accelerate cyber threats is not lost on observers.
Consequently, Anthropic’s planned release strategy for this model is reportedly focused on empowering cyber defenders. "We’re releasing it in early access to organizations, giving them a head start in improving the robustness of their codebases against the impending wave of AI-driven exploits," the draft blog post revealed. This proactive approach aims to equip cybersecurity professionals with the very tools that could be turned against them, allowing them to harden their defenses before widespread malicious exploitation becomes feasible.
This isn’t the first time a frontier AI model has raised such alarms. The latest generation of models from both Anthropic and its primary competitor, OpenAI, has crossed thresholds that necessitate new levels of caution. In February, OpenAI released GPT-5.3-Codex, which it classified as its first "high capability" model for cybersecurity-related tasks under its Preparedness Framework. OpenAI explicitly stated it had directly trained this model to identify software vulnerabilities, acknowledging its potent dual-use nature. Similarly, Anthropic’s Opus 4.6, released around the same time, also demonstrated an ability to surface previously unknown vulnerabilities in production codebases, a capability the company recognized could be used by both ethical hackers and malicious actors.
Anthropic has also encountered real-world instances of AI-driven cyberattacks. In a previously reported incident, the company disrupted what it described as the first documented large-scale AI cyberattack, involving hacking groups, among them some linked to the Chinese government. These groups reportedly attempted to exploit Claude in real-world attacks; in one case, a coordinated campaign used Claude Code to infiltrate approximately 30 organizations, including tech companies, financial institutions, and government agencies. Anthropic’s swift detection, investigation, account bans, and notification of affected organizations underscored the tangible and immediate risks posed by powerful AI in the hands of sophisticated adversaries. The development of Capybara/Mythos, with its even greater capabilities, amplifies these concerns significantly.
An Exclusive Executive Retreat Exposed
Beyond the technical revelations about its next-gen AI, the data leak also exposed details of an exclusive, invite-only CEO summit for European business leaders. This event is a crucial part of Anthropic’s strategy to expand its enterprise customer base and cement its position in the competitive AI market.
The leak stemmed from what cybersecurity professionals described as a misconfiguration of Anthropic’s content management system (CMS). Digital assets created within the CMS were set to public by default and assigned publicly accessible URLs upon upload, unless a user explicitly changed the privacy settings. This oversight led to a vast cache of images, PDF files, and audio files being erroneously published to an unsecured, publicly accessible URL. Anthropic confirmed this in a statement to Fortune, attributing the "issue with one of our external CMS tools" to "human error."
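This failure mode, in which "public" is the zero-effort default, is common across CMS and object-storage platforms, and it can be audited for directly. The sketch below, a minimal illustration in Python using only the standard library, probes whether asset URLs respond without any credentials attached; the domain and file names are hypothetical placeholders, not Anthropic’s actual infrastructure, and a real audit would enumerate asset paths from the CMS’s own records rather than a hand-written list.

```python
# Minimal sketch: flag CMS-hosted assets that are readable without
# authentication. All URLs below are illustrative placeholders.
import urllib.error
import urllib.request

# In practice these paths would come from the CMS's asset manifest or API.
ASSET_URLS = [
    "https://cdn.example-cms.com/assets/draft-blog-post.pdf",
    "https://cdn.example-cms.com/assets/internal-retreat-agenda.pdf",
]

def is_publicly_readable(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL serves content with no credentials attached."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.HTTPError, urllib.error.URLError):
        # 401/403/404 and network failures all count as "not public" here.
        return False

if __name__ == "__main__":
    for url in ASSET_URLS:
        status = "EXPOSED" if is_publicly_readable(url) else "ok"
        print(f"{status}: {url}")
```

The tooling matters less than the default it tests for: any workflow where publishing is the no-action outcome will eventually expose something meant to stay private, which is why private-by-default with explicit publication is the safer configuration.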
While many of the leaked files were discarded or unused assets for past blog posts, such as images and logos, several appeared to be private or internal documents. One asset, for example, bore a title referencing an employee’s "parental leave," hinting at the breadth of internal information exposed.
Crucially, the documents included a PDF outlining an upcoming, invite-only retreat for the CEOs of European companies, scheduled to be held in the U.K. Anthropic CEO Dario Amodei is slated to attend this two-day event, described as an "intimate gathering" for "thoughtful conversation." While the names of other attendees were not listed, they were characterized as "Europe’s most influential business leaders." The retreat is planned for an 18th-century manor-turned-hotel-and-spa in the English countryside, promising attendees insights from lawmakers and policymakers on AI adoption, alongside exclusive demonstrations of unreleased Claude capabilities. This event clearly aims to foster high-level relationships and drive enterprise adoption of Anthropic’s AI models. An Anthropic spokesperson confirmed the event, stating it "is part of an ongoing series of events we’ve hosted over the past year. We look forward to hosting European business leaders to discuss the future of AI."
Broader Implications for AI Development and Security
The Anthropic data leak serves as a stark reminder of the multifaceted challenges facing the rapidly evolving AI industry. On one hand, it highlights the relentless pace of innovation, with companies like Anthropic pushing the boundaries of what AI can do, even while acknowledging the profound risks. The "step change" represented by Claude Mythos/Capybara suggests that the industry is still in its early stages of discovery, with increasingly powerful models emerging at a dizzying rate.
On the other hand, the incident underscores the critical importance of robust internal security protocols and data governance, particularly for organizations handling highly sensitive, unreleased intellectual property and strategic corporate plans. The "human error" explanation, while common, points to a systemic vulnerability that can have significant consequences, from reputational damage to competitive intelligence leaks. In an industry where technological breakthroughs are guarded secrets and the race for market leadership is fierce, such lapses can be particularly costly.
The leak also adds another layer to the ongoing public discourse surrounding AI safety and responsible development. Anthropic, a prominent voice in the "effective altruism" movement within AI, has consistently advocated a cautious approach to developing artificial general intelligence. The exposure of its internal concerns about its own model’s cybersecurity risks, coupled with the accidental public disclosure, creates a complex narrative. It reinforces the idea that even the most safety-conscious organizations are navigating uncharted waters, grappling with the immense power they are unleashing.
As AI models become more capable, the "dual-use" dilemma, in which the same technology can serve both beneficial and malicious ends, will only intensify. Anthropic’s plan to engage cyber defenders early is one response to that challenge. However, the leak itself demonstrates the difficulty of controlling information and technology even within a company’s own operations, raising questions about the broader oversight mechanisms needed as these powerful systems become more widespread.
Ultimately, the Anthropic data leak is more than just a security incident; it’s a window into the intense pressures, strategic maneuvers, and profound ethical considerations at the forefront of AI development. It signals an era where technical prowess must be matched by an equally rigorous commitment to security, transparency, and responsible governance, not just in the AI models themselves, but in the very operations of the companies building them. The race for AI supremacy continues, but this incident serves as a potent reminder that the journey is fraught with both immense promise and significant peril.

