In brief
- OpenMythos is a from-scratch reconstruction of the Claude Mythos architecture, built only from public research papers and educated guesses.
- Claude Mythos is Anthropic's most powerful model, locked away in Project Glasswing after it autonomously found 271 Firefox vulnerabilities and completed a 32-step network attack simulation.
- The repo is theoretical scaffolding—code without trained weights. It mirrors a separate effort by Vidoc Security that reproduced Mythos's vulnerability findings using off-the-shelf models.
If Anthropic won't show you what's inside its most dangerous AI, somebody on GitHub will guess.
A developer named Kye Gomez has published OpenMythos, an open-source reconstruction of what he thinks Claude Mythos looks like under the hood. The repo picked up over 10,000 GitHub stars within a few weeks of release, and ships with an exhaustive README full of equations, citations, and a polite disclaimer that it has nothing to do with Anthropic.
It's speculation. But it's structured speculation, in code.
A quick refresher on what Mythos is: it leaked into public view in late March, when Anthropic accidentally published draft materials describing it as the company's most capable model to date, a tier above Opus. The follow-up, Mythos Preview, turned out to be unreleasably good at cybersecurity.
Per Anthropic, Mythos found 271 vulnerabilities in Firefox during Mozilla testing. It became the first AI model to complete a 32-step corporate network attack simulation. Anthropic locked it inside Project Glasswing, a vetted coalition of about 40 partners, including Microsoft, Apple, Amazon, and the NSA.
The public never gets to touch it. So Gomez tried to figure out how it works.
OpenMythos's central guess is that Mythos is a Recurrent-Depth Transformer, also called a looped transformer. Standard models stack dozens or hundreds of distinct layers; looped models take a smaller stack and run it through itself many times per forward pass.
In other words, it’s the same weights going through more iterations. Deeper thinking, in continuous latent space, before any token gets emitted.
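To make the idea concrete, here's a minimal PyTorch sketch of a looped stack. Nothing below comes from the OpenMythos repo; the class name, block count, and loop count are all illustrative:

```python
import torch
import torch.nn as nn

class LoopedBlockStack(nn.Module):
    """Illustrative recurrent-depth ("looped") transformer: instead of
    stacking many distinct layers, a handful of shared blocks are
    iterated several times per forward pass."""

    def __init__(self, d_model=512, n_heads=8, n_blocks=4, n_loops=8):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.TransformerEncoderLayer(
                d_model, n_heads, dim_feedforward=4 * d_model,
                batch_first=True, norm_first=True,
            )
            for _ in range(n_blocks)
        )
        self.n_loops = n_loops  # effective depth = n_blocks * n_loops

    def forward(self, h):
        # The same weights are reused n_loops times: extra iterations
        # buy extra latent-space computation without extra parameters.
        for _ in range(self.n_loops):
            for block in self.blocks:
                h = block(h)
        return h

x = torch.randn(2, 16, 512)   # (batch, sequence, d_model)
y = LoopedBlockStack()(x)     # 4 blocks x 8 loops = effective depth 32
```

Four blocks run eight times behave like a depth-32 network at a fraction of the parameter count, which is the trade the looped-transformer hypothesis is betting on.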
The repo argues this would explain Mythos's two strangest qualities: It reasons through novel problems no other model can crack, but its raw memorization is uneven. That's the architectural fingerprint of looping—composition over storage.
OpenMythos cites Parcae, an April 2026 paper from the University of California San Diego and Together AI that solved the long-standing instability problem in looped models: a 770 million-parameter Parcae model matches a 1.3 billion-parameter fixed-depth transformer on quality, with predictable scaling laws for how many loops to run. The repo also borrows DeepSeek's Multi-Latent Attention to compress memory, and a Mixture-of-Experts setup to handle breadth across domains.
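Of those borrowed pieces, Mixture-of-Experts is the most standardized. Below is the textbook top-k routed feed-forward layer; the class name and hyperparameters are illustrative, and this is not DeepSeek's or OpenMythos's actual implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Generic top-k Mixture-of-Experts feed-forward layer (sketch).
    Each token is routed to its top-k experts, and their outputs are
    combined using the router's renormalized softmax weights."""

    def __init__(self, d_model=512, d_ff=2048, n_experts=8, k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        self.k = k

    def forward(self, h):                      # h: (tokens, d_model)
        scores = self.router(h)                # (tokens, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)
        weights = F.softmax(weights, dim=-1)   # renormalize over top-k
        out = torch.zeros_like(h)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e       # tokens picking expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(h[mask])
        return out
```

Real deployments add load-balancing losses and capacity limits on top, but the routing idea is the same: only k of the experts fire per token, so parameter count grows without a matching growth in per-token compute.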
What it does not have is weights; for now, it's a technique without an executor.
OpenMythos is theoretical. The code defines model variants from 1 billion to 1 trillion parameters, but you have to train them yourself. The README points to a training script for a 3 billion-parameter variant on FineWeb-Edu with a Chinchilla-adjusted 30 billion-token target, and the bill climbs from there: looping multiplies the effective compute per token, and the larger variants carry the kind of price tag that runs into hundreds of thousands of dollars and up on H100s. Nobody's done it yet.
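The arithmetic is easy to sketch with the standard 6ND training-FLOPs rule of thumb (roughly six floating-point operations per parameter per token). Everything in the snippet below is an assumption rather than a figure from the repo: the utilization, the H100 throughput, the rental price, the loop count, and the 70 billion-parameter comparison point:

```python
def train_cost_usd(n_params, n_tokens, n_loops=1,
                   mfu=0.4, peak_flops=9.9e14, usd_per_gpu_hour=2.5):
    """Back-of-envelope training cost from the 6*N*D FLOPs rule of
    thumb. Every constant is an assumption: ~40% hardware utilization,
    H100 BF16 peak throughput, a rough rental price. A looped model
    reuses its blocks, so effective compute scales with the loop count."""
    flops = 6 * n_params * n_tokens * n_loops
    gpu_hours = flops / (peak_flops * mfu) / 3600
    return gpu_hours * usd_per_gpu_hour

# The README's 3B-parameter, 30B-token run at a hypothetical 16 loops,
# and a hypothetical 70B variant at a Chinchilla-style 1.4T tokens:
print(f"${train_cost_usd(3e9, 30e9, n_loops=16):,.0f}")   # ~$15,000
print(f"${train_cost_usd(70e9, 1.4e12):,.0f}")            # ~$1,030,000
```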
So why does it matter?
Because it's the second time in a month somebody has chipped at the wall around Mythos. The first was a study from Vidoc Security, which reproduced several of Mythos's most alarming vulnerability findings using GPT-5.4 and Claude Opus 4.6 inside an open-source agent. No Glasswing access needed, at under $30 per scan. Different angle, same conclusion: The moat around Mythos may be thinner than the marketing suggested.
OpenMythos and the Vidoc replication are doing different jobs. Vidoc reproduced Mythos's outputs—the vulnerability discoveries themselves—using existing models. OpenMythos is trying to reproduce the architecture—the actual machine that produces those outputs. One says you don't need Mythos to find the bugs Mythos found. The other says, eventually, you might be able to build something like Mythos yourself.
Anthropic hasn't confirmed any of Gomez's architectural guesses, and several of the design choices in OpenMythos are explicit hedges. The README is careful to frame the whole thing as one plausible approach rather than a blueprint, leaning on words like "likely," "suspected," and "almost certainly." The real Mythos may not be a looped transformer at all. Or it might be one with details Gomez hasn't reverse-engineered yet.
What OpenMythos demonstrates is that the research literature already contains most of the pieces. Looped transformers, Mixture of Experts, Multi-Latent Attention, Adaptive Computation Time, Parcae's stability fix—none of it is proprietary. The repo is, more than anything, an inventory of what's publicly known about how to build a Mythos-class model.
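Even the oldest item on that list is concrete enough to sketch. Adaptive Computation Time goes back to a 2016 paper by Alex Graves: let the model decide, per input, how many iterations to spend before answering. Here's a simplified version of the halting pattern; full ACT also takes a weighted average of intermediate states, and the names and thresholds below are illustrative, not from the repo:

```python
import torch
import torch.nn as nn

class AdaptiveLoop(nn.Module):
    """ACT-style halting for a looped block: a small head predicts a
    halting probability after each iteration, and the loop stops once
    the cumulative probability clears a threshold. Simplified sketch."""

    def __init__(self, block, d_model=512, max_loops=16, threshold=0.99):
        super().__init__()
        self.block = block
        self.halt_head = nn.Sequential(nn.Linear(d_model, 1), nn.Sigmoid())
        self.max_loops = max_loops
        self.threshold = threshold

    def forward(self, h):
        halt_prob = torch.zeros(h.shape[:-1], device=h.device)
        for step in range(self.max_loops):
            h = self.block(h)
            halt_prob = halt_prob + self.halt_head(h).squeeze(-1)
            if bool((halt_prob > self.threshold).all()):
                break  # every position has spent "enough" iterations
        return h, step + 1

layer = nn.TransformerEncoderLayer(512, 8, batch_first=True)
out, loops_used = AdaptiveLoop(layer)(torch.randn(2, 16, 512))
```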
The repo is licensed MIT, and it has 2,700 forks already. The training script is sitting there, waiting for someone with a GPU cluster and a thesis to prove.

