Have Mythos' cyber capabilities been overstated?
Rob Wiblin reacts to the claim that he was too alarmist about Anthropic's Mythos in his recent explainer.
Some have argued that the press, as well as us here at 80,000 Hours, have been confused by Anthropic and as a result have given folks an exaggerated impression of how significant the ‘Mythos moment’ is for cybersecurity, for instance in our explainer video, which we posted a few days after the model’s announcement.
Are they right? This is a quick reaction from The 80,000 Hours Podcast host Rob Wiblin, originally posted on X and lightly edited for clarity.
Have Mythos' cyber capabilities been overstated? I'm 50/50 on that.
Here's the case that they haven't been:
The less impressive benchmarks people like Matthew are referring to don’t attempt to capture what really worried people: the ability to identify serious novel zero-days in key software, something for which we don’t have good benchmarks.
These lines might indeed be straight, but we know straight lines on graphs don't preclude huge jumps in practical impact.
Even if Mythos isn't such a big jump on other models, arguably we were sleeping on AI and cyber earlier this year, and Mythos is the thing that woke us up to things that were already becoming true. What matters in determining the necessary social response is its absolute ability and the trend we're on, not its ability relative to other models.
Firefox security fixes jumped from 25 in January up to 423 in April. A wild increase. This suggests AI is having a huge impact. (For ordinary people it's irrelevant how much is uniquely down to Mythos. But eyeballing it: ~25/month baseline in 2025 → ~65/month in February & March with the harness + Opus 4.6 (and Anthropic's red team) → 271 bugs from the pipeline + Mythos in April, minus the ~37 that weren't Mythos. So both the harness and Mythos clearly look like large independent factors.)
I don't think anyone thought OpenAI would be far behind Anthropic and Mythos, so it could just be that both models have very powerful cyber capabilities and that's why they look similar on these graphs. Though weirdly Anthropic actively claims Mythos finds lots of new severe vulnerabilities all over the place, while OpenAI doesn't claim this of GPT-5.5. It could be that OpenAI simply hasn't tried much, but if I were them I would have tried for marketing reasons, so my guess is that GPT-5.5 is less impressive. But if Mythos and GPT-5.5 are equally good, if anything that reinforces our worries and calls for a stronger response, as it implies faster proliferation of these capabilities.
Per AISLE [1] [2], it seems like existing models with more compute, scaffolding, direction, and so on can also find many serious zero-days. We can't yet say whether they are the same as the undisclosed Mythos ones — the vulnerabilities that are harder to patch and so probably also tougher to find. But again that's irrelevant to the strategic picture, or if anything should make us more worried.
The absence of visible disasters so far cited by Matthew could be because guardrails are working, because most breaches are still stealthy, because of inertia among bad actors, or because Mythos is a big jump on everyone else after all. To me it's not very strong evidence that Mythos' capabilities have been exaggerated.
Netsec, government, finance, and software firms who are being given more info than we have access to seem to be speaking and acting as if Anthropic's claims are legit (that their software or networks are vulnerable). Maybe it's nondisclosure or wanting to stay on Anthropic's good side. Or maybe they see that Anthropic is correct. To be determined!
It's worth keeping in mind that the truth is going to come out either way here. Anthropic published SHA-3 hashes of unreleased vulnerabilities. Making preregistered falsifiable claims suggests confidence on their part. If they can't back them up, their credibility will take a deserved hit.
This is less about zero-day vulnerability discovery in particular, but UK AISI's analysis of Mythos was broadly consistent with Anthropic's claims, if more measured (especially regarding ability to break into hardened systems).
Matthew is probably right that broad access to Mythos along with trigger-happy guardrails would have been fine (and China or North Korea wouldn’t be able to meaningfully take advantage of it). But that may be due to effective guardrails much more than Mythos not being useful. I endorse Anthropic widening access as soon as practical.
I don't personally care how much of a jump Mythos is in particular, but for those who do, we did get a few direct comparisons: Opus 4.6's two fully working Firefox 147 exploits vs Mythos' 181. The 27-year-old OpenBSD bug in heavily reviewed code naturally made people sit up and take notice. Maybe others could have found it... but they didn't! Maybe that one will turn out to be a very unrepresentative case, maybe it'll be typical. We'll have to wait and see.
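The point above about Anthropic publishing SHA-3 hashes of unreleased vulnerabilities describes a hash commitment: publish the digest now, reveal the write-up later, and anyone can check that the two match. Here is a minimal sketch in Python. It assumes SHA3-256 and uses a made-up placeholder report, since nothing here says which SHA-3 variant or report format Anthropic actually used; the function names are illustrative only.

```python
import hashlib
import hmac


def commit(report: bytes) -> str:
    """Return the hex SHA3-256 digest that would be published ahead of disclosure."""
    return hashlib.sha3_256(report).hexdigest()


def verify(report: bytes, published_digest: str) -> bool:
    """Check that a later-revealed report matches the earlier published digest."""
    return hmac.compare_digest(commit(report), published_digest)


# Hypothetical placeholder text, not a real vulnerability report.
report = b"Placeholder write-up of an undisclosed bug"
digest = commit(report)

print(verify(report, digest))                 # True
print(verify(report + b" (edited)", digest))  # False
```

Any change to the revealed text changes the digest, which is what makes the claim falsifiable after the fact. A real scheme would usually also mix in a random nonce so that a short or guessable report can't be recovered from the digest by brute force.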
All that said, with more time to look into this I wouldn't write the line “an AI that can break into almost any computer on Earth” now, and wish I hadn't said it in April. I'm new to doing such rapid-turnaround pieces and have some lessons to learn.
The justification for the claim in my mind was:
Mythos can seemingly find major zero-days in all the key browsers and operating systems tried.
This seems similar to the capabilities of top state cyber groups. Plus Mythos could help speed up spear phishing, social engineering, searching out weaknesses in network setups, and other drudge-work.
Top state cyber groups (and probably some non-state ones) can break into any normal individual or company device or network, with effort. So probably Mythos + time/compute + some handholding from Anthropic could do the same, assuming Anthropic had exclusive access to Mythos.
But I think the claim was overstated:
By design iOS is very hardened. Pegasus and equivalents do break into iPhones remotely, but only the best state-tier actors have ever demonstrated that capability and it has gotten harder over time. “Almost any computer” might fail on that ground alone.
Anthropic hasn't given us enough info to confirm the ability to break into all of patched Windows, Mac, or Linux (including hardened servers). That ability would certainly be consistent with what they've shared, but it's not necessarily implied.
Mythos may indeed be able to find major vulnerabilities in all of these systems, but not be able to string enough together to truly “break in.” That's still to be determined.
(In my mind “almost any” was meant to exclude air-gapped networks, and the most hardened or monitored networks run by military, cyber, essential infrastructure or intelligence folks.)
Though I was mistaken to say that, I think my vibe in the piece — that AI has reached the point that it’s a huge deal for cybersecurity and demands a major response from a wide range of actors — was sound.
Overall I’m looking forward to much better computer security as a result of AI once we get past this intermediate period.
We’re trying out posting some scrappier quick takes like this in this Substack — let us know what you think in the comments!



As someone on the detection and response side of things, the biggest issue right now is the collapse of time to exploitation.
Everyone outside of security is just hearing that models like Mythos can soon exploit any system in the world. The reality is that it doesn’t even have to do that for there to be serious consequences.
Automating parts or entire sections of the kill chain allows threat actors to shrink the window for detection and response, letting them take action on objectives faster than we can react. This is especially true for teams that are overwhelmed by alert fatigue, are understaffed, and/or don’t have the resources to integrate AI systems into their detection workflows.
Not to mention the entire financial aspect of threat actors automating their own workflows.