Anthropic’s Mythos is evolving faster than expected, reports AI safety agency




ZDNET’s key takeaways

  • A newer checkpoint of Claude Mythos Preview already outperforms the version tested a month ago.
  • External researchers found that it achieved several firsts in testing. 
  • AI capabilities may be improving much faster than anticipated. 

Anthropic’s Claude Mythos, which the company maintains is too powerful to be released generally, already appears to have gained new capabilities. 

In a blog post on Wednesday, the UK AI Security Institute (AISI) reported that it had tested a newer version of Mythos, which outperformed both its earlier results and OpenAI’s GPT-5.5 — just a month after Mythos’ initial release. 


“The newer Mythos Preview checkpoint completed both our cyber ranges, solving the range ‘The Last Ones’ in 6 of 10 attempts and the previously unsolved ‘Cooling Tower’ in 3 of 10 attempts,” the blog authors wrote. “This was the first time that a model completed the second of our two cyber ranges.” 
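For a sense of how much uncertainty ten attempts leave, the sketch below computes a 95% Wilson score interval for the two reported solve rates. This is an illustration only: AISI's post, as quoted here, does not report confidence intervals, and the function name `wilson_interval` and the choice of the Wilson method are assumptions made for this example.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Return an approximate 95% Wilson score interval for a success rate.

    Hypothetical helper for illustration; not part of AISI's methodology.
    """
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return center - half_width, center + half_width


# Solve counts as quoted in the article: 6 of 10 and 3 of 10 attempts.
for name, solved in [("The Last Ones", 6), ("Cooling Tower", 3)]:
    low, high = wilson_interval(solved, 10)
    print(f"{name}: {solved}/10 solved, 95% interval roughly {low:.0%} to {high:.0%}")
```

With only ten attempts per range, both intervals are wide, so the reported rates are best read as evidence that the ranges are solvable rather than as precise success probabilities.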

When Anthropic first announced Mythos Preview and Project Glasswing last month, UK AISI evaluated the model and found that it “represents a step up over previous frontier models in a landscape where cyber performance was already rapidly improving.” Project Glasswing is the cybersecurity testing alliance Anthropic formed with rival tech companies and AI labs, which it gave limited access to Mythos.

That third-party perspective helped balance claims that the hype around Mythos was either solely marketing or, at the other end, signaled a catastrophic shift in AI capabilities. The truth about what the model can do is likely somewhere in the middle. 


AISI’s updated test also shows that capability improvements aren’t restricted to new model releases, but can happen between checkpoints of a single model.

A rapidly accelerating cyber threat 

AISI noted that AI models are rapidly advancing in their ability to handle cyber tasks, with serious implications for cybersecurity, especially given Mythos’ knack for detecting software vulnerabilities.

“In February 2026, we internally estimated that the length of cyber tasks AI models could complete had doubled every 4.7 months since late 2024 – already an acceleration from our November 2025 estimate of 8 months,” the blog authors wrote. “Since then, AISI reported on two new models, Claude Mythos Preview and [OpenAI’s] GPT-5.5, which substantially exceeded both doubling rate trends.” 
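To put the two doubling times in perspective, the short sketch below converts each into a yearly growth factor, assuming steady exponential growth. The function name `growth_factor` and the 12-month window are choices made for this illustration, not anything from AISI's post.

```python
def growth_factor(months: float, doubling_months: float) -> float:
    """Multiplicative growth over `months`, assuming a constant doubling time."""
    return 2.0 ** (months / doubling_months)


# Doubling times quoted by AISI: 8 months (November 2025 estimate)
# and 4.7 months (February 2026 estimate).
for label, doubling in [("November 2025 estimate", 8.0), ("February 2026 estimate", 4.7)]:
    factor = growth_factor(12.0, doubling)
    print(f"{label}: doubling every {doubling} months is about {factor:.1f}x per year")
```

Under that assumption, an 8-month doubling time works out to a little under a threefold increase in task length per year, while 4.7 months works out to nearly sixfold. AISI says Mythos Preview and GPT-5.5 exceeded both trends.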


The authors added that it’s unclear whether these findings signal a lasting acceleration. Mythos and GPT-5.5 could simply be outliers from the overall pattern of model evolution.

Still, AISI clarified that there are several unknowns its testing could not account for. The tests capped tasks at 2.5 million tokens, which let researchers better compare performance results over time. That inherently “understates what frontier models can do,” they wrote. 

“Mythos Preview and GPT-5.5 have large upper-bound error bars due to near-100% success rates on our narrow cyber suite’s longest tasks, even with the 2.5M token limit,” the blog continued. “Our tasks are also not long enough to determine how sharply the models’ reliability would deteriorate at higher task lengths. This places some of the latest models at the limit of what our narrow test suite can measure.”


While this makes the point of model failure hard to measure, it also means model success rates on these tasks would be much higher without the token cap — so high, in fact, that “time horizons become impossible to calculate.” Models with more token access and complex agent infrastructure would be much more capable. 

“A 2.5M token limit is relatively low — in our cyber range experiment we use up to 100M tokens and find performance would likely still improve beyond that budget, especially for recent models, which disproportionately benefit from higher token limits,” the blog added. 






Recent Reviews







We may receive a commission on purchases made from links.

A toolkit can go a long way toward helping you stock up on essentials. All of the major tool brands offer different kinds, including the longstanding power and hand tool favorite, Craftsman. Its products can be found in many online stores, and Amazon is currently holding a major sale. At the time of publication, a 262-piece Craftsman hand tool set is on a massive markdown of 40% off, saving you $100 at checkout.

The collection currently costs $149, which is still a lot of money, but is a big budgetary improvement over the $249 regular price. With the discount, you’re getting more for your dollar, and this kit includes 118 sockets, three ratchets to use them on, 24 wrenches, 44 hex keys, 66 specialty bits, and seven extra accessories. The set comes in a three-drawer, handled toolbox that’s part of the Craftsman VersaStack modular storage system. You’re also getting a full lifetime warranty.

While the price and quantity of tools may seem right, what is there to say about the quality of this Craftsman kit? According to most customers, it’s a worthwhile buy for any DIYer, even without a huge sale to sweeten the deal.

How online buyers feel about this Craftsman tool kit

On Amazon, the response to this Craftsman tool kit has been resoundingly positive. There are currently close to 300 reviews for this specific variant and over 10,000 for the overall product, and the vast majority gave it five stars. Most found the quality of the tools and sockets more than up to par, the VersaStack toolbox sturdy and great for organization and protection, and the price fair. Still, reviewers often describe it as more of a beginner or around-the-house kit, so if you want it for professional use, it’s not considered the strongest option on the market.

Elsewhere online, this kit has continued to garner largely positive press. On the Craftsman website itself, almost all of the 18 reviews gave it five stars. Some applauded the functionality of the VersaStack case and its ability to connect to other boxes in the line, while others praised the versatility of the tools within. Meanwhile, most of the 289 reviews on Lowe’s website are five-star and approve of the kit’s piece selection and durability.

There are a lot of great mechanic tool sets for anyone’s budget, and there could be a case to make that this Craftsman 262-piece set belongs alongside them. At its sale price or otherwise, it’s a hit across several retailers. Whether it’s the right set for you and your hand tool-related needs, though, is a question only you can answer. 




