Anthropic’s Mythos is evolving faster than expected, reports AI safety agency




ZDNET’s key takeaways

  • A newer checkpoint of Claude Mythos has already outperformed the original.
  • External researchers found that it achieved several firsts in testing. 
  • AI capabilities may be improving much faster than anticipated. 

Anthropic’s Claude Mythos, which the company maintains is too powerful to be released generally, already appears to have gained new capabilities. 

In a blog post on Wednesday, the UK AI Security Institute (AISI) reported that it had tested a newer version of Mythos, which outperformed both its earlier results and OpenAI’s GPT-5.5 — just a month after Mythos’ initial release. 


“The newer Mythos Preview checkpoint completed both our cyber ranges, solving the range ‘The Last Ones’ in 6 of 10 attempts and the previously unsolved ‘Cooling Tower’ in 3 of 10 attempts,” the blog authors wrote. “This was the first time that a model completed the second of our two cyber ranges.” 

When Anthropic first announced Mythos Preview and Project Glasswing last month, UK AISI evaluated the model, finding that it “represents a step up over previous frontier models in a landscape where cyber performance was already rapidly improving.” Project Glasswing is the cybersecurity testing alliance Anthropic formed with rival tech companies and AI labs, which it gave limited access to Mythos.

That third-party perspective helped balance claims that the hype around Mythos was either solely marketing or, at the other end, signaled a catastrophic shift in AI capabilities. The truth about what the model can do is likely somewhere in the middle. 


AISI’s updated test also shows that capability improvements aren’t restricted to new model releases, but can happen between checkpoints of a single model.

A rapidly accelerating cyber threat 

AISI noted that AI models are rapidly advancing in their ability to handle cyber tasks, with serious implications for cybersecurity, especially given Mythos’ knack for detecting software vulnerabilities.

“In February 2026, we internally estimated that the length of cyber tasks AI models could complete had doubled every 4.7 months since late 2024 – already an acceleration from our November 2025 estimate of 8 months,” the blog authors wrote. “Since then, AISI reported on two new models, Claude Mythos Preview and [OpenAI’s] GPT-5.5, which substantially exceeded both doubling rate trends.” 
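To put those doubling times in perspective, here is a rough sketch of the arithmetic. It is an illustration, not AISI's methodology: it assumes steady exponential growth, and the function name `growth_factor` is made up for this example. Under that assumption, an 8-month doubling time works out to roughly a 2.8x increase in task length per year, while a 4.7-month doubling time works out to roughly 5.9x.

```python
def growth_factor(doubling_months: float, elapsed_months: float) -> float:
    """Multiplier on task length after elapsed_months.

    Assumes steady exponential growth with a fixed doubling time,
    which is a simplification chosen for this illustration.
    """
    return 2 ** (elapsed_months / doubling_months)


# Compare the two doubling-time estimates quoted by AISI over one year.
for doubling in (8.0, 4.7):
    factor = growth_factor(doubling, 12)
    print(f"doubling every {doubling} months -> about {factor:.1f}x per year")
```

The gap compounds quickly, which is why a shift from 8 months to 4.7 months matters more than the raw numbers suggest.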


The authors added that it’s unclear whether that trend will hold or whether these findings indicate a lasting increase. Mythos and GPT-5.5 could simply be notable breaks from the overall pattern of model evolution. 

Still, AISI clarified that there are several unknowns its testing could not account for. The tests capped tasks at 2.5 million tokens, which let researchers better compare performance results over time. That inherently “understates what frontier models can do,” they wrote. 

“Mythos Preview and GPT-5.5 have large upper-bound error bars due to near-100% success rates on our narrow cyber suite’s longest tasks, even with the 2.5M token limit,” the blog continued. “Our tasks are also not long enough to determine how sharply the models’ reliability would deteriorate at higher task lengths. This places some of the latest models at the limit of what our narrow test suite can measure.”


While this makes the point of model failure hard to measure, it also means model success rates on these tasks would be much higher without the token cap — so high, in fact, that “time horizons become impossible to calculate.” Models with more token access and complex agent infrastructure would be much more capable. 

“A 2.5M token limit is relatively low — in our cyber range experiment we use up to 100M tokens and find performance would likely still improve beyond that budget, especially for recent models, which disproportionately benefit from higher token limits,” the blog added. 







If Game Two of their first-round playoff series with the Denver Nuggets saved the 2025-26 season for the Minnesota Timberwolves, Game Three showed why it should be saved. 

The Timberwolves were a different beast while decisively thumping the Nuggets, 113-96 Thursday night at Target Center, in a game that wasn’t nearly that close. These Wolves were the mythical creature we’d heard about in preseason lore, purposefully locked and loaded to be both marauding and staunch. They owned both ends of the court, gleefully transferring back and forth from irresistible force to immovable object. 

A quartet of Timberwolves deserve special mention, but it begins with Jaden McDaniels. After his team had toppled Denver to even the series at a game apiece Monday night, McDaniels used the sizable chip on his shoulder to etch some graffiti into the public discourse, casually castigating the most prominent Nuggets players by name as “bad defenders” in a matter-of-fact manner that had the media compelling him to confirm what he had just said. 

Trash talk is fleetingly fungible in the jaundiced social environment of 2026, functioning more like coupons than currency in that it needs to be rapidly leveraged before its expiration date. The common perception naturally was that McDaniels was calling out the Nuggets. But in a more subtle, profound way, he was also putting his teammates on notice. 

All season long the Timberwolves have procrastinated on their full potential, frequently demonstrating that their preseason talk about maturity and commitment was cheap. By contrast, those words uttered by McDaniels were expensive. He had just picked a fight with the opponent, leaving open the question of how many of his teammates would join him in the fray. 

That he would lead the charge was established early, after the Timberwolves’ top two scorers, Anthony Edwards and Julius Randle, had each missed a pair of open looks against Denver’s bad defenders in the game’s first 90 seconds.  

With the game still scoreless, the NBA’s best pick-and-roll combo, Nikola Jokic and Jamal Murray, were clustered around the foul line with Minnesota’s best defenders, McDaniels and Rudy Gobert. As they jammed up Jokic, McDaniels picked the ball loose and started sprint-dribbling the other way. To no one’s surprise, Donte “Ragu” DiVincenzo was also on his horse in transition, receiving a pass from McDaniels and then lobbing it back for a Jaden slam against a hapless Murray and Murray’s late-arriving teammate, Cam Johnson, who committed the foul that allowed McDaniels to finish with the “and-1” free throw. 

On the Timberwolves’ next offensive possession, McDaniels muscled his way to two offensive rebounds, feeding Ragu off the first for a missed three-pointer, then corralling that miss and executing the putback in traffic. It was McDaniels 5, Nuggets 0, setting the tone for a game in which the Wolves not only never trailed, but never let the lead dip below double digits after McDaniels made a pair of consecutive driving layups eight minutes into the game. 

“Spectacular. I thought his activity offensively in the first quarter was outstanding,” said Wolves coach Chris Finch after the game. “He was inspirational.” 

Among the most inspired were McDaniels’ fellow wing players, Ragu and Ayo Dosunmu. Ragu is exactly the kind of player who will have your back in a squabble, and his galvanized performance seemed born of satisfaction that someone else had clarified the mission. As usual, the Timberwolves were at their best with him on the court: +20 in the 32:54 he played, -3 in the 15:06 he sat. 

“He makes so many hustle plays, momentum plays, different styles of plays,” Finch raved. “He’ll make a shot, get a transition bucket, he’ll rebound, get a steal, blow something up. So many different plays. He’s just a basketball player.”


Then there was Ayo, whose fearless, blazing beelines for the bucket were quicksilver kryptonite for a Nuggets defense that is neither swift nor rugged. “I’ve been waiting for him to wake up a little bit in this series,” Finch accurately observed. “The downhill mindset that he played with all season for us was back.”

Back with the sort of multipurpose propulsion that leaves witnesses with giddy whiplash. Ayo led the team with 25 points and 9 assists in 32 minutes of time-lapse hoops, the lone blemish being three clanks from long range. Why chuck treys when you can so easily undress players in the paint? Ayo was 10-for-12 on two-pointers and none of those dozen shots came from anywhere but beneath the rim. Five of his nine dimes likewise yielded layups or dunks, which means he personally accounted for 30 of the 68 points in the paint by the Timberwolves on Thursday, doubling up the Nuggets’ 34.

Which brings us to the non-wing in Game 3’s ring of honor, Rudy Gobert. For the third straight game, Gobert blunted the supposed advantage Denver had with the magical playmaker Nikola Jokic at the controls. Suffice it to say that in the last five quarters, Jokic has shot 8-for-33 from the floor. If that continues, the Nuggets are toast in this series. 

When I asked Finch after the game if the herculean job Gobert was doing on Jokic made planning his defense simpler and better thus far, he replied, “Rudy is making all of us look good right now with his defense.” 

Amen.

If there is an asterisk on this game, it would be the absence of Denver’s brutishly versatile power forward Aaron Gordon. Nuggets coach David Adelman should be given a lot of credit for his honesty and transparency in dealing with the media during his first full season at the helm, but it came back to bite him and his team during the pregame presser, when he was clearly rattled and dejected by the sudden unavailability of Gordon, whose playing status went from “probable” to “out” in a period of a few hours due to a chronic calf strain. 

Gordon is far and away his team’s best defender, making the timing of his injury especially troublesome in the wake of McDaniels laying down his marker. Rattled is a good way to describe the entire team’s performance in the first quarter, an emotional wounding that needs to heal as fast as Gordon’s body if the Nuggets are going to be competitive in a series that has been dramatically flipped on its head over the past three days. 

That the Timberwolves played with such dominance despite mediocre outings from Ant and Randle would be a good thing for both of those current cornerstones to keep in mind. Ant was beset by foul trouble and Randle had a solid second quarter, but it stood out that neither player fully embraced what so often works on offense when the Wolves are at their best: Push the pace, move the ball, move without the ball, and make quick decisions. Ant and Randle can still be first among equals and blend into that catechism if they stay attuned to the possibilities of a greater good, one that all of a sudden doesn’t have to end with them being postseason fodder for the Spurs or the Thunder. 

Not when you’ve got three wings at a collective peak, with a chaser of Rudy semi-clowning the Joker. 


