Claude Fable 5 secretly throttled AI researchers, and the internet went wild





ZDNET’s key takeaways

  • Fable 5’s backlash is about transparency, not raw AI power.
  • Hidden safeguards made researchers question what they were testing.
  • Cybersecurity experts warn guardrails can also block defenders.

Mythos was introduced in April as part of Project Glasswing, a partnership among top-tier tech organizations and Anthropic formed to find and fix vulnerabilities in internet infrastructure. Access was restricted to certain organizations because a tool that can find previously unknown vulnerabilities so they can be fixed can just as easily find them so they can be exploited.


Mythos and Glasswing are far more powerful than Anthropic’s Claude Security tool, which is designed to run on Opus. Still, Claude Security can scan a codebase and help find some issues. Then, earlier this week, Anthropic announced and released Fable, technically “Fable 5,” which is effectively a muzzled version of Mythos.

Anthropic was clear that Fable would not support certain risky avenues of research in cybersecurity, biology, and chemistry.


However, some caution against trusting the safety claims too readily.

“Jailbreak-resistance claims should be viewed with appropriate caution,” Sally Vincent, a senior threat research engineer at Exabeam (a security analytics firm), said via email. The results “represent a point-in-time assessment. Attackers continuously adapt,” she added.

Still, Anthropic doesn’t want people making bioweapons in their backyards. This restriction is clear. When such requests are made, Claude downgrades from Fable to Opus-level intelligence and, crucially, tells users the downgrade is happening.

So far, so good.

But then it all went to heck

For researchers working on certain kinds of things, like super-powerful chip designs or frontier-level AI large language models, Fable was silent. As with other flagged endeavors, it downgraded models from Fable to Opus. But this time, users were not told about the downgrade. Actually, that’s an oversimplification.

Buried in the 319-page Fable and Mythos System Card, there was mention of the downgrade that would happen when working on these types of projects, stating that the behavior would not be visible to users. The user experience itself didn’t show anything. So, for users not in the habit of reading and internalizing all 319 pages, the downgrade was not displayed in any way when it happened.

Users assumed they were testing and getting results from Fable when, in fact, they were getting Opus-level results instead.

This caused a backlash. Fortune described this behavior as “secret sabotage.” Wired reported on this silent downgrade practice, also saying it could sabotage AI researchers.


Rob T. Lee is the chief AI officer and chief of research at SANS Institute (a cybersecurity training outfit). He also serves as a technical adviser to the Foreign Intelligence Surveillance Court and as a commissioner on the CSIS Commission on US Cyber Force Generation. In an email to ZDNET, he said Anthropic’s Fable 5 is “a novel solution, and a smart one, but Fable 5 will be attacked. The same layer that stops malicious use also blocks legitimate defensive research.”

His take is that the Fable restrictions block defenders from creating defenses. Lee, who formed his view after using the platform, tried to use it to build a digital forensics skill and was dropped down to Opus 4.8. “Clever way to stop malicious actors or not, it keeps new defensive capability away from the people who will build the next generation of tooling,” he said.

Lee assumes the new model has already gotten into the wrong hands because it’s happened in the past.

What I find most interesting is his perspective on the restriction of the Mythos model. The weak point, in his view, is not the inherent capabilities of the AI, but rather the human factor.

“Even under Glasswing, access was restricted and monitored. But those organizations have thousands of employees. Any one of them could be incentivized to hand access to a criminal group, or could already be a DPRK [Democratic People’s Republic of Korea] actor sitting inside the org,” he said.

Anthropic’s response

The internet has spoken, and it got a surgical response from Anthropic.

ZDNET reached out to the company, which gave us its official response:

“We’re changing Fable 5’s safeguards for frontier LLM development to make them visible.

“Starting this week, flagged requests will visibly fall back to Opus 4.8. On the API, any flagged requests will return a reason for their refusal. You will see this every time it happens.”

Anthropic said its current set of safeguards “covers a handful of narrow tasks like frontier-scale LLM data pipelines and kernel development for certain non-standard chips.” The company takes a pretty sharp, almost jingoistic tone I can’t really argue against. “These safeguards prevent foreign adversaries from using our most capable models in ways that pose severe safety risks,” it said.

On the other hand, while the US is leading the pack, it’s only by a nose.

I’ve been testing some of the foundation models coming out of China. For example, my OpenClaw server is running GLM-5.1, which is made by Z.ai (formerly Zhipu AI), a Tsinghua University spinoff and the first publicly traded foundation model company in China. It’s not exactly Fable 5 (or even Opus), but it’s free, and it works.


Regarding Fable 5’s restrictions, Anthropic said, “The US and its allies hold an edge in frontier chips and the highly optimized software that runs them at full potential. These safeguards ensure Claude isn’t used to erode that advantage — by optimizing chips developed by those adversaries, for example.”

Ashley Casovan, managing director of IAPP’s AI Governance Center (a privacy professionals association), said via email that she credits Anthropic for holding Mythos back long enough to “put necessary guardrails into their software,” while noting that “we have not yet seen the impact that these models can have when released at this scale.”

Meanwhile, Chris Boehm, field CTO at Zero Networks (a network segmentation vendor), frames the accomplishment as restraint rather than raw power: Anthropic “wrestled it into something safe enough to release widely.” The payoff, he said via email, is scale: ordinary defenders finally operating at attacker speed, “assuming the safeguards hold up, which is the thing I’ll be watching in the model card.”


In the for-what-it’s-worth category, Anthropic says the restrictions “also help uphold our terms of service, which prohibit using our models to develop competing AI systems — a standard restriction across major AI providers.”

But the interesting part of the news is that Anthropic isn’t just holding the line and telling everyone to stop bothering it. It listened and apologized.

“We made the wrong tradeoff and we apologize for not getting the balance right. Building these safeguards is a complex technical challenge: users may experience more false positives as we refine these classifiers to respond to new threats. We are working to reduce these as fast as possible.”

I also appreciate that Anthropic shared its reasoning for its initial approach. In deciding whether to make downgrades visible or invisible, the company faced a choice. “A hidden safeguard is harder to probe and work around. This means the safeguards can be targeted much more narrowly,” a spokesperson said.

But, obviously, as we’ve seen, those hidden safeguards were found in a matter of hours.

There is some concern about false positives, which Anthropic acknowledges.

“Current usage shows that the classifier triggers on about 0.05% of tasks, affecting less than 0.05% of organizations. A visible safeguard needs to cast a wider net to be more robust, resulting in more requests being incorrectly flagged. They do not affect the vast majority of coding and ML work,” the company said.

Some, like Etay Maor, vice president of threat intelligence at Cato Networks (a security vendor), believe that the Fable 5 protections are strong enough to defend against opportunistic hackers.


But “well-funded and motivated attackers,” Maor said, won’t give up just because the challenge is hard.

“Sophisticated threat actors are not going to stop because one technique is blocked. If direct exploitation becomes harder, they’ll move to other approaches such as context manipulation, decomposition, abstraction techniques, or capability distillation,” he said in an email.

False positives, as Anthropic mentioned, are also a concern.

“When the classifier becomes too restrictive, you start running into false positives. The same controls that are designed to stop malicious activity can also prevent legitimate users from using the model for good causes,” Maor said.

The data retention issue

Another issue at play is Anthropic’s data retention policy for Mythos-class models.

According to Reuters, Anthropic’s policy of retaining prompts and responses for 30 days, more for policy-violating prompts, was enough for Microsoft to limit employee use and spin up a legal team to evaluate the policy.

But this isn’t only a Mythos- or Fable-related issue. It’s just showing up in the news at the same time as the Fable downgrade pushback. Anthropic retains data across many of its products. Most of them can be run under a zero-data-retention agreement.


The wrinkle is that Fable and Mythos are the exceptions. Anthropic’s Covered Models under a Business Associate Agreement (BAA) page lays it out. Those two models require 30-day retention. They can’t be run with zero data retention because the safety classifiers need the data to work.

That missing off-switch, not the 30 days itself, is what reportedly triggered Microsoft’s legal team. I won’t pretend to try to parse all the options. But if you’ve got a team of lawyers and regulatory responsibility, the page listed in the previous paragraph is the one to read. In any case, the fuss this week about 30-day data retention is not a Fable-only issue, and it’s not new.

“From an enterprise perspective, the 30-day retention requirement deserves attention. Organizations in regulated industries need to understand exactly what data is being retained and whether that aligns with their compliance and legal requirements before they start using these models in sensitive environments,” Cato’s Maor said.

With that, let’s get back to the hidden downgrade kerfuffle that’s at the core of this article.

The moral of the story

What strikes me, reading back through it all, is that almost nobody is arguing about Fable’s raw power.

The fight is entirely about the muzzle. One camp says it’s too tight. The same layer that stops attackers also trips up the defenders and researchers who’d build the next generation of tooling, false positives and all.

Another says it barely matters. Motivated adversaries will route around it, the capability is already loose in other labs, and as Lee points out, no restriction survives contact with thousands of employees and a determined insider.


Then, a few experts give Anthropic genuine credit for shipping something this capable without it being reckless, provided the safeguards actually hold. In my opinion, it is credit the company genuinely deserves.

Here’s the main theme. These experts don’t agree on whether Fable is too restricted, not restricted enough, or about right, but they all agree the restrictions, not the intelligence, are the story. For a model named after a moral lesson, that’s fitting.








