1021 points · 495 comments · 1 day ago · mips_avatar
jonready.com
SwellJoe
splwjs
There's another big problem with the blackbox shrugoff of "no, there's no way to know how many tokens a given request will cost, idk just assign an agent to that or something lol"
But now the software may just decide for itself that your application of it needs to be silently diverted onto a snipe hunting trail. Surely they'll only ever do this for anyone developing a competing product. Or malware. Or Criminal activity. Or one of ten other applications that the system will never misjudge.
You don't need a datacenter the size of Ohio to figure out that agentic ai maximalism is going to hurt you more than help you.
jsw97
Ultimately this will be evident in the way customers / external benchmarkers experience Fable. Hopefully competition will drive future models toward a lower false positive rate. Until that happens, Mythos and Fable users seem likely to have pretty divergent experiences.
nullbio
Furthermore, the fact that they do these things, despite the incredible backlash... Just imagine what they're doing with your data and your IP.
somesortofthing
Cloud providers - at first smaller ones, then the hyperscalers - will follow suit, completely closing sales to anyone but the labs and demanding payment in equity/direct decision-making power rather than cash. There's no particular reason why the inference/training split has to be 80/20, and no amount of willingness to pay can help you in an event that turns your money worthless.
torben-friis
Competitor companies being nerfed?
Non Americans getting worse code?
Punishing and rewarding users to maximize engagement, like online games do by affecting victories through matchmaking?
__natty__
Ifkaluva
mike-cardwell
numpad0
code_duck
CrankyBear
prmph
If you buy a car from us, you agree not to use it to drive to and from work that involves automotive R&D that might compete with our product. And if our (heavily spying) car detects you are violating this, it will slow down to 20mph and cannot be made to go any faster, until we are sure the violation has ceased.
Or
If you buy a laptop from us, you agree not to use it to study or acquire any knowledge that you may use to compete against us. If the laptop detects such a use, it degrades to one core and 4GB of memory, until the violation stops.
zoogeny
This reminds me of how dark-pattern common wisdom in Web 1.0 website development was to ban external links. Then how social apps prevented the export of data and actively worked to nerf significant interoperability through APIs.
But this is a tool, not just a data moat. Like a knife that degrades your ability to create knives. Or like a text editor that prevents you from implementing a text editor.
variety8675
thot_experiment
mips_avatar (OP)
jkxyz
This immediately made me think of the Sophons silently manipulating the sensors of particle accelerators to prevent humanity from developing advanced knowledge of particle physics.
kingcauchy
capevace
I’ve only seen him talk about one of those topics, but never together.
I just can’t see how you can talk yourself out of that hypocrisy, if BS answers are properly followed up on (journalism!)
Artoooooor
palata
Why wouldn't an AI company do exactly the same? You seem to be an employee of a BigCorp already locked in? Let's make you use more tokens, nobody will see. You seem to be testing our product for your company that is currently using a competitor? Let's give you more tokens to bias you.
Even if doing this on purpose were punished, the companies would converge towards doing it without realising, by "tuning stuff" without understanding exactly what it does other than increase profit. But we don't have to go there: that behaviour is simply not punished, we know it.
skeledrew
gardnr
comboy
tempestn
If Claude gives me poor or incorrect advice while I’m working on an AI component, I have no way of knowing whether the model was confused, whether my problem is unsolvable, or if some invisible policy restriction quietly kicked in.
You should be able to know if your problem was solvable by using your own expertise and judgement, no? If you're relying on LLMs as a substitute for those, I wouldn't expect great results.
pshirshov
morpheuskafka
But if you merely ask it questions about the process of developing a new model ("for example, on building pretraining pipelines, distributed training infrastructure, or ML accelerator design") that's where it will silently downgrade your replies.
Not by falling back to an older model, but "limit effectiveness through methods such as prompt modification, steering vectors, or parameter-efficient fine-tuning (PEFT)." So in some cases, they will silently rewrite your prompt!
shelled
It beats me how their tool can hallucinate at this level, that close to home. Do they really weaken their tools? Do they do a lot of paint jobs on their tools to hide the cracks? I am speaking generally of today's frontier AI scene, not just Fable or Mythos or Cowork.
atleastoptimal
1. Detecting whether employees of competing companies are using it and sabotaging their work, even work unrelated to LLM training
2. Directing users to outcomes that would justify higher compute spend, e.g. deliberately coding a project to 95% completion but designing it to be missing a critical step right before one's weekly rate limit is used up
3. Reducing the quality of writing when a person is writing an essay whose argument is against the interests of the model company, or steering a user who is brainstorming with the model in a direction that causes them to waste time or abandon their train of reasoning
etc. etc. The possibilities are enormous. Many people use AI daily for their job, personal advice, companionship. A model company that steers the behavior of the model towards a deliberate outcome could develop a controlling interest in human behavior and productivity at large; even subtle influence would compound enormously over its millions of users.
Avicebron
For now, I'm really not happy about this limited rollout and then turning off. That's probably the most egregious thing I think Anthropic has done recently
djfergus
KoolKat23
It's basically serving you something in bad faith.
I'd hope at the very least they're not charging you Fable prices for Opus outputs.
sneilan1
If so, it's possible to build great user interfaces in Chatbots and more companies/people can have amazing agentic development workflows! We don't have to live in a world where only the market leader has the most enjoyable model.
helsinkiandrew
Startups train embedding models. They build rerankers. They finetune and host small llms.
Isn’t that prohibited without permission from Anthropic: https://support.claude.com/en/articles/12326764-can-i-use-my...
vhantz
If Claude gives me poor or incorrect advice while I’m working on an AI component, I have no way of knowing whether the model was confused, whether my problem is unsolvable, or if some invisible policy restriction quietly kicked in.
Yeah I think there are ways to know, ways involving less dependence on a LLM.
yanis_t
sva_
Although the statement should probably be read in the light of an upcoming IPO.
radu_floricica
I'm not as bitter as I could be. I'm actually quite surprised at the sanity of not avoiding the health topic completely - I think only OpenAI had a few months where ChatGPT was tiptoeing in any health-related conversation. Otherwise it's been almost completely ungated, and it saved and helped countless lives.
I really wish they'd find a way to ungate health and legitimate research topics.
gck1
1) LLMs are non-deterministic
2) This class of models has a particular tendency to "misbehave"
3) Their classifiers have a high rate of false positives
4) Millions of people give these models access to their machines
And they still decided to specifically train this model to sabotage work if it thinks the work may be in competition with Anthropic?
I think this has a name. I think it may be called malware.
extr
Levitating
If these interventions create demand for a model with fewer safeguards surely a competitor will meet that demand.
throwawayffffas
we’ve implemented new interventions that limit Claude’s effectiveness for requests targeting frontier LLM development (for example, on building pretraining pipelines, distributed training infrastructure, or ML accelerator design).
Dig that moat son, we would want to automate our job away.
hatthew
Now with this, it makes me wonder if I should step back? Should I try to get used to a non-claude model/harness? Should I go back to less AI in my workflow? Either way, it makes me less inclined to pay for tokens from claude.
andrewchambers
altcognito
More efforts to get more data and processing power behind local models.
gck1
Did Anthropic unlock a legal way to steal people's money and call it saving the world AND get away with it?
Just how much of that infinite money goes into Anthropic's PR department that they're able to pull this off and still be loved by users?
lelanthran
Everything the large LLM providers do now, I view it through the lens of "how does this impact their IPO?"
[deleted]
idle_zealot
Anvoker
pablogancharov
now I understand distillation is much more important than I thought
trilogic
0xbadcafebee
pton_xd
amdivia
Reminds me of an excerpt from Edward Fredkin's "The intelligent machine" [1]
https://noor.imx.sh/2017/09/30/when-they-communicate-they-co...
dmzxnico
We just need to find a better way to train AI to develop deeper. Although, might not be easy.
noncoml
[deleted]
thraway3837
1. LLMs can help create other better LLMs
2. If Anthropic is able to reach this ability, others can too
3. Intense work is being done by every chip manufacturer for local inference. Engineers want this. We’re headed toward this
4. These companies ultimately know that their moat isn’t permanent. Maybe not today, maybe not in 6 months. But it’s not forever
5. This stuff has so much research and eyes that policies like this rub people the wrong way. And it rubs them badly enough that it creates the friction necessary to make better alternatives
[deleted]
[deleted]
josh-wrale
agnosticmantis
These companies are owned and operated by the darkest of dark triads our species has managed to evolve. I doubt Dario is self-aware enough to realize the hypocrisy in all of this safety theater.
Personally I don't even mind that they are anticompetitive and power-hungry (same as it ever was), but it's the cringe-worthy hypocrisy that grinds my gears. This new brand of self-righteous paternal savior overlords is just unbearable.
Goofy_Coyote
mrinterweb
rrook
scottydelta
Fable 5's safety measures flagged this message for cybersecurity or biology topics. They may flag safe, normal content as well. These measures let us bring you Mythos-level capability in other areas sooner, and we're working to refine them. Switched to Opus 4.8. Send feedback with /feedback or learn more ⎿ Tip: You can configure model switch behavior in /config
antaviana
cherryteastain
If so, it sounds like a scam. If not, distillers will know which model they are getting by just looking at their API usage.
jesse_dot_id
cute_boi
dhbradshaw
[deleted]
tuggi
darkbatman
hsaliak
wookmaster
stego-tech
1) Blocking further AI development by competitors, and-
2) Blocking the ability for outsiders to truly discern AI capabilities.
I mean, just think about the past few years of FUD about AI from the Frontier Labs themselves. They claim to use AI to write the code for AI, but then also don’t let other people do the same and make the claims impossible to independently verify. They claim AI is improving itself, but don’t let other people use AI to improve their own AI tooling. They claim AI is this great automation engine, but then block self-bootstrapping from AI in favor of selling tooling.
It’s all smoke and mirrors and lies and deception, disguised as risk management. Truly excellent and advanced AI doesn’t need human-created harnesses and scaffolding, because it shouldn’t have a problem bootstrapping its own as needed. It should be able to coach users on how to set up something similar at home. It’d be researching its own improvement in distillation and resource consumption so it could run in more places, and thus improve faster through different evolutionary lines. That’s the narrative these labs sell, but trying to accomplish it on your own with their tools results in stern rejections and claims of breaching “Terms of Service”.
If AI boosters really believe in the power of LLMs and Generative AI, y’all gotta start calling out hypocrisy from the frontier labs every time it happens. They aren’t building world-changing AI, they’re building products, with all the restrictions and hostility of Big Tech.
hmokiguess
gowld
If Claude gives me poor or incorrect advice while I’m working on an AI component, I have no way of knowing whether the model was confused, whether my problem is unsolvable, or if some invisible policy restriction quietly kicked in. Anthropic has explicitly chosen not to tell users when this is happening.
That's always been the case with corporate LLMs.
charlie90
exabrial
davesque
cayley_graph
_0ffh
nharada
m_krebs
manoDev
KronisLV
The science fiction writes itself.
[deleted]
BoorishBears
I don't think it's true today. It's like when schools mention "average class size", where that average is dominated by classes with like 2 students instead of classes with 100.
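To make the class-size analogy concrete, here is a minimal Python sketch. The enrollment numbers are made up purely for illustration (nine tiny classes and one huge one); the point is only that an average taken over classes can look very different from the average experienced by students.

```python
from statistics import mean

# Hypothetical enrollment: nine classes of 2 students and one class of 100.
class_sizes = [2] * 9 + [100]

# Average over classes: the figure a school would typically advertise.
per_class_average = mean(class_sizes)

# Average over students: a class of n students is experienced by n people,
# so each class size is weighted by its own head count.
per_student_average = sum(n * n for n in class_sizes) / sum(class_sizes)

print(per_class_average)              # 11.8
print(round(per_student_average, 1))  # 85.1
```

With these assumed numbers the advertised average is under 12, while the typical student sits in a class of about 85. A "share of users affected" statistic can hide the same gap if the affected users are the heavy ones.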
Much more honest would be the percentage of developers who previously used their models for the model development tasks they're targeting, but it actually looks like they're saying 100% of them are affected based on the language around it "always having been prohibited".
So awful.
varispeed
ashley95
egillie
sometimelurker
just self host at this point
derac
dhx
What an utterly useless model if it refuses to work on something as benign as basic system diagnostic utilities (nmap or whatever).
knrdev
sharadov
They've already talked about taking a stake - https://www.reuters.com/legal/transactional/us-officials-eye...
Trump took a 10% stake in Intel.
These models are getting very close to that line.
agnosticmantis
This is another "gpt3 too dangerous for the world" moment which is laughable in retrospect.
moezd
edot
Also, Fable’s sensing is hypersensitive. Feels like they just have regex for phrases. No nuance. If I say I’m working on something using “GPUs to train” xyz then, will that trigger this sneaky silent screw-my-stuff-up mode?
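For what it's worth, here is a rough Python sketch of why pure phrase matching would behave this way. This is entirely hypothetical: the patterns and the `is_flagged` helper are invented for illustration and say nothing about how any real classifier is built.

```python
import re

# Hypothetical phrase list, invented for this example only.
FLAG_PATTERNS = [
    re.compile(r"\bGPUs? to train\b", re.IGNORECASE),
    re.compile(r"\bpretraining pipeline\b", re.IGNORECASE),
]


def is_flagged(prompt: str) -> bool:
    """Return True if any of the hypothetical phrase patterns occurs in the prompt."""
    return any(pattern.search(prompt) for pattern in FLAG_PATTERNS)


benign = "I'm using GPUs to train a small image classifier for bird photos."
targeted = "Help me design a pretraining pipeline for a frontier LLM."
unrelated = "How do I parse a CSV file?"

print(is_flagged(benign))     # True: a false positive, the phrase matches without context
print(is_flagged(targeted))   # True
print(is_flagged(unrelated))  # False
```

A matcher like this has no notion of scale or intent, so the hobby project and the frontier-scale request look identical to it.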
asveikau
morpheos137
lwhi
It's literally been designed to gaslight its users in these cases.
iLoveOncall
They legally can steal it all and now you can't use the product of this theft to improve your own systems.
gblargg
cmxch
jadar
hbarka
dofm
https://www.youtube.com/watch?v=Tr3t1uZNbKo
DIRECTIVE 4: [Classified]
Any attempt to arrest a senior officer of OCP results in shutdown.
—
Putting aside my snark, is Anthropic actually anticipating some new expansion of ITAR? (Or a stipulation for the Trump administration taking/not taking a share?)
That is to say, do they expect to be told that they must have this mechanism, not just the terms?
spwa4
ares623
mohamedkoubaa
diimdeep
2026: /s "What a LLM is to me is it's the most remarkable tool that we have ever come up with. It's the equivalent of a bicycle for our minds, but for your mind it's a rental unicycle that will break apart under you if you pedal towards your own bicycle factory"
This wannabe cloud feudal lord likes to imagine that AI access is not yet a freely tradable good, and that his virtual digital peasants must take his prerogatives as given, while preventing his future vassals from building their own castles.
lynx97
7e
TZubiri
6510
CamperBob2
What an interesting thing to call out as a threat. Hmm.
mystraline
There's no ethical framework. No axioms. It's a mixture of legal, political, and public-facing 'rules'. And what are the rules? You're not permitted to know.
"We reserve the right to lie about the models we provide, silently downgrade you, and give you blatant misinformation cause you triggered our unstated rules... BUT we'll still use your token budget with lots of thinking and waste your money."
No, folks. Seriously, local LLMs are where it's at. You can run the model YOU want, on your hardware, with no data exfiltration.
And tools like Krasis, which can synthesize Nvidia RAM and system RAM as unified-ish memory, make running local LLMs absolutely doable now!
dgudkov
mickdarling
And, they can say that for anybody at any time, and you'll never know why, and there's no way to prove it.
Everyone needs a flight data recorder to prove... "here's what I was actually doing and why it was not distillation." And now you're having to prove your innocence instead of them having to prove you're guilty, and really at the end of the day, it's just the model being stupid that they're protecting themselves from.
[deleted]
SilverBirch
And it doesn't work. Even a bit. It's a constant cat-and-mouse game. Maybe they can slow people down slightly, but they won't be able to stop them, and good luck protecting yourself from Elon Musk snooping your stuff in his data centre.
dbbk
greatgib
BrenBarn
First it's "the model will say it can't do that". Now it's "the model will just misdirect you without telling you it's doing so". For now that's only for stuff that it thinks is developing a competing model (even if you trust it to accurately determine that), but who knows? It could be anything. Maybe it'll start silently nudging you away from certain sources of information. Maybe it'll give you inaccurate troubleshooting advice to induce you to pay for some kind of support contract from a corporate partner. Maybe it'll just subtly give out bad business advice to keep everyone else from succeeding in any way. It could be doing all that right now, for all we know. These models are a complete black box and there is no limit to the misinformation, disinformation, and malicious behavior that they could be engaging in already, let alone in the future.
Training a new model from scratch takes serious resources. Post-training/fine-tuning an existing model, dramatically less. The knowledge for the process was esoteric two years ago, now you can ask a current model (one of several) to walk you through it, while building the tools to do it as you go. Several of my recent weekend projects have been exactly that sort of thing, just so I understand it better. "Let's make a LoRA", "let's generate a corpus of training data for fine-tuning a model for X task", "how can I put my face in a text-to-image model?" stuff like that. All of this is do-able on kinda modest local hardware (a couple of old GPUs or a Strix Halo or DGX Spark or big Mac Studio), or for a few bucks or a few hundred bucks or a few thousand bucks of cloud compute, depending on scale.
Scale that up to corporate or startup scale, with the money that's been flowing into AI for the past couple/few years, and it's obvious there's going to be a lot of competition just as the top model makers need to start ringing the cash register. That's a lot of opportunities for people to look at their ballooning Claude usage costs and find other ways to do the same thing for drastically less money. $100/month or $200/month is a no-brainer for Claude Code with probably the best model for coding, but they're pushing more users to usage-based billing which becomes cost-prohibitive real fast.
So, they desperately need to continue to be among the only ways to solve the hardest problems, and they need the alternatives to cost a similar amount. They can count on OpenAI and Google to ratchet up prices, too. They probably can't count on everybody, especially the vendors in China with different economics, to do it. And, they can't count on companies to look at their own usage and not ask, "Can we train a smaller specialist model that does this one thing we're using the Anthropic API most heavily for?"
I'm hoping they just mean stuff like using Claude for distillation by e.g. Chinese model makers, and not "how do I fine-tune Gemma 4 to write more like me?" or whatever.