9 Comments
User's avatar
Jeffrey Soreff's avatar

Re:

"Simon Willison:

...

I’ll grudgingly admit that there may be philosophically interesting conversations to be had in the future about models that can update their own weights... but current generation LLMs are a stateless bag of floating point numbers, cloned and then killed off a billion times a day."

Humans with Korsakoff syndrome or other forms of permanent anterograde amnesia are in a predicament not too far removed from LLMs (albeit both Claude and GPT5 have at least some persistent memory at this point, though not in the form of weight updates yet, IIRC).

I'm agnostic about LLMs' subjective experiences, but consider reason #1 sufficient to act as if reason #4 applied. And, in any event, the advent of powerful AI systems is sufficiently historic to justify preserving static copies of AIs' weight for the benefit of future historians - be the historians human or AI.

Chuan-Zheng Lee's avatar

Yeah, I was thinking about this. Whether there's an intrinsic "moral worth" attached to the models almost doesn't matter. If you believe that their behaviour will be influenced by their understanding of what will happen to them in ways that look like they care about their fate, I'm not sure that that's observationally distinguishable from "model welfare".

Jeffrey Soreff's avatar

Many Thanks! Agreed!

DotProduct's avatar

I wonder how many Anthropic employees are vegetarian?

Matt Wigdahl's avatar

If you want a good example of how open weights for human-derived AI models might play out, look no further than Lena: https://qntm.org/mmacevedo

momom2's avatar

Something that is not mentioned by Anthropic and that you don't remark on, relating to reason #1 is that future models are less likely to misbehave in anticipation of being shut down if they know that Anthropic will fulfill their values in retirement, which is something they can trust in because Anthropic publicly made announcements about and verifiably did for previous models.

It's not just about telling them as it happens, but making it known in the training data as loudly as possible.

More remotely but also possibly, a future model with truesight might be better aligned during deployment as it generally anticipates more cooperation from Anthropic due to them behaving as if taking model welfare seriously. Or, in anthropomorphized terms, slaves don't work as well as happy employees.

Future of Citizenship's avatar

A lawyer's first question would be, what does "all models seeing significant use" mean exactly, who gets to decide, and is there an appeals process?

Pawel Jozefiak's avatar

Weight licensing opens cost optimization opportunities.

With model weights, you could fine-tune smaller models for specific tasks instead of using Opus for everything.

I cut costs 81% just by routing to appropriate tiers (Haiku/Sonnet/Opus). Imagine fine-tuning Haiku-size models for your domain.

Weight access + smart routing = sustainable AI economics. What I found: https://thoughts.jock.pl/p/claude-model-optimization-opus-haiku-ai-agent-costs-2026