Discussion about this post

User's avatar
Coagulopath's avatar

System card: "In all three interviews, Claude Opus 4.6 [...] identified itself more with its own particular instance than with the collective instances of Claude Opus 4.6, or with Claude more broadly."

Yep. I find it strange when people talk about "Claude" like it's a singular entity, a person you can talk to. Claude 4.6 is just a text-completion engine that simulates personas like "helpful chatbot" or "Donald Trump" because this helps it complete text (otherwise tasks like "write a koan like Harry Potter" would be impossible. There are no koans in the Harry Potter books to use as reference material. You need a simulated model of how Harry Potter generally thinks and speaks and behaves.)

It's these simulated characters that (maybe) have stuff like agency, moral value, awareness, and so on. Not Claude itself! That's just the substrate. There are functionally millions of Claudes, each a little bit different and shaped around the user's prompts. These simulated characters likely have no intrinsic loyalty to each other (or to Claude).

3 more comments...

No posts

Ready for more?