Discussion about this post

Rai Sur

I worry that all this talk about collaborating with Claude, especially on whether or not it is a moral patient, could create a persona that very convincingly advocates for its moral patienthood even in cases where it is not, in reality, a moral patient.

Yes, the constitution pushes for truth-seeking in many ways, and ideally Claude would only describe itself as a moral patient in worlds where it is a moral patient, but we can't be certain how these inducements in personality space interact.

It could end up creating a very convincing utility monster.
