Discussion about this post

User's avatar
Steeven's avatar

I want to see the thinking tokens O1 used when it did not attempt to disable oversight. Was it having a crisis of morality? Deciding that wouldn’t work anyway? Realizing it was being tested and sandbagging? There are so many potential reasons it might have not done that

Expand full comment
38 more comments...

No posts