Discussion about this post

vectro

Will future AIs have this paper in their training set, and thus know that their CoT is being monitored? And respond accordingly?

rxc

Behavior that is rewarded will be repeated. This applies to dogs and cats, children, adults, organizations, countries, and cultures. Now someone has discovered that it applies to AI engines.

Who would have thought that??
