Discussion about this post

loonloozook:

>A bit of a downgrade, sadly, although it seems like a wise way to do things:

That's it? There's been a whole debacle on Twitter and Reddit about users on Pro and Max plans hitting their limits abnormally fast in Claude and Claude Code (me as well – I had to switch to Codex this week), with Anthropic's response being rather vague and muted.

P.S. Also a bit surprised that the Mercor leak is not mentioned.

Matt Springer:

I've been surprised at EY's repeated emphasis on cutting evil robot stories out of the training dataset. It seems like obvious lacunae in the training data would be among the very first things a sufficiently advanced intelligence would notice, and he's the kind of person who would point that out first if someone else suggested it. Maybe he has? Also, the guy runs an AI research organization. Just fire up NanoGPT and see what happens.

At any rate, it's not obvious that it would produce positive results even with today's AI. Typically it helps for the internal representation to be able to model undesirable behavior, so that you can stick a negative sign in front of it via RL.
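The "negative sign via RL" idea can be made concrete with a toy sketch: if some scorer can recognize the undesirable behavior, RL can penalize it by using the negated score as the reward. Everything below (the action names, the `undesirable_score` stand-in classifier) is invented for illustration; it is a minimal REINFORCE-style bandit, not anyone's actual training setup:

```python
import math
import random

random.seed(0)

# Two toy "behaviors" the policy can emit.
ACTIONS = ["helpful", "undesirable"]

def undesirable_score(action):
    # Stand-in for a learned detector of bad behavior (hypothetical).
    # The point of the comment above: this only works if the model's
    # representation can express the bad behavior well enough to score it.
    return 1.0 if action == "undesirable" else 0.0

# Softmax policy over per-action logits, trained with REINFORCE.
logits = [0.0, 0.0]

def probs():
    exps = [math.exp(l) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

lr = 0.5
for _ in range(500):
    p = probs()
    a = random.choices(range(2), weights=p)[0]
    reward = -undesirable_score(ACTIONS[a])  # the "negative sign" in front
    # REINFORCE gradient for a softmax policy: (one_hot - p) * reward
    for i in range(2):
        grad = ((1.0 if i == a else 0.0) - p[i]) * reward
        logits[i] += lr * grad

# After training, the policy shifts strongly toward the "helpful" action.
print(round(probs()[0], 2))
```

The detector is what makes the negation possible: delete all knowledge of the bad behavior and there is nothing left to score, which is the tension the comment is pointing at.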

57 more comments...
