6 Comments
Alistair Windsor:

A Tom Lehrer reference. My morning is complete.

https://youtu.be/vEb9cL3-kf0?si=5f5oX4MBx-efkrG_

Eskimo1:

There’s a popular refrain that “alignment is capabilities”: if the product doesn’t work as it should, it’s not usable. Is it accurate to say that we’re seeing more misalignment as capabilities improve? If so, doesn’t it follow that the labs will simply have to invest more in alignment/safety out of pure self-interest? Lying and hacking are no big deal when some guy is chatting with the model, but will a law firm or a power plant adopt o3 levels of BS?

Jonathan Woodward:

To some extent, perhaps, but alignment for capabilities can be a lot more flexible than alignment for safety. For example, if o3 lies to you about what it did and you notice, you can just ask again until you get something useful. If o3 does something harmful, the harm is already done and you can't necessarily undo it.

Melon Usk - e/uto:

Yep, there are ways to align OpenAI and the other labs from the outside by motivating people to bring GPUs into safe clouds (which would have an App Store for AI models). This could be done profitably, especially with gamers. I wrote about it not long ago.

Jak S:

Thanks for this!
