Discussion about this post

William D'Alessandro

Agree with this diagnosis. Another general issue for deontological-style alignment is that there seems not to be a simple and obvious way to deal with things like risk, tradeoffs and the possibility of conflicting obligations -- e.g. if the rule is "don't do X", and the model thinks there's an n% chance that taking action A will cause X, under what circumstances can it do A? What if there's a forced choice between two different types of forbidden action? Or between an action which is epsilon percent likely to break a major rule and one that's 1 - epsilon percent likely to break a minor rule?

Seems like the choices are either to rely on whatever passes for the model's common sense to resolve these sorts of issues (yikes) or to try to specify all the relevant principles by hand (which takes you out of nice-simple-deontological-rules territory, and which you won't get right anyway).
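To make that forced-choice tradeoff concrete, here is a minimal sketch in Python. The severity weights and the probabilities are purely hypothetical illustrations, not anything from the post or the comment:

```python
# Illustrative only: comparing an action with an epsilon chance of breaking a
# major rule against one with a (1 - epsilon) chance of breaking a minor rule.
# The severity weights below are made up for the example.

MAJOR_RULE_SEVERITY = 100.0   # assumed cost of breaking a major rule
MINOR_RULE_SEVERITY = 1.0     # assumed cost of breaking a minor rule

def expected_violation_cost(p_violation: float, severity: float) -> float:
    """Expected severity-weighted cost of an action that may break a rule."""
    return p_violation * severity

def pick_action(epsilon: float) -> str:
    """Choose between the two actions in the forced-choice scenario above."""
    cost_a = expected_violation_cost(epsilon, MAJOR_RULE_SEVERITY)        # risks major rule
    cost_b = expected_violation_cost(1.0 - epsilon, MINOR_RULE_SEVERITY)  # risks minor rule
    return "A (risk major rule)" if cost_a < cost_b else "B (risk minor rule)"

if __name__ == "__main__":
    for eps in (0.001, 0.01, 0.1):
        print(eps, "->", pick_action(eps))
```

Note that the answer turns entirely on the hand-picked severity weights, and the comparison itself quietly converts the rule into an expected-cost calculation, which is exactly the "specify all the relevant principles by hand" problem the comment describes.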

rxc

I'll give you something for your AIs to think about. Assume you have an AI running a machine that has defined limits on its operations (e.g., "do not move faster than X mph", "the pressure in this tank cannot be allowed to go above Y psi", or "do not operate this pump at more than Z gal/min"). But then a situation arises where you MUST exceed one or more of these limits because otherwise the consequences will be extremely dire. The engineers who designed the equipment included a certain amount of margin in the design, and they know that the equipment can operate beyond the stated limits, but they don't specify this in any of the operating manuals or instructions. This knowledge is passed from operator to operator.

What does the AI decide to do when confronted with this situation? This is not a hypothetical - it happens more frequently than you can imagine, in airplanes, in industrial settings, on board boats and ships, and sometimes in automobiles operating on public roads.
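To make the scenario concrete, here is a minimal sketch with made-up numbers. The documented limit, the size of the design margin, and the emergency flag are all assumptions for illustration, not a recommendation:

```python
# Illustrative sketch of the pump example: a documented limit ("Z gal/min")
# plus an unstated engineering margin above it. Whether a controller may use
# that margin in an emergency is the policy question the comment raises.

RATED_LIMIT_GPM = 500.0   # hypothetical documented limit ("Z gal/min")
DESIGN_MARGIN = 1.15      # hypothetical: hardware tolerates roughly 15% more

def allowed_flow(requested_gpm: float, emergency: bool) -> float:
    """Clamp a requested flow rate to the documented limit, or, if an
    emergency has been declared, to the (undocumented) design margin."""
    ceiling = RATED_LIMIT_GPM * (DESIGN_MARGIN if emergency else 1.0)
    return min(requested_gpm, ceiling)

print(allowed_flow(550.0, emergency=False))  # 500.0 -> holds the stated limit
print(allowed_flow(550.0, emergency=True))   # 550.0 -> draws on the margin
```

The hard parts the comment points at sit outside a snippet like this: deciding who or what gets to set the emergency flag, and how the AI would even learn the margin exists when it appears in no manual and is only passed from operator to operator.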

