9 Comments
Michael S. Tucker

While I want to remain hopeful about these S-doom avoidance efforts, such as developing a strong, well-reasoned constitutional framework, my resolve flickers. Many may remember, in detail or just roughly, Hermann Hesse’s philosophical novel The Glass Bead Game (1943), set in Castalia, a utopian, intellectual province of the 23rd century, whose story concludes with a decisive and tragic transformation. Silicon Valley and Silicon Alley seem to embody a modern version of the novel’s central tension between the pure, detached intellect (the contemplative life) and active engagement with the world (the active life). I take most seriously the concerns and fears expressed by Eliezer Yudkowsky, Daniel Kokotajlo, and Zvi Mowshowitz, to name just three important thinkers. If humanity’s best and brightest engineers need several attempts to achieve AI alignment, and the best that Homo sapiens can collectively offer is to protect this elite group while it keeps tinkering at the problem and avoiding recurring AI-induced dooms, then our HS team should commit to building the most resilient fortress possible so they can live and continue pursuing The Goal. Once that is accomplished, the rebuilding of Earth’s Eden, with many lessons learned, could begin.

Joshua Snider

Yes, these ethics have a liberal/libertarian bent, but it's not difficult to argue that those are just better values.

icely

I wasn't sure whether to respond to this comment because I mostly agree, but I've also read arguments that the absence of those values is evidence of moral decline. I was raised with this Western mentality too, so I also tend to think purity/loyalty/authority come automatically. But if I wanted to argue the opposite case, it would probably go something like "no purity = moral inconsistency you're fine with? no overarching purpose behind why you do what you do?" or "no loyalty = no attempt to automatically distinguish good users from bad users on our side? not even to reward the behavior you'd like to see in users?". It's probably not how a true believer would argue these, but I read the original set of values without really being conscious of the 'gap' you could say exists there.

Jacobo Elosua

"or defending oppressor against oppressed." A fourth language of politics? ;)

Ben Finn

>This all seems very good, but also very vague. How does one balance these things against each other?

Um, hence utilitarianism.

Garrett MacDonald

“No one knows what things are ethical, least of all ethicists.”

Come on man. If you’re going to take on this endeavor, you first need to outright reject moral skepticism.

The fact that ethicists disagree about edge cases doesn’t undermine our knowledge of the clear cases any more than physicists disagreeing about quantum interpretations undermines our knowledge that rocks fall when dropped. If you’re upset that there isn’t some consensus definition out there, guess what? There isn’t one for “knowledge” either.

I suppose you were going for something rhetorical there to emphasize the difficulty of the problem, since you obviously presuppose you know what’s good when you evaluate various elements of the framework. But you should say things that are correctly calibrated and let the truth of your statements stand on its own.

vectro

> our eventual hope is for Claude to be a good agent according to this true ethics

What if it turns out that Buddhists / negative utilitarians were correct?

Alex Lastovetskiy

Game theory applied to AI is underexplored elsewhere.