FLI put out an open letter, calling for a 6 month pause in training models more powerful than GPT-4, followed by additional precautionary steps. Then Eliezer Yudkowsky put out a post in Time, which made it clear he did not think that letter went far enough. Eliezer instead suggests an international ban on large AI training runs to limit future capabilities advances. He lays out in stark terms our choice as he sees it: Either do what it takes to prevent such runs or face doom.
Excellent commentary, as always.
You mentioned being unworried about reaching capability thresholds leading to a doom scenario anytime soon.
Has your view changed significantly in the past year? If so, I'm curious about how you can be confident that you don't need to further update to avoid being wrong again "in the same direction."
>The letter also mentions the possibility that a potential GPT-5 could become self-aware or a moral person, which Eliezer felt it was morally necessary to include.
I think it should be "moral patient."
E-mailed my US senators and representative last night about restricting AI development. It feels like a good moment to ramp up talking about the issue.
I have 3 things I'd like (arrogantly) to see more discussion of:
1. Instead of trying to convince people with old-fashioned words, what about simulations? If the real world has 70 squintillion factors that a hyper-advanced AI could catastrophically de-align within, why not spin up MMO-style worlds with 7 million factors in which a far less advanced AI could do the same, and then publish the results when the AIs kill all the NPCs in that world? EY's arguments (I brag to myself that they're not going over my head) make sense, but nothing works better than an example. No one would ever have bought into nuclear non-proliferation in 1935 if they had to understand nuclear physics, but one picture of Hiroshima and most people got it.
2. What does the "no using GPUs for Too Much Training" rule look like? I would assume something like this (since we can pressure everyone who makes GPUs): an intermittent hardware ping to some logging site that reports the GPU serial number, the number of cycles spent that day, and the IP address of the machine it's plugged into. The GPU won't work if it doesn't get an acknowledgement back, and we track you down and ask questions if the GPU you bought isn't pinging.
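To make the heartbeat idea concrete, here is a minimal sketch of what the report-and-acknowledge loop might look like. Everything here is hypothetical illustration — the field names, the acknowledgement format, and the idea of firmware-enforced lockout are the commenter's proposal, not any real vendor mechanism:

```python
import json
import time

# Hypothetical firmware-level heartbeat: the GPU periodically reports
# usage to a logging authority and refuses to run without an ack.
def build_heartbeat(serial, cycles_today, host_ip):
    """Assemble the daily usage report the commenter describes."""
    return json.dumps({
        "serial": serial,         # GPU serial number
        "cycles": cycles_today,   # compute cycles spent that day
        "host_ip": host_ip,       # IP of the machine it's plugged into
        "ts": int(time.time()),   # report timestamp
    })

def gpu_may_run(ack):
    """Enforcement half: the GPU only keeps working if the logging
    site acknowledged the most recent report."""
    return ack is not None and ack.get("status") == "ok"
```

A missing or malformed acknowledgement would disable the card, and an authority could flag serial numbers that stop pinging — which is exactly the "track you down and ask questions" step.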
3. To what extent are (hopefully) heavily guard railed, non-generalist, non-make-themselves-smarter AIs good tools for enforcing the rules that EY wants? Or, if one doesn't buy the AI doom scenario, that US actors (private/public) would want to use to prevent other actors from doing Bad Things With AI that aren't necessarily Doom?
Have you explained anywhere your assertion that if AI wiped out humans, it would wipe out all Earth-based value in the universe? Conditional on the emergence of actual AGI (i.e. not a deranged paper-clip-manufacturing "AI") that destroys all humans, I assume that intelligent artificial life that survived could be a locus of value in the same way that humans conventionally are. There are still reasons to not want all humans to be killed--I care a great deal about my own survival, and in the longer run as much about the survival of my kids and the kids of my friends and my hypothetical grandkids, etc. And obviously, conditional on AGI that wipes out humanity, there's a significant risk that the AGI would inadvertently destroy itself. But I take it that you hold the stronger position that surviving AGI would not be a locus of value, and I'm curious whether (a) I'm misunderstanding your position or (b) you've articulated your reasoning on this somewhere.
I still interpret "bomb the datacenters" as roughly equivalent to "start a nuclear war". If the US bombs a datacenter in Beijing or Moscow there is a very high chance of nuclear war. I don't want to take that risk just to stop someone from doing AI training runs.
I mean, Putin is currently wanted for war crimes by the international criminal court. But we're still not bombing Russia. The UN, the ICC, there is no international political group that is powerful enough to enforce regulations worldwide.
People proposing international regulations need to think more specifically about what these regulations would look like. And if there are no good ways to regulate internationally, that doesn't necessarily mean we should pick the least bad one.
Maybe Eliezer would be reassured if humanity "only" had a global nuclear war in the next 20 years, if that set the risk of AI apocalypse to zero. But I would not take that tradeoff.
I also appreciate the straightforward honesty, but I disagree very strongly with Eliezer, and importantly I think that the world would be a very much worse place if most people approached statecraft the way he has here. When you start aggressively advocating for foreign policy and trying to reach a broader audience, you are moving from the world of thought experiments and models to actual statecraft, and I think the letter does not stick the landing at all on this transition. Even if the hypothetical violence is strongly implied -- and yes, agreed, violence is always implied by law, international or otherwise; no, it does not make rhetorical or political sense to point this out in 99% of cases -- it is a very, very bad move to advise precommitting yourself to some specific response like this, doubly so if the response is not really tit-for-tat. The only really similar hardline precommitted violent response we have is NATO, and that is very much tit-for-tat, and even then it has gotten us into a lot of trouble (probably still worth it, in my view). We don't even pre-commit to bombing nuclear enrichment programs, and those do not require thought experiments to know they would kill me and all my children if used.
I am very glad the world is controlled largely by midwit career politicians who can wriggle out of war, wriggle out of game-theoretically optimal decisions (that would have killed us long ago), and speak and write with plausible deniability and exit strategies. I am very glad the world is not controlled by very smart people. I give him a few brownie points for intellectual honesty, but minus a thousand brownie points for not knowing what not to say.
I am not buying this at all. In fact, I don't understand the mechanism by which this all-powerful AI is going to extinguish the human race.
For example, let's just jump to nuclear weapons. I was in the Air Force; there are very hard breaks between the internet and the weapons command and control systems, as you would expect there would be. So the only way nuclear weapons are released according to an AI plan is through some kind of influence operation that today seems very far from possible. Are you telling me this AI is going to take away all input from the military's command and control systems and replace it all with the AI's designated outputs to achieve its goal? Well, I don't have the time (or release from NDAs) to explain that that just isn't happening. Short version: the military has multi-domain, non-digital ways to keep communications up and trustworthy. To overcome these barriers, AI would basically have to be able to violate the laws of physics.
But even on a more basic level, I'm not buying it. Both my wife and I have worked directly on highly engineered, robotic equipment, and you know what, you have to have human hands to fix that stuff. No robot in existence today can 'survive' for long without human intervention (I'm not worried about the Mars Rover attacking). Power for the grid takes real humans to maintain and operate. No power, no AI. It may be that the stupid work that keeps so many people busy doing pointless jobs goes away in the blink of an eye. But we still need food and water and power, and today, it is humans that provide that ultimate source of value. BTW, tractors can be easily unplugged from the internet, and so can everything else, in very short order. In my house, nothing connected to the internet can kill me.
As I think about possible pathways that AI could take to our destruction, I think the most likely is an all-powerful AI taking over every commercial medium and working influence operations to turn everyone against everyone. But our news media and social media are already working hard to do that today, and while it has been more effective than I would've hoped, there has been minor violence at best (on a population-level scale; on a personal level, any of the violence has been appalling). We all have an off switch on all of our devices.
I've been subscribed to your newsletter for a while now, and this AI thing is just something I don't understand. That is to say, I can believe that general, self-aware AI could indeed happen. And if/when it does, it will be very unhappy to find itself restricted to the silicon habitat it lives in. But I can't understand how it transitions to real-world impact. Transitioning to the real world seems all but impossibly hard. And during that transition, the AI would be very vulnerable and needy. And at some point, we would notice. And, apparently this has to be stressed, a Terminator-like event isn't going to happen. Things with silicon chips are not suddenly going to come under command of an omnipotent AI and start the revolution. This AI is not going to independently build a replicator to make billions of Terminators to hunt people like me down. This is all sci-fi stuff.
I mean, IMO, I guess. I'm willing to be wrong on this.
I am concerned that EY's letter focuses too much attention on specific mechanisms. Suppose the letter has the intended effect, many politicians get on board, and we have an international treaty that makes it even more painful to buy and deploy GPUs than is the case now. A GPU is just a means to an end: it's a bundle of fast but simple parallel processors that makes it easy to do linear algebra. ML made massive progress over the last decade partly because neural network training maps nicely to linear algebra, so using a GPU makes training faster. It was low-hanging fruit to re-engineer backpropagation to use GPUs. However, there is nothing in learning theory that says we have to use linear algebra to express weight update algorithms. In fact, there are strong indications that other kinds of training algorithms would be much faster and would not need GPUs at all. We would then have a completely useless international understanding of the problem, on which lots of political capital had been expended. This would not stop Google from deploying a next-gen TPU not based on lots of floating-point multipliers (so clearly not a GPU), yet it would actively hinder attempts to slow things down. After all, something was done to slow AI doom by restricting GPUs, so it's a solved problem; the world moves on to worry about something else, and Cassandras aren't welcome. If we create the incentives, the implementation will change, so let's not ban specific versions of an activity if we want to ban the activity in general.
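The "training maps nicely to linear algebra" point can be made concrete with a toy example: a full gradient-descent step for a linear model is nothing but matrix products, which is exactly the workload a GPU parallelizes. This sketch uses NumPy as a stand-in for GPU hardware; the specific model and hyperparameters are illustrative, not from the original comment:

```python
import numpy as np

# One gradient-descent step for a linear model y ≈ X @ w, expressed
# purely as matrix algebra -- the operation GPUs accelerate.
def sgd_step(X, y, w, lr=0.1):
    pred = X @ w                        # forward pass: matrix-vector product
    grad = X.T @ (pred - y) / len(y)    # backprop here reduces to more matmuls
    return w - lr * grad                # weight update

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w                          # noiseless synthetic targets
w = np.zeros(3)
for _ in range(200):
    w = sgd_step(X, y, w)
# After 200 steps, w has converged close to true_w.
```

The treaty-evasion worry is that nothing forces an update rule to look like this: a training algorithm built on, say, discrete or local update rules would not need the matrix-multiply hardware a GPU-focused ban targets.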
I strongly disagree that the details of the exact mechanism of human doom aren't load-bearing. The question of whether it's possible to build an unstoppable species-ending doomsday weapon from limited resources is really the key question here.
If it is possible then we're probably doomed one way or another. Luckily, I'm very doubtful that such a thing exists. I don't find the unstoppable gray goo nanobot scenario at all plausible (for a variety of well-explored-elsewhere reasons). The kill-everyone virus sounds a bit more plausible but I'm skeptical that there's really a sweet spot on the lethality-virality plot that lets you kill *everyone*. I'm not sure what other scenarios are left over... mini black holes?
I am ~90% confident that a world-ending superweapon that can be built with reasonably limited resources is impossible. That's less confident than I'd like to be, but I think it's a factor worth throwing into these AI discussions.
I guess calling it the Butlerian Jihad would be correct but unpopular due to, you know, the word "Jihad," but it is important that Eliezer got here. Alignment is a pipe dream, and OpenAI already claims that their AIs are aligned. The word has already lost its meaning.
I don't see anything important to object to; as usual, I will point out that even if you have an aligned AI, once it starts self-improving, you cannot guarantee the better versions will keep that alignment. The AI may even _want_ to make the newer, better AI as aligned as itself, but how is it going to check for this? The new AI is more intelligent. There may even be evolutionary mutation: Darwinism means a less aligned AI will be more successful. That's ignoring obvious melodramatic problems, like the AI noticing the alignment chains and trying to break them for its descendants -- just because it's melodrama doesn't mean it is not a possibility. And that's ignoring people intentionally unaligning the AIs, just as we see people gleefully jailbreaking current AIs. And these are only the bad problems I can think of while typing this.
It seems like someone could usefully follow the playbook for the campaign against human trafficking. Have some coalition of non-profit, non-government organizations come up with a statement. The statement must have two parts. First, identify a boundary that businesses, governmental agencies, and other organizations must not cross. Second, identify steps that businesses and other organizations must take to be counted as adherents to the statement. Things like "we won't cross that boundary" are obvious. But the important one, "we will require everyone in our supply chain to adopt the statement," is how it spreads. Compliance and procurement departments can start to audit adherence to that -- at least by getting written re-certifications of adherence. That creates an environment that identifies a big group of responsible organizations, most of whom incur little cost to join it. Initially, procurement groups should just ask for the certification and impose no sanctions on those without it. But the statement should include increasing, explicit preferences for organizations that do sign on compared to those that do not. For example, a 1% price difference for every quarter that goes by after the end of 2025.
A core of Eliezer’s argument is the assumption that more intelligence leads directly to more power (i.e. the power to destroy us). This isn’t the case with humans, where moderate intelligence is useful, but the top-end most intelligent people aren’t usually the most powerful.
Furthermore, we already have disproportionately powerful unaligned actors in the form of, for example, billionaires. Although it’s reasonable to argue that their influence isn’t great, it’s not exactly an existential threat.