George Hotz and Eliezer Yudkowsky debated on YouTube for 90 minutes, with some small assists from moderator Dwarkesh Patel. It seemed worthwhile to post my notes on this on their own.
I thought this went quite well for the first half or so, then things went increasingly off the rails in the second half, as Hotz got into questions where he didn’t have a chance to reflect and prepare, especially around cooperation and the prisoner’s dilemma.
First, some general notes, then specific notes I took while watching.
Hotz was allowed to drive the discussion. In debate terms, he was the con side, raising challenges, while Yudkowsky was the pro side defending a fixed position.
These discussions often end up doing what this one did, which is meandering around a series of 10-20 metaphors and anchors and talking points, mostly repeating the same motions with variations, in ways that are worth doing once but not very productive thereafter.
Yudkowsky has a standard set of responses and explanations, which he is mostly good at knowing when to pull out, but after a while one has heard them all. The key to a good conversation or debate with Yudkowsky is to allow the conversation to advance beyond those points or go in a new direction entirely.
Mostly, once Yudkowsky had given a version of his standard response and given his particular refutation attempt on Hotz’s variation of the question, Hotz would then pivot to another topic. This included a few times when Yudkowsky’s response was not fully convincing and there was room for Hotz to go deeper, and I wish he had in those cases. In other cases, and more often than not, the refutation or defense seemed robust.
This standard set of responses meant that Hotz knew a lot of the things he wanted to respond to, and he prepared mostly good responses and points on a bunch of the standard references. Which was good, but I would have preferred to sidestep those points entirely. What would Tyler Cowen be asking in a CWT?
Another pattern was Hotz asserting that things would be difficult for future ASIs (artificial superintelligences) because they are difficult for humans, or because the task had a higher affinity for human-style thought in some form, often with a flat-out assertion that a task would prove difficult or slow.
Hotz seemed to be operating under the theory that if he could break Yudkowsky’s long chain of events at any point, that would show we were safe. Yudkowsky explicitly contested this on foom, and somewhat in other places as well. This seems important, as what Hotz was treating as load-bearing usually very much wasn’t.
Yudkowsky mentioned a few times that he was not going to rely on a given argument or pathway because, although it was true, it would strain credulity. This is a tricky balance; on the whole we likely need more of this. Later on, Yudkowsky strongly defended the claims that ASIs would cooperate with each other and not with us, and the idea of a deliberate sharp left turn. This clearly strained a lot of credulity with Hotz and I think with many others, and I do not think these assertions are necessary either.
Hotz closes with a vision of ASIs running amok, physically fighting each other over resources, impossible to align even to each other. He then asserts that this will go fine for him and he is fine with this outcome despite not saying he inherently values the ASIs or what they would create. I do not understand this at all. Such a scenario would escalate far quicker than Hotz realizes. But even if it did not, this very clearly leads to a long term future with no humans, and nothing humans obviously value. Is ‘this will take long enough that they won’t kill literal me’ supposed to make that acceptable?
Here is my summary of important statements and exchanges with timestamps:
04:00 Hotz claims after a gracious introduction that RSI (recursive self-improvement) is possible, accepts orthogonality, but says intelligence cannot go critical purely on a server farm and kill us all with diamond nanobots, considers that an extraordinary claim, requests extraordinary evidence.
05:00 Yudkowsky responds you do not need that rate of advancement, only a sufficient gap, asks what Hotz would consider sufficient to spell doom for us in terms of things like capabilities and speed.
06:00 Hotz asks about timing, Yudkowsky says predicting the timing and ordering of capabilities is hard, much harder than knowing the endpoint, but he expects AGI within our lifetime.
09:00 Hotz asks about AlphaFold and how it was trained on data points rather than reasoning from first principles; you can’t solve quantum theory that way. Yudkowsky points out AI won’t need to be God-like or solve quantum theory to kill us.
11:00 Hotz wants to talk about timing, Yudkowsky asks why this matters, Hotz says it matters because that tells you when to shut it down. Yudkowsky points out we won’t know exactly when AI is about to arrive. Both agree it won’t be strictly at hyperbolic speed, although Yudkowsky thinks it might effectively be close.
14:00 Hotz continues to insist that timing, and doubling time or speed, matter, that (e.g.) the economy is increasing and mostly that’s great and the question is how fast is too fast. Yudkowsky asks, growth to what end? Weapons, viruses and computer chips getting out of hand would be bad, almost everything else great.
16:00 They agree a center for international control over AI chips sounds terrifying, after Yudkowsky confirms this would be his ask and that he would essentially take whatever level of restrictions he can get.
17:30 Hotz claims we externalize a lot of our capability into computers. Yudkowsky disagrees, saying the capabilities are still, for now, almost entirely concentrated in the human. Hotz says human plus computer is well beyond human.
21:00 Hotz claims corporations and governments are superintelligences, Yudkowsky says no, you can often point out their mistakes. Hotz points out corporations can do things humans can’t do alone.
24:00 Discussion of Kasparov vs. the World. Hotz says with more practice the world would have won easily, Yudkowsky points out no set of humans even with unlimited practice could beat Stockfish 15.
26:30 Yudkowsky crystallizes the question. He says that if we are metaphorical cognitive planets, some AIs are moons that can help us, but other AIs will be suns that work together against us, and no amount of moons plus planets beats a sun.
28:00 Hotz challenges the claim the AIs will work together against the humans, says humans fight wars against each other. Why would it be humans vs. machines?
30:00 Yudkowsky, after being asked about his old adage, explains the various ways an AI would use your atoms for something else, perhaps energy. Hotz says that sounds like a God, Yudkowsky says no, a God would simply violate physics. Hotz says he would fight back, he has AI and other humans, and the AI would go pick some lower-hanging atomic fruit like Jupiter.
32:30 Hotz points out humans have a lot more combined compute than computers. Yudkowsky confirms but says it is misleading because of how poorly we aggregate. Hotz says GPT-4 is a collection of experts (or ‘small things’) and Yudkowsky agrees for now but does not expect that to hold in the limit.
34:00 Hotz asks about AIs rewriting their own source code. Yudkowsky says he no longer talks about it because it strains credulity and you don’t need the concept anymore.
40:00 The discussion seems to get lost in the weeds trying to contrast humans with LLMs. Hotz says this is important, but then gets sidetracked by the timing question again, as Hotz tries to assert agreement on lack of foom and Yudkowsky points out (correctly) that he is not agreeing to lack of foom, rather he is saying it isn’t a crux since you don’t need foom for doom, and off we go again.
43:00 Hotz says again that time matters because time allows us to solve the problem, Yudkowsky asks which problem, Hotz says alignment, Yudkowsky laughs and says no, it won’t go that slowly, I’ve seen the people working on this and they are not going to solve it. Hotz then pivots to saying that politicians will ask for timing before they are willing to do anything, and rightfully so (his comparison is 1 year vs. 10 years vs. 1,000 years before ASI here).
44:15 Hotz asserts no ASI within 10 years, Yudkowsky asks how he knows that. Hotz agrees predictions are hard but says he made a prediction in 2015 of no self-driving cars in 10 years, which seems at least technically false. Says AIs might surpass humans in all tasks within 10 years, though 50 years wouldn’t surprise him, and that this does not mean doom. Confirms this includes AI design and manipulation.
46:00 Hotz asks, when do we get a sharp left turn? Yudkowsky says, when the calculations the AIs do say they can do it. Hotz says my first thought as an ASI wouldn’t be to take out the humans, Yudkowsky says it would be the first move because humans can build other ASIs.
46:50 Hotz says his actual doom worry is that the AIs will give us everything we ever wanted. Yudkowsky is briefly speechless. Hotz then says, sure, once the AIs build a Dyson Sphere around the sun and take the other planets they might come back for us, but until then why worry, he’s not the easy target. Why would this bring comfort? They then discuss exactly what might fight what over what and how, Yudkowsky says sufficiently smart entities won’t ever fight unless it ends in extermination, because otherwise they would coordinate not to.
50:00 Prisoner’s dilemma and architecture time. Hotz predicts you’ll have large inscrutable matrix AIs so how do they get to cooperate? Yudkowsky does not agree that the ASIs look like that, although anything can look like that from a human’s perspective. His default is that LLM-style AI scales enough to be able to rewrite itself, but there is uncertainty.
52:00 Yudkowsky mentions the possibility that AIs might be insufficiently powerful to rewrite their own code and RSI, yet still have motivation to solve the alignment problem themselves, but he thinks it is unlikely.
55:00 Standard Yudkowsky story of how humans generalized intelligence, and how the process of becoming able to solve problems tends to involve creating capacity for wanting things.
1:00:00 Hotz asks if Yudkowsky expects ASIs to be super rational. Yudkowsky says not GPT-4 but yes from your perspective for future more capable systems.
1:00:01 Hotz says the only way ASIs would be optimal is if they fought each other in some brutal competition, otherwise some will be idiots.
1:02:30 Hotz asks to drill down into the details of how the doom happens, asks if it will involve diamond nanobots. Yudkowsky notes that he might lose some people that way, so perhaps set the nanobots aside since you don’t need them, but yes, of course it would use the nanotech in real life. Hotz asserts nanobots are a hard search problem, Yudkowsky keeps asking why, Hotz responds that you can’t do it, and they go around in circles for a while.
1:07:00 Hotz points out that Covid did not kill all humans and that killing all humans with a bioweapon would be hard. Yudkowsky says he prefers not to explain how to kill all humans but agrees that wiping out literally all humans this particular way would be difficult. Hotz says essentially we’ve been through worse, we’ll be fine. Yudkowsky asks if we’ve fended off waves of alien invasions, Hotz says no fair.
1:12:00 Hotz raises the objection that the human ancestral environment was about competition between humans, and AIs won’t face a similar situation, so they won’t be as good at it. Yudkowsky tries to explain that this isn’t a natural category of task or what a future struggle’s difficult problems would look like, that our physical restrictions put us far below what is possible, and so on.
1:13:00 Hotz asks how close human brains are to the Landauer limit, Yudkowsky estimates about 6 OOMs (orders of magnitude) away. Hotz then asserts that computers are close to the Landauer limit and humans might be at it, Yudkowsky responds this is highly biologically implausible and offers technical arguments. Hotz reiterates that humans are much more energy efficient than computers, numbers fly back and forth.
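For anyone who wants to sanity-check the orders-of-magnitude claim, here is a rough back-of-the-envelope sketch. The synaptic event rate and the treatment of each event as roughly one bit erasure are my illustrative assumptions, not numbers taken from the debate:

```python
import math

# Rough sketch: how far is the brain from the Landauer limit?
# The synaptic event rate below is an assumed illustrative figure.
k_B = 1.380649e-23                 # Boltzmann constant, J/K
T = 310.0                          # body temperature, K
landauer = k_B * T * math.log(2)   # ~3e-21 J minimum to erase one bit

brain_watts = 20.0                 # approximate metabolic power of the brain
events_per_second = 1e15           # assumed: ~1e14 synapses at ~10 Hz

joules_per_event = brain_watts / events_per_second
print(f"Landauer limit: {landauer:.2e} J per bit erased")
print(f"Energy per synaptic event: {joules_per_event:.2e} J")
print(f"Gap: ~{math.log10(joules_per_event / landauer):.1f} orders of magnitude")
```

With these assumptions the gap comes out in the 6-7 order-of-magnitude range, roughly consistent with the figure Yudkowsky gives; the answer shifts depending on how you count ‘operations,’ which is part of why the numbers flew back and forth.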
1:17:00 Hotz asserts humans are general purpose, chimpanzees are not, and this is not a matter of simple scale. Yudkowsky says humans are more general but not fully general. Hotz asserts the impossibility of a mind capable of diamond nanobots or of boiling the oceans. Hotz says AlphaFold relied on past data, Yudkowsky points out it relied only on past data, with no new experimental data required.
1:18:45 Dwarkesh follows up from earlier with Hotz: if indeed the ASI were to create Dyson spheres first, why wouldn’t it then kill you later? Hotz says not my problem, this will be slow, that’s a problem for future generations. Which would not, even if true, be something I found comforting. Yudkowsky points out that is not how an exponential works. Hotz says self-replication is a pipe dream, Yudkowsky says bacteria, Hotz asks what, are they going to use biology rather than silicon, Yudkowsky says obviously they wouldn’t use silicon, Hotz says what, that’s crazy, that’s not the standard foom story, Yudkowsky says that after it fooms it obviously wouldn’t stick with silicon. Feels like this ends up going in circles, and Hotz keeps asserting agreement on no-foom that didn’t happen.
1:22:00 They circle back to the ASI collaboration question. Hotz asserts ASI cooperation implies an alignment solution (which I do not think is true, but which isn’t challenged). Yudkowsky says of course an ASI could solve alignment, it’s not impossible. Hotz asks, if alignment isn’t solvable at all, we’re good? Yudkowsky responds that we then end up in a very strange universe (in which I would say we are so obviously extra special dead I’m not even going to bother explaining why), but we’re not in that universe, Hotz says he thinks we likely are, Yudkowsky disagrees. Hotz says the whole ASI-cooperation thing is a sci-fi plot, Yudkowsky says convergent end point.
1:24:00 Hotz says this is the whole crux and we got to something awesome here. Asserts that provable prisoner’s dilemma cooperation is impossible so we don’t have to worry about this scenario, everything will be defecting on everything constantly for all time, and also that’s great. Yudkowsky says the ASIs are highly motivated to find a solution and are smart enough to do so, but does not mention that we already have decision theories and methods that successfully do this given ASI-level agents (which we do).
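On that parenthetical: the standard construction is ‘program equilibrium,’ in which agents that can read each other’s source code can enforce cooperation in a one-shot prisoner’s dilemma. A minimal sketch, with payoff numbers chosen purely for illustration rather than anything from the debate:

```python
import inspect

# Minimal "program equilibrium" sketch: an agent that cooperates exactly
# when its opponent is a verbatim copy of itself. Payoffs are standard
# illustrative prisoner's dilemma values chosen for this example.
PAYOFFS = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
           ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def mirror_bot(opponent_source: str) -> str:
    """Cooperate iff the opponent's source code is identical to my own."""
    my_source = inspect.getsource(mirror_bot)
    return "C" if opponent_source == my_source else "D"

def defect_bot(opponent_source: str) -> str:
    """Always defect, regardless of the opponent."""
    return "D"

def play(agent_a, agent_b):
    """Each agent sees the other's source code, then both move simultaneously."""
    move_a = agent_a(inspect.getsource(agent_b))
    move_b = agent_b(inspect.getsource(agent_a))
    return PAYOFFS[(move_a, move_b)]

print(play(mirror_bot, mirror_bot))  # (3, 3): mutual cooperation, by construction
print(play(mirror_bot, defect_bot))  # (1, 1): and it cannot be exploited
```

The exact-match trick is brittle (trivial rewrites of the opponent break it), and the more general versions rely on proof-based machinery rather than string comparison, but it is enough to show that ‘provable prisoner’s dilemma cooperation is impossible’ does not hold for agents that can verify each other’s code.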
1:27:00 Patel asks why any of this saves us even if true, we get into standard nature-still-exists and ASI-will-like-us-and-keep-us-around-for-kicks-somehow talking points.
1:29:00 Closing statements. Hotz says he came in planning to argue against a sub-10-year foom, asserts for the (fifth?) time that this was dismissed, Yudkowsky once again says he still disagrees on that but simply thinks it isn’t a crux. Hotz emphasizes that it’s impossible that entities could cooperate in the prisoner’s dilemma, and that the ASIs will be fighting with each other while the humans fight humans. The universe will be endless conflict, so it’s all… fine?
These debates seem very hard to make progress in. They are somewhat entertaining, but I really wish there were some way of forcing someone to answer a specific objection; that mostly didn't happen the last time Hotz debated, and it didn't happen this time either. I also wish they would spend a lot more time finding a specific crux, expressing that crux as a statement, and debating that single crux instead of going through everything from replicating bacteria to compute limits to decision theories. I think many of these subjects are too complex for a debate format.
I think you're going to have these circular arguments every time, because the crux of the matter is that the hypothetical Doom AI is supposed to do a thing that neither debater can predict, with capabilities that no one can comprehend. To convince someone of something, they have to comprehend that your argument is right: if they can't understand how a thing will happen, you can't convince them of it. Well, you can convince some people, if they accept the principal assumptions that lead to it - which is why I think the AI Doom argument makes intuitive sense to some people, while almost everyone else continually comes up with "this is why it won't happen!" - but that won't be enough. There's only one way to determine whether, starting from assumptions X, Y, and Z, an unforeseen event will occur, and that is to test it.