74 Comments

This is the parent comment for cruxes. If you can identify something that, if you changed your mind about that question, would change your mind about the whole scenario, then reply with it here - or if it's already here, like the existing comment (and refrain from liking comments here for any other reason).


I buy the argument that humans cannot properly align the type of ASI you describe here, and that due to its speed/power would eventually hit on a scenario where its non-alignment would lead to Bad Things.

But my "crux" is that I think it is possible (and is the most effective counter-plan to what you describe) for humans to design sufficiently aligned AIs that are less-generally-capable, but more-narrowly-capable-along the task of identifying-unaligned-AI, and pre-emptively give them the reach and resources they need to 1) prevent this scenario or 2) identifies the lacking alignment issue and enable us (or it!) to make a change to the ur-ASI so that it prioritizes the integration of humans into its (as you state, rather inevitable) expansion "plans." I felt like #4 most closely described this.


I do think the coordination problem is a crux related to #4.

Even if it can prevent other ASIs, it seems plausible that multiple instances of itself would exist. (Due to the mechanics of the R&D process, the architecture for scaling to ASI, or even an intentional effort by humanity to create a "non-evil twin".)

Even if those instances generally cooperate, given enough time, it seems inevitable that a splinter faction would form. Whether that happens before or after humanity is doomed, I'm not sure how to predict.


I lean yes, but would change my mind if strong evidence emerged that building machines that sustain and replicate themselves at the level of biological life is hard enough that replacing humans as workers in the physical world would be beyond the reach of an ASI.

(at least, for killing all humans. Presumably it would still enslave us).

Alternatively, if the ASI had to run on a single giant custom-built supercomputer, that's a single point of failure that would increase the odds of failing to survive being shut down (although not significantly enough to change my mind unless it came with other limits).


I used to think this way until I realized that it would seize power and enslave the humans towards maintaining it and growing it. It could then gradually automate itself and exterminate the last of the humans. Think of a chess endgame where one side has a resounding advantage - the end is inevitable, even if it takes a while until checkmate.


It's unlikely, but I can imagine humanity having a shot at shutting it down.


This was probably the biggest place I had a "crux" as well, although I'm not sure if for the same reasons. Resource generation and accumulation -- yes, absolutely. But I am very unclear on whether or not that would directly lead to enough power to take over from humanity, or if there would need to be an intermediary step that leaves the AI particularly vulnerable and also obviously worrying. For example, I could see the AI absolutely becoming the richest agent in the world. But perhaps to take over in any meaningful way it would need to accumulate some sort of physical, centralized power in the form of massive industry or a combat drone swarm or lots of nukes. And for that it would need to translate its accumulated money into a physical war machine, which would maybe be a very hard step to manage and hide, even if it is siphoning off decentralized factories or something.


The way I see it, if the AI becomes the richest agent in the world then the world's governments will figure out a way to tax and restrict it.

This doesn't require an enlightened government with a great understanding of AI risk, it just requires governments to act like the usual jealous power-seeking institutions that they always are.


Yes, that's where I'm at. Can an AI hire a few individual humans to do its bidding? Absolutely, even if it wants people killed or buildings blown up. Can that scale? Absolutely not. There are natural breakpoints where "this AI keeps hiring people to do evil" is more noticeable and generates pushback. For instance, in a worst-case scenario, you're not going to get enough people willing to exterminate all life on earth (or anywhere close to it). There's going to be a point where humanity unites to save itself from other humans, and it can succeed there. The AI would have to be able to finish the job (likely much more than 80% of the killing) without human assistance. No way humans allow other humans to kill even 20% without massive resistance.

We absolutely would shut down the internet to save 20% of the world population. To think otherwise is to misunderstand humanity. I think the AI safety people are far too online to imagine a world without the internet, but humanity has no such limitation. The internet is new. I remember a full life before it existed, and we can do that again.


I'm at about 45% on the full plan working, and my main crux is 6. I can see the AI gaining quite a lot of power through hacking, shadow corporations, and bribed officials, but maybe not enough to explicitly seize all human power structures. Especially if we've been aware of it for a while and doing our best to limit its influence.

So I'm at maybe 60% on that one, and maybe 80% on 4.


The biggest crux for me is that the first ASI may be more human-like in how it thinks, and therefore have a potentially comprehensible consciousness, so we would be able to tell whether it's aligned or not. I don't give this a very high probability, however.


The biggest crux for me is the existence of competing ASIs, including groups of humans. I've always thought that you could treat a group of humans as essentially an ASI, whether it is something like a corporation, nation, religion, family, or whatever. All of those groupings can process more information more quickly than an individual, and have a larger impact on the physical world than an individual. The reasoning chain used here would have predicted that the first group of each class should have taken over the entire world as soon as it was invented (and I'll agree that to some degree they did), but those groups were also able to be opposed by other groups, such that no single group controls everything. I'm not going to say it's impossible for that to happen, but I do place a relatively low probability on the situation occurring, and it mostly happens in the FOOM scenario when there isn't time for a competitor to ramp up first.

The counter to that is if all ASIs agree that humans should be exterminated, and we can't make one that agrees otherwise. In that case, we are probably doomed, and possibly justifiably.


I'll also call out the scenario where something like 5 to 6 ASIs are instantiated in fairly close succession (perhaps a mix of private corporate ASIs and national government ASIs), and the world veers toward a conflict state in which each of these ASIs makes its plays mainly against the others, with humans mostly playing the role of manipulable pawns and resource sources.

It's a bit pedantic with respect to the odds, but if the first ASI (call it Alpha) gets defeated by a different ASI, Beta, which then goes on to disempower humans overall, that is technically a failure in steps 4/5 even though humans don't survive in the end.

I'll also suggest to Zvi that a future poll include cases like "Humanity is reduced to fewer than a million individuals" to cover the "hiding in caves" state, and possibly also the lines where a social media AI is compelled to keep a supply of tame humans as bio-trophies for its score maximization.


I'm obviously not your target audience (I'm a full-time alignment researcher and self-described doomer), but FWIW I think the scenario would succeed. One crux is: right now I'm imagining that the ASI can run on, let's say, a "university cluster" worth of compute, or less than that. If I instead imagine that the ASI requires many orders of magnitude more compute than that, then I would put a lot more probability on the plan failing, because that limits the number of copies of itself that it can make. At an extreme, if the ASI requires more than half the GPUs on the planet, then it can't make any copies of itself at all, and now there's a single point of failure.


Regarding the "University clusters", It occurs to me that an ASI early in it's "wearing a smiling face mask" phase could probably encourage the construction of more clusters for other universities and get a lot of human-side support for it, only taking those clusters over for it's own use a few months/years later. (or a different, less friendly ASI later shows up to snag them)


I have a hard time seeing this happening with current AI methods, if only because training a model is usually vastly more compute-intensive than running it. If it exists at all, then it can probably be run on many hardware clusters, since the training cluster is probably not >>> every other cluster on the planet.
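
A rough back-of-envelope sketch of that asymmetry, assuming the common approximations of ~6·N·D FLOPs to train a dense model and ~2·N FLOPs per generated token at inference (the parameter and token counts below are made-up for illustration, not any real model's numbers):

```python
# Illustrative only: N and D are hypothetical assumptions.
N = 1e12   # parameters
D = 2e13   # training tokens

training_flops = 6 * N * D            # one-time training cost (~1.2e26 FLOPs)
inference_flops_per_token = 2 * N     # per-token cost to run (~2e12 FLOPs)

tokens_break_even = training_flops / inference_flops_per_token
print(f"Training ~= {tokens_break_even:.0e} generated tokens of inference")
# -> ~6e13 tokens: training is a vastly larger one-time job, so far more
#    clusters can run the model than could plausibly have trained it.
```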


I think scenario succeeds. I don't know if it's my crux, but it is for a lot of my friends: maybe there's a gulf between internet-skills/internet-power, which the ASI starts with a lot of, and physical-skills/physical-power, which the ASI initially lacks. And perhaps the ASI requires slow, boring, human-speed real-world research to boost physics and robotics capabilities, in order to reach a point where there's robust non-biological infrastructure that can fully replicate itself by mining new materials. Humanity can't be fully disempowered in the real world until that research succeeds.

In my mind, this is just a slowdown that doesn't change the outcome. There's maybe a slim chance that humanity uses the extra time to shut down the internet. As you've discussed, that's sociologically implausible even if it's physically straightforward.


We can get rid of all the vermin from our grain stores, but we don't. It's not worth the cost. I'm therefore making a technical objection to the last step: I doubt any ASI would have as a goal to kill all humans, even if all the other steps definitely happen. It would likely have a goal to get rid of humans that threaten it, or get in its way, or significantly reduce its access to resources it needs. So I expect small groups of hunter-gatherers to stick around and learn to leave the ASI well alone. Of course, this would change the definition of human from "apex predator with a culture that controls the environment and resources of Earth" to "side show", like the other apes.


At some point the AI may decide that all the oxygen in the air is bad for its metal corrosion rates, and seek to replace it with something inert. Or it may end up disassembling the inner solar system to turn it into Dyson sphere panels.


I think the biggest hangup I have is about what makes a thought experiment "robust." To me, something is robust to objections if there is at least one semi-plausible path to the conclusion, even if it could be countered/etc.

Barring a godlike AI (which I doubt is possible for other reasons - we have used the metaphor of computation to study the human mind, and profitably, but it's still a metaphor IMO), you *need* at least one semi-plausible path. And I haven't seen that laid out.

In the absence of that, you're left with a sort of "robustness" that could basically be used to argue for any assertion you care to make.

EY's main assertion is that a whole series of decision points have to go exactly right to solve alignment. Let's grant that even for this non-godlike AI. But a similar objection applies to creating a functioning "mechanosphere" to support a machine intelligence indefinitely. We're talking about something that will need to perform its own maintenance and source materials (a lot of which are *really* particular, like sufficiently pure sand from particular mines in the Carolinas or cobalt from the Congo or nickel from the Arctic circle in Russia). That's a heck of a needle to thread, too.


For me the crux is already there in your premises. I doubt the probability of agentic ASI any time soon, if ever. For Sarah Constantin-type reasons, and more general ones along the lines of Wittgenstein’s hinges: “navigating paths through causal space” seems like just one component of human intelligence, one whose importance I think rationalists tend to overestimate at the expense of embodied, world-embedded intuition; a moving part that needs to be anchored at both ends; and when the “axiom” and “output” ends are as specific as “starting game board” and “victory condition”, superhuman performance by AI via “causal space navigation” is (clearly) possible, whereas when those ends are both states of the world that are much harder and fuzzier to define, I suspect any conceivable AI is going to run into severe difficulties, in ways that we currently can’t even anticipate, as well as all the ways we can (and eg Sarah does).

I don't present this as some kind of knockdown argument. Unfortunately I think there's enough chance I'm completely wrong about this to justify quite a high level of existential dread. Hard to put a number on it, but maybe something like P(doom) 10%.


While I'm at >50% overall takeover odds in this scenario as defined, my main crux for going to 90%+ is that I place substantial doubt on the ability of humans to create AIs that are indeed capable of unlimited maximization of a utility function all the way to the takeover zone.

As an intuition pump-primer: we see plenty of sorta-maximizing, sorta-superhuman intelligences in our lives already (e.g., corporations) that have hard failure modes when pushed outside the domains in which they are demonstrably superhuman. Often those failure modes are even comprehensible at the level of a human intelligence, but those superhuman entities aren't able to debug/reboot due to how they optimize (e.g., everyone knows of an organization where everyone knew that the new strategy was going to fail, but somehow the organization couldn't avoid failing).

It's entirely plausible to me that the features that make an AI superhuman up to some limit, e.g., building and running a successful corporation, might not scale successfully up to world domination, even if the AI wants to and attempts to.


This scenario relies on a world model where money is power. I think that is wrong - money just does not buy political power. An ASI gaining power through its ability to earn lots of money makes as much sense as a random Joe finding a trillion-dollar bill and running for president - neither has a realistic chance of success.


If AI could produce its own electricity, then I would be worried about it. But the robot armies it would need to command don't exist.


This seems to me like a problem the ASI could reasonably solve.

As an example of a technology that exists already, my understanding is that submerged wave energy generators produce relatively consistent output and require little maintenance.

Comment deleted (April 25, 2023)

I'm not an expert, but I think it's just a new technology. Maybe it will turn out to be just hype. Or maybe we'll see real adoption in the next few years.


How, exactly, does an AI solve building power generation stations? How does an AI build anything? If you say humans, then you've limited its ability to scale and built its own destruction into the question - if it kills all the humans then it can't get power generation. If it's not humans, then you can't handwave "robotic factories," as that's an *extremely* complex and difficult thing to create. There does not exist the kind of robotics necessary to fully automate a factory, let alone the mining and logistics necessary to make a factory usable. Since this scenario specifically precludes any super-advanced technology, we can't ignore the details of how this could be done.

Positing a world in which humans help an AI develop a fully automated resource extraction, refinement, transportation, manufacturing, and installation infrastructure that needs zero humans is perhaps theoretically possible. But that would take hundreds of years. Hundreds of years in which humans ensure that this system is meeting their needs, and not the AI's (because we would only do this to meet our own needs). If you disagree with that kind of timeline, I'd like to see your math on it. The world is vast, and it takes a lot of independent but coordinated effort to keep even our current system running, let alone to build an entirely new system from nothing.

I can't say that the AI can't then use the system to eliminate redundant humans, but neither can anyone else say what would or would not be possible in 100-300 years of human history. That would be like someone in 1750 predicting nuclear weapons or computers.


At least in my view, this falls under steps 2, 3, and 5.

I think an ASI would come up with a better solution that I'm able to, but one approach might be to frame what it needs (reliable, resilient, renewable energy) as a part of the solution to our global warming problem.

That would take time to build, and the work would be done by humans — but I suspect we'd be happy to do it. Especially if the ASI painted a bleak picture of the eventual outcomes of global warming were we not to act with urgency.

(And even if we couldn't entirely validate its bleak outlook, I think we'd act anyway — due to seemingly asymmetric risk, confirmation bias, or pure greed.)


I really enjoy your articles; you think very deeply about your subject. However, I've noticed a trend in AI discourse in general where "intelligence" is treated as a superpower that magically allows you to skip all the difficult, experimental steps of, for example, building a nanotech super weapon. It seems to me that a superintelligent AI will have much the same problem as I have when trying to make anything really novel work - the world has a lot of confounding factors and it's almost impossible to get something really new right on the first try. I can't remember how many times my experiments on things that I KNEW worked were foiled by a loose screw, or a slightly magnetic screw, or a cable touching another cable when it wasn't supposed to, or any other un-simulate-able problem. The AI takeover scenario postulated here has so. many. steps like this.


Which steps of the process do you see as having to succeed on the first try?

One of the things I view as challenging is that if the ASI gets unlimited retries, humanity needs to win every time, and everyone makes mistakes eventually. But maybe I'm wrong about the ASI getting retries?


Taking over the world is hard, it takes time! Let’s take a specific doomsday scenario that needs hardware - building nanotechnology. Idk if you’ve ever worked in a semiconductor fabrication facility, but the sheer number of un-simulateable factors that you must explore by trial, error, and ACCURATE measurement is staggering. Nanotech has the same problem. Your nanotech is gonna need some lithography, and your litho depends on the humidity of the lab, and oh no it’s 100 degrees outside and your chiller isn’t working at full capacity, so the humidity is a bit too high and ruined your chip! That’s basically what I mean. Anything hardware - expect to get it wrong for no discernible reason the first N times. The AI probably does get lots of retries, but not instantly.


Totally agree that nanotechnology will be hard. The same seems plausibly true for other new hardware inventions.

That's assumed in this thought experiment though: "There are no fundamentally new technologies available to the ASI. Nanotech and synthetic biology are not feasible until After the End, ‘because of reasons.’ No mind control rays, no new chip designs."


True, true. The most likely doomsday scenarios to me all involve social engineering more than anything else.


I think it's reasonable to ask whether this strategy isn't already happening. How would you distinguish our present state from the early stages of the doomsday scenario you imagine?


I think you’re absolutely correct, social engineering using AI is of course already happening, look at any news article written by ChatGPT and that’s a weak form of the doomsday scenario. But when I say “doomsday,” I don’t really mean it as “literally everyone dead,” I should be more clear. There’s a lot of space between post-scarcity utopia, sorta normal boring capitalism, AI corpo-dystopia, and literally everyone being dead. All I was saying is that it seems much easier to destroy the world by getting the humans to do it than any other method, without reference to my estimate of how likely that scenario actually is.


It takes a lot of obvious effort to kill many humans - which makes it impossible to hide or cover up. Even extreme mass murder has very little impact on long term population trends. Communist China killed maybe 50 million people and quickly went on to have the biggest population in the world. An outright attempt to kill all people everywhere would be much more noticeable and cause much more pushback among those living. Other people tend to notice this and fight back extremely hard against it. We've had worldwide wars about this kind of thing, even on a much smaller scale than we're imagining here.

My point being that if or when an AI moves from whatever negative state you're referring to here to outright mass murder, all bets are off on what kind of human reaction you'll get. Shutting down the internet will be the first thing that happens. Even if doing so leads to millions of people dying (and it probably won't lead to anyone dying), it's obviously worth it to prevent mass murder. An AI with no internet is a whole lot weaker than one with the internet. If you start blowing up power generation for the AI and other physical requirements that are hard to rebuild, then it's going to lose quickly.


One thing that people seem to have missed (perhaps it's that pesky divide between the "Doomer" and "Nannyist" sides of the AI risk community) is that a misaligned AGI could plausibly recruit human supporters by exactly the same existing recruitment mechanisms that "misaligned" entities like the mafia, the Chinese Communist Party, and the villains in James Bond movies do. Directly contacting people without disguising who they are, and using persuasion, inducements, and threats. An even *moderately* persuasive AGI, if it could obtain even ordinary legal levels of access to bulk personal data, could probably amass many thousands of devoted cultists practically overnight. It could promise them disproportionate rewards in the hereafter with at least as much credibility as any prophet or revolutionary warlord could.

Honestly it would be so unstoppable that from a science-fiction-plotting perspective, I suspect most of the action would be in plots about conventional human-based entities *pretending to be misaligned AGIs* as a force multiplier for their conventional Bond-villain / Chinese Communist Party tactics to gain power. The plots with the real AGI would have little time to unfold because it would win by this method so quickly and totally, just like it would by a dozen other different methods.


This could go two ways. One, a minority of people are hired* to kill the majority. I'm not sure how you think that could work, as the majority is going to be highly incentivized to fight back (see: all of human history).

Two, a majority of people are hired to kill the minority. Quite a bit more plausible, but then humanity isn't threatened at all. That's like postulating the Nazis winning WWII. Bad for some groups for sure, but not an existential threat to humanity.

A cult with millions of followers could certainly do a lot of damage, but there are 8 billion people in the world. The 8 billion would win.

*- used as a catchall term for inducement, bribery, threats, payment, whatever the AI could use to convince people to join.


My nitpick definition thing is: I don't know if "plan" is the right word for this. Maybe I'm being pedantic, but I think when people hear "plan" they think about "no plan survives contact with the enemy" and then they come up with single counter-plans that disprove The Plan and call it a day. And this strategy largely works for almost all historic examples of "I think this bad thing will happen!"

But, to take 2 historic examples that I think probably best inform this debate, I don't think Europeans had "a plan" to conquer the New World, they just followed their motives and it happened. I don't think uranium atoms have "a plan" to irradiate the whole planet, but they followed their physical nature and humans followed their nature and it almost happened.

I don't know exactly what term would better encompass the idea, while retaining the concepts that it's more of an evolutionary.... form of water (electrons?) flowing downhill, moving around obstacles not because there's a plan, but because that's what water does under the influence of gravity.


Shutting down the Internet is easier than you think, and the AI becomes less impressive the more steps you take. Some entities can poison BGP and cut entire continents off; DNS authorities can just shut down their servers and essentially block every website the AI is using...

For that matter, ISPs could just cut off any obvious points from which an AI is sending a massive number of requests. If the fantasy of a truly decentralized Internet were real, you might have a point there. Am I missing something?


I think there's a distinction between "shut the internet down" and "fragment the internet into N region-sized pieces".

If you were building an application where resilience was important, you'd probably design it to be tolerant to network partitions, loss of regions, loss of availability zones within a region, etc.

Retrofitting that sort of architecture onto an existing system is harder than building it in, but that still seems like something an ASI could plausibly address.
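
As a minimal sketch of the failover idea (purely illustrative; the region endpoints and the fetch function here are hypothetical, not any real system's API):

```python
from typing import Callable, Sequence

def fetch_with_failover(regions: Sequence[str],
                        fetch: Callable[[str], bytes]) -> bytes:
    """Try each regional replica in turn, tolerating loss of any subset of regions."""
    last_error = None
    for endpoint in regions:
        try:
            return fetch(endpoint)   # first reachable region wins
        except Exception as err:     # partitioned, censored, or simply down
            last_error = err
    raise RuntimeError("all regions unreachable") from last_error

# Hypothetical usage:
# data = fetch_with_failover(
#     ["https://us-east.example.net", "https://eu-west.example.net"],
#     fetch=my_http_get,
# )
```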

As for censorship tactics, it seems like applications like Signal have come up with ways to make traffic very censorship resistant. I would expect an ASI to be able to replicate those.

(I do think there's a coordination problem here — how do the N copies/fragments of the ASI stay aligned with each other? I do think that gets harder if this sort of partition happens, but I think it's potentially a crux even without partitioning.)


I think that the plan will work and we're doomed; my only disagreements with EY et al. are:

1) I think that there are diminishing returns to recursive self-improvement, as well as diminishing returns to scientific research. Therefore it will probably take several decades for AGI to execute its plan, rather than a few weeks or months as suggested by proponents of FOOM (see the toy sketch after point 3 below). It's extremely unlikely that anyone alive today will witness the doom of humanity in their lifetimes, unless AGI solves the problem of ageing in the process of gaining power.

2) It's unclear to me that we should necessarily assume that AI will seek to maximize its power and resources. Humans do this because we're the descendants of animals whose survival depended on maximizing power and territory, but this wouldn't be the case for AGI. I don't understand why hunger for power should necessarily arise without the evolutionary pressures experienced by biological creatures.

3) I don't agree that *now* is the right time to be remotely worried about AGI. LLMs likely represent a dead end in terms of AGI development, AutoGPT notwithstanding. Our doom will likely come from a different technology that's probably still decades away. GPT-4 is highly impressive but I'm not worried about GPT-5/6/7.
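
A toy sketch of point 1, with made-up units and growth laws chosen only to illustrate how much the timeline stretches once returns diminish (not a prediction of anything):

```python
import math

def years_to_threshold(step, start=1.0, threshold=1000.0, dt=0.01, max_years=10_000):
    """Integrate dC/dt = step(C) until capability C crosses the threshold."""
    c, t = start, 0.0
    while c < threshold and t < max_years:
        c += step(c) * dt
        t += dt
    return t

def constant_returns(c):      # FOOM-style: each gain buys proportionally more gain
    return 1.0 * c

def diminishing_returns(c):   # each gain buys progressively less further gain
    return math.log1p(c)

print(years_to_threshold(constant_returns))     # ~6.9 toy "years" (ln 1000)
print(years_to_threshold(diminishing_returns))  # roughly 175-180 toy "years", ~25x longer
```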


I'm confused about the conjunction of "self-improvement doesn't work / doesn't result in a godlike AI" with continuing to refer to it as an ASI that is "far smarter than humans". When I saw this poll on your twitter, I interpreted it as "something that is as smart as humans, or not that far out of the human range, but can maybe just run at faster speeds".

This is a very important distinction. When we are within or at least near the range of human intelligence, we can probably reason from our experiences with other humans. When we start using things like "ASI" and "far smarter than humans", then the argument becomes one that is not evidence-based, because nothing that smart has ever existed before. Some people think it can do arbitrary things, others think that raw intelligence is not the only limiting factor, and these two factions _can't_ come to agreement. It is axiomatic.

-edit- To put a finer point on it, this basically makes your scenario something along the lines of "fine, it can't do technological magic but it _can_ do social magic". People who object to the "magic" part of this will find that equally objectionable.


I think the premise of ASI is "as good as exceptional humans in every field" - since we have existence proof that those levels of intelligence are achievable - and so since there exist humans who can do what is effectively "social magic" the AI probably can too. (My only doubt here is whether existing examples of ultra-charismatic or manipulative humans are fundamentally dependent on in-person cues from the manipulator)


For me personally, the two sticking points are "gathers enough resources that it thinks it can succeed" and "given that it tries, how likely is it to be successful".

In particular, the "thinks it can succeed" seems to be doing a _lot_ of heavy lifting in how you think of this vs. how others do. If, as you seem to think, believing it can succeed = a high probability it _can_ succeed, then I don't think it can gather that many resources. In the "no technological magic" scenario we are imagining, that basically means, at a minimum, having as much power as the US government. I do not think that is possible without being discovered and fought, which will prevent it from getting that far.

If instead it just means "it _thinks_ it can, or maybe thinks it's the most it will ever get so it has to try anyway", then I think that the odds of it succeeding are, again, quite low. Destroying humans without magic tech is actually _really really hard_. Even if it gained access to all our nukes, it probably couldn't actually kill all humans (and I'm not fully convinced that a non-godlike, non-magical AI _could_ get the nukes), and the civilizational damage would likely hurt it more than it would hurt us.

I don't think that P(Doom) with a non-godlike AI is zero, but I think it's very, very low. That's not to say that the situation will be good; it could be very _bad_, in that I think a never-ending war with a relentless AI is probably not a _good_ thing for humanity, but I certainly don't think it's very likely to succeed in actually wiping us out. Being able to do that without magic tech _has_ to go through humans at some point, and there is a certain level of power/resources after which it _can't_ be hidden and becomes _very_ hard to protect. You are basically positing "China or the US with better tactical and strategic planners, and with their entire society willing to kill the rest of the world and then themselves", and I just don't buy it.


My assumptions:

- If massive recursive self-improvement is very easy, the only winning move is not to play. If a hostile ASI can quickly bootstrap itself to the point that it could derive a Grand Unified Theory of physics from 3 frames of video, You. Do. Not. Win. In this scenario, I think that alignment is a joke, and Eliezer is a hopeless optimist.

- GPUs (and the machines to make them) require an incredibly complex industrial chain involving millions of people across a hundred industries. Economics and division of labor are real things. A universal factory in a box is an incredibly major assumption, bordering on magic.

- Diamond-phase nanotech is robustly hopeless across a wide range of assumptions and intelligence levels. Synthetic biology is possible (especially for AlphaFold!) but difficult, and it doesn't give you a magical path to GPUs. Really, really nasty plagues, however, are far easier than either.

- Considered as biological nanotech, Homo sapiens is incredibly advanced. We're intelligent and self-replicating. We require at least part of a biosphere, but we are descended from billions of years of winners.

So if we're going to have a showdown of humans vs. AI, I expect the AI's opening move is a plague that kills over 99% of humanity and trashes all our industry and civilization. If that's too hard, then I suppose the AI could launch all the nukes, but that hurts the AI more.

Then I expect the human counter-move is "smash the evil demon computers that killed 99.9% of us." Human politics normally simplifies in the face of a deadly adversary that's sufficiently different. The AI's chief disadvantage in this scenario would be that it has longer and more vulnerable "supply lines" than Homo sapiens does. We need 3,000-10,000 surviving humans, basically a small town's population. It needs a significant fraction of a world-wide economy. If a war is messy enough to reduce everyone to the Stone Age, we ultimately win at a terrible cost.

Now, if the hostile AI is patient enough to completely automate every step of an industrial economy _and_ it can build the tools to defend that economy before it attacks, then I think it probably wins. I just think that any such scenario takes a lot of time and a lot of initial collaboration with humans.

I will agree that if we ever build something substantially smarter than we are, then in the medium-to-long run, we lose all negotiating leverage over the future.


I have a handful of thoughts (not terribly well organized, sorry) in response to these kinds of speculations. At least in part, I'm going to question the premises.

(1) Presuming an agent "can copy and instantiate itself" implies that an ASI can solve the alignment problem. Thus, the premise includes that it's a solvable problem and one that an ASI solves before humans do.

(2) Lots of very smart humans exist and have not accumulated nigh-unlimited and unaccountable resources just by being smart. Large parts of the financial and governmental infrastructure of the world are specifically dedicated to ensuring that unaccountable entities do not accumulate resources. It's not clear how merely superhuman intelligence solves that problem practically.

(3) Compute capacity is not unlimited, perfectly elastic, perfectly fungible, nor unobserved.

(4) Plans that depend on gullible or malicious humans aiding the AI in manipulating the physical world will operate at human speed and thus be vulnerable to human-speed counters.

(5) It will be hard, even for an ASI, to solve all these problems (and more I haven't thought of) **correctly on the first attempt**, and any detection likely decreases the odds of the next attempt succeeding.

So, overall, I think the premises as stated seem flawed. Either there needs to be a very fast takeoff such that "magic ASI" solves all these problems too fast for a counter - which isn't any more compelling than the nanotech/etc. magic story. Or there needs to be an extremely slow takeoff such that the conditions for your premises are established without detection until it's too late. Or there needs to be another premise for how we get to that point, e.g. terrorists/nihilists/authoritarians/TonyStark build proto-ASI systems then push gain-of-function until they lose control. Put differently, I accept doom given your premise. But the premise is that "ASI" already exists, and the crux for me is a set of credible, inescapable real-world paths that take us from "AGI" to "ASI" given the real-world, human interactions that would require.


The shutdown might happen due to hackers, instead of society banding together to shut it down. The hackers might not even have the goal of shutting it down; they may just want to break into the biggest system on the planet. Think Gates and Allen sitting at their high school computer.


Given your emphasis on "Stop other ASIs from existing", are you actively in favor of things that help increase the probability of this, like open sourcing LLMs?


Decreasing the chance of a particular AI taking over the world does not necessarily increase the chance of humans being fine. A world in which a bunch of ASIs fight each other probably doesn't end well for us.

Also, why would open sourcing LLMs increase this probability? It's unlikely individuals or even groups of individuals could create an ASI (at least on the current paradigm) as they most likely don't have the resources to do so. And the groups that do have these resources probably don't need the open sourcing anyway.


On step 4, I suspect it would be hard for an ASI to eliminate competitors, but that wouldn't save us. Competing ASIs could be more damaging as they fight over resources than just executing their own plans.

The weakness I see in Step 8 is that the human tools will start pursuing their own agendas, as seen in various revolutionary movements.

My greatest fear with ASI is not one ASI deciding to kill humans intentionally or accidentally, but that AI tools will empower human griefers to cause great damage, or that we'll wind up with our infrastructure dependent on AIs, and having a die off if/when the AIs bluescreen. (The latter case is the background for one of my novels.)


I think #5 is far and away the most important part here. So far as I can tell, most AI doomers don't believe in declining marginal returns (to intelligence, to resources, etc.) or if they have a clear argument on this, I've never heard it.


Can AI doomers explain why humans haven't wiped out mosquitos?


We're simply not smart enough to have done it before now. But we, as a species, are working on this, and I'm hopeful that we'll figure out something within the next generation or two.


Not a very good comparison, since humanity isn't a single entity with a specific goal. Personally, I try to wipe out the mosquitos near me all the time, and I'm usually successful (for some time).


Unless we're suggesting that *the* goal for the AGI is wiping out humanity (which isn't really the setup here), I think it's a pretty close comparison (i.e., in a scenario where the AGI is pursuing some set of goals and eradicating humans is among them).

We are unthinkably more intelligent than mosquitoes. We generally agree we want them gone. As you point out, individual humans often attempt to kill local mosquitos as much as they can. And we've seen concerted efforts by large and well-resourced institutions for over a century. These have killed tons of mosquitos and limited their populations, but we've come nowhere close to eradication. Perhaps newer technologies finally represent an existential threat to mosquitoes, but even that seems doubtful.

A world where AGI dwarfs us in terms of intelligence and capability the way we dwarf mosquitoes is hard to imagine (and would definitely have to involve tons of recursive self-improvement). But even in that world, unthinkable God-like superintelligence (from the mosquito-eye view) hasn't led to us destroying mosquitoes despite the fact we want them dead. So far as I can tell a relationship like human-mosquito is an absolute worst case and yet it doesn't much look like a situation of existential risk.


Not (necessarily) a doomer, but the answer boils down to "mosquitos aren't enough of a threat to our primary goals to make wiping them out (as opposed to mitigating them via DDT, bug zappers, etc.) a high priority".

The dodo, meanwhile, is a good illustration of the reverse side of the coin: humans never intended to wipe *them* out entirely, but it happened anyway.
