
I'm disappointed that your substack turned into AI Doom Daily. Your policy analysis was much more interesting and relevant. It's hard to see the doom movement as anything other than neo-Luddites who are trying to take a cue from Thunberg and turn the alarm up to 11 to gain attention. Musk joining in when his company is one of the top AI developers and is literally building humanoid bots is icing on the cake. Please stop before the govt gets involved and we can't have nice things in the future.


As for point 3, "Whatever we do to align that AGI either works, or it doesn’t", I think there actually could be a way out (even though a really unlikely one). What if the "partially aligned" system doesn't really want to kill humans (or gain resources, or whatever, so that instrumental convergence doesn't apply), but could still randomly decide to do so with some low probability? I think an LLM-based AGI could be like that, if it is simulating a "helpful assistant" or something but then randomly turns into a Waluigi. Such a system would be unaligned, yet we could still survive interacting with it for some amount of time before an actually aligned system comes along. Sure, that's rolling the dice on human survival, but if someone's p(doom) would otherwise be > 90%, it can still count as a viable option.
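To put rough numbers on that gamble, here is a minimal sketch (the per-interaction defection probability eps, the interaction counts, and the function name are illustrative assumptions, not estimates from the post or the comment):

```python
# Toy model of the "rolling the dice" framing above: a system that is
# helpful by default but, independently on each interaction, defects
# (turns Waluigi) with some small probability eps. All numbers below are
# made-up illustrations, not real estimates.

def survival_probability(eps: float, n_interactions: int) -> float:
    """P(no defection across n_interactions independent interactions)."""
    return (1 - eps) ** n_interactions

for eps in (1e-3, 1e-4, 1e-5):
    for n in (100, 1_000, 10_000):
        p = survival_probability(eps, n)
        print(f"eps={eps:.0e}, n={n:>6}: P(survive) = {p:.3f}")

# Comparison point from the comment: an alternative with p(doom) > 90%
# means P(survive) < 0.1, so even fairly bad per-interaction odds can
# look like the better bet, at least for a while.
```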


My understanding was that SpaceX has an explicit policy to iterate test flights as quickly and cheaply as possible, and that they would have been shocked if this flight had actually gone perfectly. In which case it's simply false that Elon demonstrated an inability to get it right the first time.


Sometimes your arguments about tactics and counterfactuals are hard for me to follow. You seem to have a lot of ready analysis of how people react nonlinearly to government regulations and card-game situations, analysis that I don't have. If you wrote it out at Matt Levine levels of detail, it would probably fill a small textbook, about the size of Fermi's Thermodynamics. I'd enjoy reading that book, and I'd pay 10x a normal cover price if you wrote it.

May 1, 2023 (edited)

I appreciate the AI coverage, and it keeps me coming back now that COVID has been reduced to one of life's many problems.

I still think that p > 0.95 of sudden doom requires a bunch of assumptions that border on magic. Specifically:

1. Yudkowsky routinely argues that AIs will be able to quickly self-improve to essentially godlike intelligence, either via algorithmic improvements or custom hardware. But even if it's possible to build "much smarter than human" intelligence, that doesn't necessarily make it easy for that intelligence to quickly build another generation that's effectively omnipotent and omniscient.

2. Drexlerian nanotech requires robustly building structures with 10^15 atoms, IIRC. The last time I followed the literature, I think we could place a single atom with a 20% chance of success, **in simulation.** (This was a while ago.) And while the idea is clever, I have seen very smart biochemists point out that Drexlerian nanotech fundamentally misunderstands the reasons why biology works well. See "Soft Machines" for a very old discussion of this topic. I strongly suspect that even super-human intelligences can't overcome these obstacles easily (see the toy yield calculation after this list).

3. But if you write off Drexlerian nanotech, you need some other path for your AI to make GPUs. Either it devotes time and effort to synthetic biology (which isn't as tidy or convenient as digital GPUs), or it plays nicely with the human economy, or it recreates a late-stage industrial economy in a box. The last option, once again, comes very close to magic.
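To get a feel for how punishing the numbers in point 2 are, here is a toy yield calculation; it assumes each atomic placement must succeed independently and ignores error correction and self-repair, which is exactly the machinery biology relies on:

```python
# Toy yield model: probability that a structure of N atoms is built with
# zero placement errors, if each placement independently succeeds with
# probability p. The specific values of p are illustrative; the point is
# the scaling, not the numbers.
import math

N = 10**15  # atom count for a Drexler-scale structure (per point 2)

for p in (0.2, 0.999999, 1 - 1e-12):
    # work in log space to avoid underflow: P = p**N = exp(N * ln p)
    log_total = N * math.log(p)
    print(f"per-placement success {p}: P(defect-free) = exp({log_total:.3e})")

# Even at p = 1 - 1e-12 the exponent is about -1000, i.e. the yield is
# effectively zero, so robust assembly has to tolerate and repair errors
# rather than get every placement right.
```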

So my "doom" scenarios _really_ don't look like "AI becomes a god, then it bootstraps diamond nanotech via an email to a DNA synthesis company, and everyone dies simultaneously a week later."

If I had to posit doom scenarios, they'd look more like:

1. AI participates in the human economy, and it outcompetes us in the medium term.

2. AI is super helpful, it helps us bootstrap a robotic economy, all while arguing for UBI. Then once it no longer needs us, too bad.

3. Some AIs value humans, but due to life-or-death struggles with other AIs, they can't afford to keep us around. Sorry.

But in each of these cases, we'd be looking at an AI that has initial incentives to seem friendly and to cooperate with humans, and that state of affairs would have to persist long enough for the economy to be totally rebuilt to the point where it no longer really needs humans.

But this also means that we _might_ get a few shots at alignment. This still doesn't make it smart or safe to build ASI.


I don't like how binary the conversation is here about killer AI and paper-clip factories. Is it not possible to simulate possible dangers and study what went wrong? It's not like anything exists on earth that does what is imagined by the doomsayers, and we do know how to run simulations, yes? And using ChatGPT as the argument (i.e., that GPT is the problem) is just a circular argument leading nowhere. Just build the safety params in, why don't cha?


As regards point 2... it is not so clear to me that a superhuman and misaligned AI spells certain doom.

We do after all already have superhuman (in some dimensions) AI, which is not particularly aligned, and so far it's not terribly dangerous.

The point being that this AI also faces the problem of 'getting it right the first time' - it needs to be superhuman in deception, in planning, in self-improvement, in taking over other systems and/or convincing people to do its will, etcetera (all while starting from a monitored box that can be shut down very easily). Progress so far has been a kind of one-step-at-a-time process of AI becoming superhuman at first this task, then another, then another. In order for getting it wrong to kill us all with certainty, the AI has to not just be misaligned and superhuman, it has to have (or be able to rapidly attain) a rather well-developed suite of superhuman abilities of different kinds.

In other words, it is entirely possible that we 'get it wrong' by creating agentic and misaligned behavior in some system that still has major weaknesses and blind spots, and that the resulting disaster does not kill us all, as we are able to shut the thing down. While, hopefully, drawing some useful lessons from the experience.

This does, however, depend on things we know little about. While I may disagree with Eliezer about the chance of doom, I think the policy recommendations of someone like me, who might estimate it at 20%, are not so different from his... it's not as if a 1 in 5 chance of extinction is an acceptable risk.


All said and done, not even attempting to one-shot test rocket launches is just the correct choice - to the point that it should count as extra credit toward his being able to one-shot AGI. One-shotting is not free - it's actually A LOT more expensive, on the order of 10x or more. Choosing the correct strategy in each case is a very low bar of competence.
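To spell that out with a minimal cost sketch (only the ~10x premium comes from the paragraph above; the unit cost and the expected number of iterations are placeholders I made up):

```python
# Minimal comparison of the two strategies: iterate cheaply and expect
# failures, vs. pay up front to get it right the first time. Only the
# ~10x one-shot premium is taken from the comment; the rest is placeholder.

cost_per_test_flight = 1.0        # arbitrary cost unit
expected_flights_to_success = 4   # placeholder: a handful of failed tries
one_shot_premium = 10             # "on the order of 10x or more"

iterate_cost = cost_per_test_flight * expected_flights_to_success
one_shot_cost = cost_per_test_flight * one_shot_premium

print(f"iterate:  ~{iterate_cost:.0f} units")
print(f"one-shot: ~{one_shot_cost:.0f} units")
# Whenever the expected number of iterations is below the premium,
# iterating is simply the cheaper, correct choice for rockets; it says
# nothing about whether one could one-shot if that were the only option.
```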

EY made a good metaphor, but trying to treat it as an actual argument rather than a good zinger is, IMO, counterproductive.


One thing to remember is that the SLS did succeed on the first try. More than a decade late and tens of billions over budget, but it worked on the first flight. The first nuclear test also worked on the first try. I think Yudkowsky would agree that it could be possible to do AGI right, but we would have to not have 100 companies working as fast as they can with little concern for the consequences.
