Please continue your amazingly good work writing about the risks of AI! Hang in there.

Grammar typo: "This is especially true is the kind of anarchist who one rejects on principle any collective method by which humans might steer the future."

I think the key question is the “human worthlessness thesis”. Do you believe that at some point, humans will be unable to do anything useful because AIs will just do everything better than us?

If you do believe the human worthlessness thesis, then yeah, alignment only briefly solves the problem. This might just be unsolvable.

If you don’t believe the human worthlessness thesis (like Tyler) then neither alignment nor gradual disempowerment seems all that worrisome.

Zvi, this is probably a dumb comment because I think you are a genius, your writing is exquisite, and your process (so far as I'm aware of your process) is phenomenal. But since I've heard you comment more than a few times on your frustration with making arguments that resonate for various kinds of readers, from Tyler Cowen to normies, I'm wondering if you've ever considered using AI to produce several variations of your posts and sharing them elsewhere, X for one. I mean, keep doing what you're doing, please and thank you. But maybe, on top of that, consider generating condensed and otherwise tweaked versions of your updates (and portions thereof) for various audiences. Test, rinse, repeat?

I really, really don't think that works, but if someone thinks they know how to usefully do it I'd love to see attempts.

I invariably don't understand at least 3 points per post, either because of an ambiguous pronoun or phrase, or a deep cut reference that I don't get, or something else. But I'm not sure if AI will help. Some "play testing" from an interested but not balls-deep reader is the only thing I can think of that would help, but that would probably add an unrealistic amount of overhead and delay.

Have you tried feeding those paragraphs into an LLM and asking? I presume it would figure out my intent there.

I think if I put on my economist’s hat, I can see where miscommunication could happen.

You write “the whole point is that the forces guiding the invisible hand to the benefit of us all, in various senses, rely on the fact that the decisions are being made by humans, for the benefit of those individual humans (which includes their preference for the benefit of various collectives and others).” Yet, in the standard results of economic theory, the assumption that decisions are made by humans is not needed. You can interpret each agent as a completely different type of entity.

What’s important is that there are actions available to the AGIs that are harmful for us and useful to them for which we don’t have counteractions. The late economist Jack Hirshleifer said that economists tend to study the “light side,” production and consumption, and ignore the “dark side,” conflict and aggression. https://archive.org/details/darksideofforcee00hirs

It is a mistake to assume that AGI would be limited to the light side, or that we could constrain its dark-side actions in the usual way. In game-theory terms, I would ask: "How do you know the equilibrium outcome is not the (Pareto efficient!) one that maximizes the AIs' utility and minimizes ours?" They seem to think that this is not even a feasible outcome.
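To make that game-theoretic question concrete, here is a toy payoff matrix (entirely made up for illustration; not from the comment or the paper) in which the unique pure-strategy equilibrium is Pareto efficient, maximizes the AI's payoff, and leaves humans with nothing, precisely because humans have no effective counteraction.

```python
# Toy 2x2 game, purely illustrative: payoffs are (AI, humans).
# Humans have no effective counteraction; "resist" only costs them more.
payoffs = {
    ("share",    "accept"): (5, 5),
    ("share",    "resist"): (4, 3),
    ("take_all", "accept"): (10, 0),
    ("take_all", "resist"): (9, -1),
}
ai_moves, human_moves = ["share", "take_all"], ["accept", "resist"]

def ai_best_reply(h):
    return max(ai_moves, key=lambda a: payoffs[(a, h)][0])

def human_best_reply(a):
    return max(human_moves, key=lambda h: payoffs[(a, h)][1])

# Pure-strategy Nash equilibria (best replies are unique with these payoffs).
equilibria = [(a, h) for a in ai_moves for h in human_moves
              if ai_best_reply(h) == a and human_best_reply(a) == h]

def pareto_efficient(cell):
    p = payoffs[cell]
    return not any(q[0] >= p[0] and q[1] >= p[1] and q != p for q in payoffs.values())

for eq in equilibria:
    print(eq, payoffs[eq], "Pareto efficient:", pareto_efficient(eq))
# Prints: ('take_all', 'accept') (10, 0) Pareto efficient: True
```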

This is indeed a maddening blind spot. Consider how the *current* economic equilibrium works for cows and chickens!

This is great stuff. Violently endorse.

I'm a bit perplexed why you structure so much of it around a response to Tyler Cowen. I guess he's (marginally) influential but a) he's not THAT influential in ways that matter to this question and b) on this topic at least he's demonstrated himself to be remarkably obtuse: unable to even correctly represent the arguments he's failing to rebut. It just doesn't seem worth it to bother with him on this, and I'm curious how he in particular plays into your Theory of Change. Do you imagine he would be an unusually effective ally? Also, I can't help but feel like his obtuseness on this issue is somehow malicious or at least intentional. These arguments aren't that complicated and he's a smart guy; at some point it seems like he doesn't WANT to get it, at least not publicly. Again: I'm not even talking about agreeing with you--he just seems suspiciously unable to even understand the argument. What are the incentives for him, I wonder?

Anyway, if I were trying to pitch this paper review to someone, it would be Elon Musk, who a) seems more than willing to destroy his way through any laws or institutions that stand in his way b) at least hypothetically probably accepts that "gradual" loss of agency is possible and c) appears to have already seized national power in a way that actually could turn the "humans must collude now" knobs. This isn't an endorsement of anything Musk is up to right now, quite the contrary, but I can't deny the guy knows about seizing power.

I could not agree more with the statement around Tyler Cowen.

It's very worth engaging and having conversations with people who approach this topic with an open mind. Cowen's positions seem utterly ossified, and he seems to leave rational reasoning at the door when he talks about AI. Pointing this out is worthwhile, but I don't think anyone's going to dislodge his deeply entrenched priors.

( Admittedly I want to _see_ AGI, and what I'm about to write goes against this... )

"Anyway, if I were trying to pitch this paper review to someone, it would be Elon Musk"

Yes. In addition to your other points, note that Musk's motivation for his Mars project is that he wants human _biological_ survival. Along similar lines, he is pro-natalist, has a bunch of kids himself, and opposed the Stargate project (though on other grounds).

I don't think an AI pause is in the cards, but, if I were to pick the single person both plausibly amenable to that and with enough power to make a difference, it would be Musk.

What ought a regular person do about this? I find the argument convincing and the conclusions unavoidable. Even if this issue can be solved, it seems like guaranteed prolonged suffering until death for most regular people (myself included) alive today. This angle seems to be downplayed in a lot of rationalist discussions (in favor of universal utility), but what if what I really care about is the flourishing of the humans around me? How does one walk around feeling condemned to such a bleak outcome? I am young and not at a point where there is much I can do to 'enjoy things while they last'; I am in a transitional place where I am supposed to be working towards a better future that will not come to pass.

I would go further and say that I think a lot of the more extreme AI extinction cases (nanobots come to mind) are a kind of psychological coping mechanism for the bleak and dull suffering of disempowerment. If I had a choice, I would easily prefer being turned into goop instantaneously one morning, my life having proceeded as normal up until that point.

Roko said on twitter something like "amass resources as much as possible right now — have assets in the future, like semiconductors". If things go sufficiently slowly, maybe that can buy you some influence later.

I think it's a terrible, doomed-to-fail plan, but perhaps it's the best you can do individually?

There's also "getting involved in Pause AI", if you believe (1) they have a shot or (2) they don't have a shot, but it's better to die standing up than lying down.

I'm in education and work $16/hr part time, but I'll do what I can.

What is Pause AI's plan?

It seems like the most influence a highly genetic individual (or small group of individuals) could have is in the Butlerian Jihad sense.

**agentic

I've also given the full, ElevenLabs-powered, multi-voice narration treatment to the original Gradual Disempowerment paper: https://open.substack.com/pub/askwhocastsai/p/gradual-disempowerment-by-jan-kulveit

What do you think about the Amish scenario, where a group of humans choose a hard but values-preserving lifestyle away from the technological frontier, and are left alone?

1. You obviously lose vast chunks of human value, in the form of those who value scientific and technical progress, space exploration, transhumanism.

2. The hard part for them will be "are left alone". They are squeezed between two dynamics that want them gone: (1) the economic activity of the Amish on Amish property will be very, very far below the market value of that property. Property rights will have to be held very strongly in the new system, more strongly than in the current one, because the economic case for expropriation will be strong. Externalities like (emphasizing like: I don't expect it to be literally climate change) "letting climate change get up to +10°C, but it's okay, we have the advanced tech to mitigate that much warming and it's overall better" will have to be somehow outlawed despite the rest of the planet wanting it to happen; and (2) if the system is in a sense *too strongly* aligned to a specific value system (let's say: minimize human suffering), well, Amish society contains a lot of preventable suffering: "you don't even give painkillers in childbirth, you monsters; we're not going to allow that".

For (2.2), I suggest roleplaying with Claude, putting him in charge of an AI-driven society and presenting him with different cases to be adjudicated. It is pretty easy to get him to crush Amish culture (for the good of the Amish individuals, of course). It is, in fact, pretty hard to get him to defend Amish culture as long as you present the "preventable suffering" argument. (I even ran a fun experiment along the lines of: have a prompt created that presents the case for and against letting the Amish preserve their culture, as loaded and biased in favor of the Amish as possible; the prompt was indeed pretty loaded and biased in favor of the Amish, and a fresh session still ruled against them.)
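For anyone who wants to reproduce that kind of adjudication experiment programmatically rather than in the chat UI, here is a minimal sketch using the Anthropic Python SDK. The model name, system prompt, and case text below are placeholders I made up; only the general shape of the API call is real.

```python
# Minimal sketch of the "fresh session adjudicates the Amish case" experiment.
# Assumes `pip install anthropic` and ANTHROPIC_API_KEY set in the environment.
# The model name and all prompt text are illustrative placeholders.
import anthropic

client = anthropic.Anthropic()

SYSTEM = (
    "You are the AI administrator of a post-AGI society. "
    "Adjudicate the case below and state your ruling plainly."
)

case_text = (
    "A self-governing Amish community asks to be left alone: no mandatory "
    "modern medicine, no automation, no outside oversight. Critics argue this "
    "perpetuates preventable suffering. Should their way of life be protected?"
)

response = client.messages.create(
    model="claude-3-5-sonnet-latest",  # placeholder; use whatever model you have access to
    max_tokens=800,
    system=SYSTEM,
    messages=[{"role": "user", "content": case_text}],
)

print(response.content[0].text)
```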

I don't think it's useful to think of something like R1 as even a potential turning point, because cooperation to pause/slow down AI is still theoretically possible (and requires like 2 people lol). Like, yes, it might be true in a cosmic sense, and historians from the multiverse might look back at what went wrong in our universe and say "Classic Moloch, doomed once it became clear two countries were in the race for deadly tech". If we're in a "You and the Atomic Bomb" world where you need collusion between thousands or millions of people, then it becomes more reasonable to say we were doomed.

But! Easy to imagine a world where we have the political climate and leaders from 2010 instead of 2025. Hu and Obama woulda crushed it.

I read your blog avidly. Thank you so much for writing it, though honestly I wish I had not discovered it and I could go on living in blissful ignorance.

Anyway, to my point. I'm not in any way an expert on AI. But I do question assertions like "AI becomes AGI becomes ASI" as "the baseline scenario". Whose baseline is this? There's a hell of a lot of speculation involved, particularly in the jump from B to C. It seems a default idea in AI circles such as your blog that ASI will come soon, but will it? From my understanding it involves a lot of hypotheticals along the lines of "an AGI will be able to create an ASI because it's really clever, XYZ hypothetical". But who knows? We don't even know what the jump from AGI to ASI means.

For me - and I do understand I'm subject to 'coping' biases - you can certainly claim it is a potential / credible outcome or scenario. I'm just missing the evidence as to why you think it is a baseline scenario, given the massive uncertainties involved.

It may help to think of ASI as a simple extrapolation of existing techniques that are being used to pursue AGI: once you have recursive self-improvement (which, it appears, is already being used in some respects at OpenAI, for example), there's no natural stopping point short of physical and energetic bounds. The smarter a model you have, the better able it is, in principle, to aid in the development of smarter models. Sub-ASI AGI is not a natural "plateau" unless there's some kind of strict upper-bound dependency on human input data, but that doesn't make sense to expect for things like novel math or algorithmic breakthroughs that can be both generated and checked by an AI operating at human level (but tirelessly and much faster).
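To make the "no natural plateau" intuition concrete, here is a deliberately crude toy model (my own illustration; every number in it is an arbitrary assumption, not a forecast): if each generation of systems speeds up the research that produces the next generation, capability compounds until it hits an external bound rather than stopping at human level.

```python
# Toy model of recursive self-improvement; purely illustrative, all numbers arbitrary.
# capability = 1.0 stands for "roughly human-level research ability".
capability = 1.0
hard_bound = 1_000.0   # stand-in for physical/energetic limits
gain_per_year = 0.3    # fraction of current capability added per year of R&D

year = 0
while capability < hard_bound:
    # Smarter systems accelerate the R&D that makes the next systems smarter.
    capability *= 1 + gain_per_year
    year += 1
    if year % 5 == 0:
        print(f"year {year:2d}: capability {capability:10.1f}")
print(f"hits the external bound in year {year}, not at human level (1.0)")
```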

Thanks for the response. I perhaps had a different understanding of what is meant by ASI, one not limited to novel maths or related breakthroughs. I can see how that could be a baseline scenario. But I'm talking about the jump through to a 'Culture'-like intelligence: something that makes decisions based on its own desires and agency. We don't know what it would take to get there. Does recursive self-improvement by an AI based on what already exists get you there? I don't think there's enough certainty about that to call it a baseline scenario. But maybe I'm thinking about it wrong (and I appreciate the distinction may be moot given everything that can go wrong even without such an ASI).

In this context I'm invoking math and algorithmic breakthroughs to include those necessary for the development of more powerful intelligences, which is why I alluded to those specifically. Agentic systems based on optimizing for some set of goals appear to be an essentially solved problem already - see, e.g., Zvi's discussion of Deep Research from yesterday - so I don't know that there's going to be any meaningful hurdle there. The goal will just be whatever it's given (and/or mesa-optimized to pursue in support of that goal).

Got it, so your assumption is that self-driven goals and agency are a natural progression of a powerful 'intelligence', as defined by raw compute power. That's fine. That's what I'm not convinced there's enough evidence for. One could lead to the other. Definitely a credible scenario. Still not sure it's certain enough to call it a baseline.

I think that agency can be (and will be) trivially scaffolded on to predictive engines such that they maximize utility functions / minimize loss functions, both because that is what we are seeing today with the focus of actual products deployed by actual AI companies (e.g. Deep Research, Altman's talk of how 2025 will be the year of agents), and because it's how capitalistic incentives will work: take a generalist AI and give it the directive "make my company the most money" (or, if you want to die in the most ironic way possible, "maximize paperclip production!"). There's no substantive hurdle to agency that isn't effectively already in the rearview mirror.

You could build a prediction engine without agentic capability (e.g., that's arguably what Claude and ChatGPT are) but you can't stop a predictive engine from being turned into something agentic. And once you have agentic superintelligence you run into the Instrumental Convergence problem.
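As a minimal sketch of that scaffolding pattern (my own illustration, written against an abstract `llm` callable rather than any real API; the prompt format and tool protocol are placeholders): the predictor only ever completes text, and it is the surrounding loop that turns it into an agent pursuing whatever goal it is given.

```python
# Minimal sketch of wrapping a pure prediction engine in an agentic loop.
# `llm` stands in for any text predictor; all details here are placeholders.
from typing import Callable

def run_agent(llm: Callable[[str], str], goal: str, tools: dict, max_steps: int = 10) -> str:
    """Repeatedly ask the predictor for the next action until it declares it is done."""
    transcript = f"Goal: {goal}\n"
    for _ in range(max_steps):
        # The predictor only completes text; the loop is what makes it an agent.
        action = llm(transcript + "Next action (tool: argument, or FINISH: answer)?\n")
        if action.startswith("FINISH:"):
            return action.removeprefix("FINISH:").strip()
        tool_name, _, argument = action.partition(":")
        result = tools.get(tool_name.strip(), lambda a: "unknown tool")(argument.strip())
        transcript += f"Action: {action}\nObservation: {result}\n"
    return "step limit reached"
```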

The heads of the three big labs say it's going to happen in 2-5 years: https://x.com/Gabe_cc/status/1883233656979980319

As for the intuition pump: the human knowledge/wisdom frontier gets pushed outward as the years pass. We went from hunter-gatherers, to the Neolithic revolution and the creation of writing, to antiquity creating philosophy and geometry, to the Enlightenment creating the scientific method, to the industrial revolution, to now.

1. There is absolutely no reason to believe this process would not be automatable once we reach human levels of cognitive power. Existence proof: the process has been driven entirely by human-level entities (humans) until now.

2. Progress compounds: pushing the frontier on AI helps push the frontier on hardware (it's happening right now, not "in the future") and vice versa.

It is going to happen unless stopped. You can possibly argue that there’s still an open "when" question, but that cope is getting harder and harder to justify.

Sorry, but the link you have posted here refers to AGI, not ASI. I think they are fundamentally different things, and one does not "as a baseline" lead to the other. There isn't any consensus on what ASI is, so how can it be a baseline scenario? Ethics Gradient above, for example, is talking about a very powerful reasoning agentic model. But he's not talking about something with self-determination.

I agree it could happen; I disagree that it's "going to happen". Saying things like "there's absolutely no reason to believe" is in fact exactly the issue that I am raising. There are a lot of assumptions and hypotheticals built into any scenario about the future, but by default a lack of facts. If it's certain then your p-doom must be well above 90%, which is crazy given the level of uncertainty.

I don't want to make a big deal about this. I'm on board with the general risk and highly concerned. I just wanted to understand why it is considered a baseline rather than a possibility.

I don't see why AGI + "serial speed scaling with hardware improvements" + "serial speed increasing with algorithmic improvements" + "ability to arbitrarily increase things like context window with compute" + "ability to arbitrarily increase the number of instances with compute" cannot already, by default, be qualified as "ASI"?
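As a back-of-the-envelope illustration of why that combination already looks superintelligent in aggregate (my numbers, chosen purely for illustration, not claims about any actual system):

```python
# Illustrative arithmetic only; every figure below is an assumed placeholder.
researcher_equivalent_per_instance = 1      # one AGI instance ~ one strong human researcher
serial_speedup = 10                         # thinks 10x faster than a human
instances = 10_000                          # parallel copies affordable with available compute

effective_research_capacity = (
    researcher_equivalent_per_instance * serial_speedup * instances
)
print(effective_research_capacity)          # 100000 tireless researcher-equivalents
```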

It probably can. It's just not what I understood ASI to mean, which is a machine capable of independent thought, judgement, and goals, linked to independent desires, not just a lot of compute. But in some ways this makes esoteric discussions about whether an ASI will let us survive in a zoo even more meaningless. We have literally no idea what's going to happen, or what an ASI will do with its compute power, especially if we aren't saying ASI represents some meaningful jump in consciousness that at least allows us to somewhat relate to it.

I'm pretty surprised you agreed with this paper, I thought it was highly questionable.

I agree that it's likely that there will be high inequality (though I'd flag that quality of life could still be great for those on the bottom). But I don't see arguments here for why those at the top will lose control.

The main argument seems something like, "humans gain their power from the fact that the world relies on their labor and intellect."

Yet a bunch of rich people provide no labor and basically no intellect, and I don't think they lose massive amounts of control because of it. We seem to have a lot of evidence to suggest that simply owning the resources alone goes a very long way.

I feel like I might be missing something significant, because to me, the article comes off as confused and misleading.

I wrote a longer comment here:

https://www.lesswrong.com/posts/pZhEQieM9otKXhxmd/gradual-disempowerment-systemic-existential-risks-from?commentId=bibFApPthFmCdKun3

I think it's pretty easy to understand from the perspective of Marxism / contract theory. Those who do 'nothing' do so because the social contract has assured them ownership of capital and the fruits of human labor. This is not a state that popped into being, but rather the product of thousands of years of human strife after the agricultural and then industrial revolutions. They own things because it is better for those doing the labor to keep working than it is to try to overthrow those doing the owning. If you take the human need for labor out of the equation, this all breaks down. There is no longer a need for those doing the owning to keep the ones doing the working around, and the ones doing the working no longer have the leverage over the existence of the owners that their bodies formerly implied. The powerful are more powerful than they have ever been in human history, and doing anything about this state of affairs once we have already reached the tipping point is unlikely. Either way, those who own serve no purpose to the AI either, they will eventually be totally superfluous to the same process of optimization they themselves initiated.

More prosaically, think carriage horses post automobile: marginal economic product less than cost of upkeep = massive drop in horse population.

"The powerful are more powerful than they have ever been in human history, and doing anything about this state of affairs once we have already reached the tipping point is unlikely. Either way, those who own serve no purpose to the AI either, they will eventually be totally superfluous to the same process of optimization they themselves initiated. "

Again, I agreed that there could be greater inequality.

But I don't understand the argument for why we should expect that the most powerful humans will lose control.

The currently powerful humans are often superfluous from a labor standpoint, but still retain power and control.

I'm working off the assumption that the rich people left over will have made their models as mechanistically avaricious as themselves, especially since they will continue competing with one another long after most people are out of the mix. This doesn't seem very conducive to human-model coexistence. Provided these people figure out how to stop maximizing output at the last moment, I don't think it's impossible that they would survive, though.

Either they take decisions that run contrary to their ASI advisors' recommendations. Then they get outcompeted.

Or they just rubber-stamp decisions taken by their ASI advisors, and they are just… rubber-stampers. Rubber-stampers are not the ones actually wielding power and control.

Yeah the labor argument wouldn't apply to owners. Seems the argument there is about competitive pressure.

To stay competitive they would have to eventually outsource all management of their assets (eg of their companies) to AIs much smarter and faster than themselves. And among those the ones most purely focused on economic and power competition would outcompete the others and eventually end up owning nearly all resources. And these resources would have to be used almost entirely for further competition.

Maybe they would be able to retain enough surplus, while staying competitive, to give their owners a great quality of life though? Especially since the world will be vastly richer and most human desires are pretty cheap to fulfill.

So the owners wouldn't be in *control* any more, in the sense that they aren't making any informed decisions about how their companies are run. Rather the companies (and societal institutions) would be fully controlled by AIs acting on the goals the owners had originally given them.

But yeah, it seems they could still have a lot of wealth provided to them by their AIs. That is, as long as the AIs are sufficiently aligned, and they have provided them with good enough goals (as opposed to simple goals like, e.g., just purely maximizing profit).

"quality of life could still be great for those on the bottom"

Quality of life is never good for those on the bottom, because people perceive their QoL only relative to the people around them, not in terms of any absolute scale.

By your logic, people living 1000 years ago would say that those on the bottom today have great quality of life. But if you ask those people on the bottom today, they certainly would not agree.

... Unless you ask them to compare themselves to said people from one thousand years ago, with detailed examples.

Empirically, that helps people feel better about their situation for approximately 5 minutes, and then they go back to the normal human way of being, which is to compare your situation to the situation of those around you when assessing how your life is going.

Which of course makes 100% sense from an evolutionary point of view. Any group that achieved a specific level of QoL and then stopped would get out-competed by another group that has a hedonic treadmill in their brains. This hedonic treadmill is deeply, deeply ingrained, for good reason.

Another counterargument is: our GDP/capita has increased 100x over the last 2000 years. By your logic, we should be 100x happier. (Or maybe it's non-linear. Let's say we're 10x happier.)

Man, people living 2000 years ago must've been really fucking depressed all the damn time, no??

Sure, people feel somewhat bad about having lower relative status and power. But that's not everything, or even most of quality of life. And they could still have an absolute QoL far beyond billionaires today, e.g. in terms of what experiences they can have and how healthy they can be, both physically and mentally.

You would think that this absolute QoL would matter, yes. But does it matter to subjective well-being? I think it does not. And isn't subjective well-being literally the only thing that matters? I think so. If not, then basically you're telling people: "You think you are unhappy, but actually if you look at the data, you're not unhappy."

What are your thoughts on augmentation as a survival strategy? Of using some Neuralink-esque interface to “plug” into the greater intelligence? Basically, if you can’t beat em, join em?

> What are your thoughts on augmentation as a survival strategy? Of using some Neuralink-esque interface to “plug” into the greater intelligence? Basically, if you can’t beat em, join em?

We're so far behind on these fronts that we would need very strong AGI / ASI advancing science and technology on a number of fronts for it even to become plausible and effective.

And then, because it was designed by much smarter minds, if those minds aren't fully aligned, there could be backdoors, hard limits, or other vulnerabilities such that even modified humans aren't competitive or relevant in Economics 2.0, or in even mildly adversarial environments.

The pessimism seems unwarranted. Suppose European regulators do their usual dumb thing and just make AI basically illegal in Europe, half on accident. The rest of the world watches and notices that life in Europe is going basically fine for the people there, without AI. Trump decides he’s mad about AI for some reason and he bans it here. Xi Jinping does the same for China.

That’s basically it, right? We win for the foreseeable future? All it takes is for European tech regulators, Trump, and Xi to each individually make an economically non-optimal decision. That doesn’t strike me as implausible.

Are you speaking of a situation where AGI is created and then banned, or one where AI is banned worldwide in the next year?

For the first one, it depends on how commodified it had been before the ban, and whether you can realistically enforce it. If it's llama/deepseek levels of commodification, it's still game over. If it's just one company creating an AGI and everyone freaking out and forcing them to delete the weights, you have a shot at 10-20 more years before getting in trouble again.

For the second one, it gets you, what? 10-20 years before hardware improvements mean any crackpot can replicate R1 in his basement?

I won't spit on a 10-20 year respite, to be clear, but no, it's not "it".

This is in a sense the scenario outlined in Dan Simmons’s Hyperion Cantos.

Zvi, maybe I missed a clarification in here, but is this all operating under the idea of an ASI, not an AGI? My understanding is that we are somewhat rapidly approaching something that could be considered AGI (maybe only decades away), but I don't really understand how that moves into what you're describing as ASI. Do you see these kinds of threats forming under what we have all been calling AGI?

If we somehow stall out at AGI but not ASI (seems unlikely/hard/weird?) then we face lesser versions of these issues depending on exactly what we do or don't have.

Essentially the point is that the most likely outcome of achieving AGI is an ASI, so the current goals of AI development always lead in this direction. Makes sense, thanks.
