Dan Hendrycks, Eric Schmidt and Alexandr Wang released an extensive paper titled Superintelligence Strategy. There is also an op-ed in Time that summarizes.
Why do you want geolocation on the chip? That doesn’t make sense to me. Geolocation isn’t secure in the sense that you can use it to track where an untrusted chip is, if that’s what the author was thinking.
There are quite probably other reasons why geolocation won’t work, but I assume what is envisaged here is a distance bounding protocol, where the chip won’t work unless it gets a response to its challenge within a small number of nanoseconds, and that number is small enough that the speed of light ensures that the other end can’t be very far away.
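For concreteness, here is a toy sketch of what such a challenge-response latency check might look like. It is purely illustrative: the constants, thresholds, and callback names are invented, real distance bounding needs dedicated hardware timing rather than software clocks, and none of this is taken from the paper.

```python
# Toy distance-bounding check. Light travels roughly 0.3 m per nanosecond,
# so a measured round trip of t ns bounds the responder's distance to at
# most ~0.15 * t metres (tighter once processing delays are subtracted).

import os
import time

SPEED_OF_LIGHT_M_PER_NS = 0.3
MAX_ROUND_TRIP_NS = 200  # hypothetical budget: responder at most ~30 m away


def max_distance_m(round_trip_ns: float) -> float:
    """Upper bound on one-way distance implied by a round-trip time."""
    return SPEED_OF_LIGHT_M_PER_NS * round_trip_ns / 2


def distance_bounded_challenge(send_challenge, receive_response) -> bool:
    """Accept only if a fresh challenge is answered fast enough that the
    other end cannot be far away. Cryptographic binding of the response
    to the nonce is elided."""
    nonce = os.urandom(16)                 # fresh nonce prevents replay
    start = time.perf_counter_ns()
    send_challenge(nonce)
    response = receive_response()
    elapsed_ns = time.perf_counter_ns() - start
    if response is None or elapsed_ns > MAX_ROUND_TRIP_NS:
        return False                       # too slow, or no answer at all
    print(f"responder is within ~{max_distance_m(elapsed_ns):.1f} m")
    return True
```

With a 200 ns round-trip budget, the speed of light alone caps the responder's distance at about 30 m, which is what makes the check a location proof rather than an honour system.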
Yeah, it does seem like there's more than one type of geolocation that could be used here, e.g.:
1) The chip must find/prove its current location (somehow, not via GPS as it will be indoors)
2) The chip must be within some (small) nanosecond-scale ping of some known server/location
Both of these options would require a constant "phone home" function for the chip to work normally (e.g. for training and/or inference). Which makes sense, but of course will be easily "hackable" by an ASI (or likely even AGI or human meat brains). So, a temporary solution, at best...
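As a rough illustration of that "phone home" gating (the intervals, token handling, and function names below are hypothetical, not from the paper): the chip's firmware could periodically require a signed attestation from a licensing server and stop accepting heavy workloads when it can no longer get one.

```python
# Toy sketch of a phone-home heartbeat gating accelerator work.

import time

HEARTBEAT_INTERVAL_S = 60    # how often to ask the licensing server
GRACE_PERIOD_S = 300         # how long work continues without a fresh token


def run_with_heartbeat(request_token, verify_signature, do_work):
    """Interleave work with attestation checks; halt if attestation lapses."""
    last_valid = time.monotonic()
    while True:
        token = request_token()                      # e.g. a call home over HTTPS
        if token is not None and verify_signature(token):
            last_valid = time.monotonic()            # attestation refreshed
        if time.monotonic() - last_valid > GRACE_PERIOD_S:
            raise RuntimeError("no valid attestation; halting accelerator work")
        do_work()                                    # one slice of training/inference
        time.sleep(HEARTBEAT_INTERVAL_S)
```

Anything along these lines lives or dies on how tamper-resistant the on-chip enforcement is, which is exactly the "easily hackable" worry above.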
SP: "attempts to developer superintelligence" -> "attempts to develop superintelligence"
Typo under loss of control.
> rouge AI agent
Red team alert.
Whole situation looks pretty grim to me. Any newly created technology quickly becomes so integrated into society that it’s impossible to stop using it or scale it back in the slightest.
Farming? Sure we know how to stop, but we don’t have “meaningful” control over stopping, since the earth can’t support this many humans without farming.
Electricity? Yeah we know how to turn it off, but we don’t have meaningful control over it either, as it would lead to mass death and it’s impossible to coordinate.
Internet? Turning it off would lead to a 90% drawdown in the stock market, collapse of the dollar, societal devastation, etc.
AI? I think we know how this is going to go. Full reliance on it, and it won't even be realistic not to fully embrace the latest superintelligence to the maximum extent, given competitive pressure and how many lives, dollars, and economies will be reliant on it.
In short, seems like Moloch is going to get his biggest sacrifice yet… all of humanity!
https://slatestarcodex.com/2014/07/30/meditations-on-moloch/
Three broad comments:
First, the nonproliferation and competitiveness concerns predate and are broader than AI (or ASI) concerns. They apply to any enabling technology, from iron working to CAD/CAM software. Now, a large quantitative change _is_ a qualitative change, and it may well be that AI advances produce a large speedup, or a large extension of the spectrum of possible designs. Still, the worry that an economic or military competitor will get an enabling technology, or an improvement to one, and use it to improve their products or their weapons has been a concern for centuries.
We don't _know_ how large an advantage in design time or design effectiveness AI (or ASI) will provide, in general. We've seen some striking examples (protein design sped up by orders of magnitude; some coding tasks, but not yet others, sped up by 10X).
Second, I _do_ see the loss of control concern as primarily specific to AI. There is a partial exception to this in that we have already ceded control to automated systems in situations like parts of process control and all of the detailed tiny decisions made by all of our existing computerized systems. Here the nomenclature gets fuzzy.
Roughly speaking, let me call an AI "AGI" if it is roughly equivalent to a human in capabilities. Delegating a decision to it is very much akin to delegating it to a human subordinate. There are principal/agent problems, but, at that level, they are ones we've had for as long as we've been human.
I lied. It isn't really plausible to have an AI equivalent to human capabilities, since existing LLMs already have a superhuman _breadth_ of knowledge. So if we get an "AGI" with more-or-less the capabilities of, say, a 115 IQ human in one area, their overall capabilities are at least weakly superhuman in the sense of breadth. Another sense in which I think we get weak ASI by default is if we populate a competent organization (e.g. SpaceX) with AGIs in each of the positions humans are in today. SpaceX, as a whole, is capable of doing feats that no single individual is capable of, and presumably the same would be true if it were built of AGIs.
Now, neither of these scenarios results in anything incomprehensible to humans. We can understand any single action of the broad AGI with 115 IQ, and we can understand what a SpaceX-o-matic is trying to do, and even how it is trying to do it, for any given subtask. Given success in building an AGI at all, I think these scenarios are pretty much a given.
ASI is sometimes taken to mean something that is as much smarter than humans as humans are smarter than, e.g., dogs. Now, I think that this is probable, but it is an _open_ question whether it can happen. We _don't_ have an existence proof. We don't (unlike the SpaceX case) have an architecture which we are confident will work. I'm going to call this species-jump-ASI. If we get species-jump-ASI, then, at best, we hope we set up its initial preferences so that it values humans, and we rubber-stamp its decisions, with no idea how it does what it does, and barely an idea of what or why. I see loss of control as baked into any situation where we build species-jump-ASI.
Third, I'm really skeptical of the ability to detect an imminent jump to improved AI effectiveness. The only parts of the AI development process with a signature large enough to detect are the pretraining of frontier models and the chip fabs supplying the CPUs & GPUs used in that pretraining. But
a) Some of the major advances have been in areas like sophisticated prompting and fine tuning for reasoning, neither of which has anything like the signature of the LLM pretraining phase.
b) There is hope for much more data-efficient (and, presumably, compute-efficient) frontier model training. See https://www.youtube.com/watch?v=Z9VovH1OWQc&t=44s at around the 3:40 mark and https://seantrott.substack.com/p/building-inductive-biases-into-llms
> a) Some of the major advances have been in areas like sophisticated prompting and fine tuning for reasoning, neither of which has anything like the signature of the LLM pretraining phase.
> b) There is hope for much more data-efficient (and, presumably, compute-efficient) frontier model training. See https://www.youtube.com/watch?v=Z9VovH1OWQc&t=44s at around the 3:40 mark and https://seantrott.substack.com/p/building-inductive-biases-into-llms
I think your points a and b here are pretty important. Remember DeepSeek R1 hurtling to near "flagship model" capability with smarter techniques, distillation of an existing model, and MUCH less compute?
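(For anyone who hasn't seen it spelled out, "distillation of an existing model" mechanically means training a smaller or cheaper student to match a stronger teacher's output distribution, which is far cheaper than pretraining from scratch. The snippet below is the generic Hinton-style soft-label loss, a sketch of the idea rather than DeepSeek's actual recipe.)

```python
# Generic knowledge-distillation loss: soft targets from a teacher model
# plus ordinary cross-entropy on the ground-truth labels.

import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, labels,
                      temperature: float = 2.0, alpha: float = 0.5):
    # Soft part: KL between softened student and teacher distributions.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Hard part: standard supervised cross-entropy.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```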
It's easy to imagine something just under a "concerning" threshold getting fine-tuned or distilled into a "concerning" model, or throwing some sort of "educational" framework and set of algorithms around existing data sets and getting order of magnitude jumps in capability.
Now, being able to easily imagine something is definitely not being able to easily *achieve* that outcome - but there's many reasons to think those advantages actually exist out there in "optimization space," and it's only a matter of time before somebody is able to find them. Especially as our "optimization power" keeps growing apace with each new training run and model refinement.
I actually wonder if OpenAI has whatever model they use internally that's one generation ahead of o1 Pro or o3-full already working on the "educational" framework - I know I'd have a team or two dedicated to that.
Many Thanks! Very much agreed. R1 certainly proved that far less compute could be used to create a flagship model. And I agree that there could be many places in "optimization space" that require a lot less compute than watching-for-big-data-centers policies rely on.
An "educational" framework indeed seems to be the direction that Surya Ganguli's group (he's the professor who gave the TED talk I pointed to on YouTube) is looking in.
Current models don't even understand clocks. How could superintelligence possibly be close?
Two years ago models couldn't write a coherent paragraph. Now people plagiarize them extensively for research papers and PhD theses. That's the rate of improvement now, even before significant recursive self-improvement (AI used to develop AI). What exactly do you think another few years of improvement at this pace (or the recursive self-improvement pace) looks like?
But clocks are actually much, much simpler than a lot of the other stuff these models do, which shows that this is a CATEGORICAL problem, not a matter of not enough brainpower. Something on the way to ASI in the near future should have understood this stuff early on.
Something genuinely intelligent should be intuitively capable of understanding clocks - specifically needing to train AIs on clocks to make them understand them would be bad enough, but that doesn't even work!
And of course, the problem isn't the clocks themselves - it's what this failure says about these models more generally. If simple things (that children can learn) aren't "intuitively" obvious to these systems, then they're necessarily very narrow intelligences and it's unclear how they could have the reasoning ability to generate ideas that are both novel and extremely practical/useful. It's like thinking an idiot savant would make a great CEO.
These models can solve math problems that 99% of people cannot (AIME etc). They can write essays on any topic that are better than 99% of people's. Anecdotally they can brainstorm original ideas better and far faster than humans. Is the AI a "narrow" intelligence because it can't do one task, or is the human a "narrow" intelligence because it can't do another task? Judging AIs by what's easy for a human seems just as silly as judging humans by what's easy for an AI. And even though current AIs are *language* models and one wouldn't necessarily expect them to have any ability at logical or mathematical tasks, in many cases they are remarkably good at those tasks too. And given the rate at which broad spectrum AI capabilities are improving, who is to say that AI won't soon master the remaining non-language tasks (like clocks) as well?
At what point in human evolution did we, or were we able to, understand clocks? How much more complex are our brains today?
I think the gap to genuine AGI (i.e. a 100 IQ human across all non-physical frontiers) is still quite big, but that's also a high bar. All that we need is ASpI (my own acronym, invented as I wrote this, for Artificial Specialised Intelligence, as opposed to ASI).
I.e. what if our ASpI researchers were _far below_ humans on 75% of human abilities but equivalent to IQ 125 humans in physics, engineering, and chemistry research, and someone was employing 1500 of them to develop better [infrastructure attack vectors/missiles/satellite weapons/biological weapons]?
AI development is not analogous to human evolution. AI systems today already do far more complicated things than understanding clocks. The issue is NOT a lack of brainpower - it's a categorical issue, and nobody has a good reason to think more brainpower in the short to medium term will necessarily overcome it.
And I don't think it's coherent to talk about them being "IQ 125 in physics" etc. In some ways these 'researchers' will far outstrip the ability of human researchers; in other ways they will be hopeless. If they can't intuitively grasp the concept of an analog clock, I think it's wildly overoptimistic (or pessimistic?) to think they will revolutionize research. The best, most innovative researchers (in practical matters) aren't necessarily the best at the theoretical aspects of, e.g., physics; they're people able to think about things in a novel way and draw novel connections between concepts. If clocks aren't the sort of thing that AIs just get without massive amounts of specific training, it's hard to see how they could be capable of great research insight.
Not only that - the labs are currently trying to deploy ASpI researchers in the field of AI research, to create an exponential feedback loop of improved intelligence.
Yes. My hope is that we are selecting AIs for a lot of things but not for self-preservation nor for "genetic" reproduction as such - they follow a memetic not genetic pattern.
And memes only survive if their hosts do.
I guess one promising approach would be to enlist the CIA to assassinate leading AI capabilities researchers based in China? Since AI progress is so heavily dependent on the abilities of a small number of researchers, killing a few and forcing the remainder to retreat from normal life could be more powerful than any other intervention. Of course, US AI labs would strongly object to such a policy because China would immediately retaliate by targeting them. But that too might be for the best, if it gets all the AGI projects worldwide stuck in quicksand.
Putting aside all other issues with this, I think you wildly overestimate the competence and power of the CIA.
Excellent overview of the global threat from AGI or Superintelligence.
And I especially recommend reading the research papers linked.
If you’re thinking about what smaller states could and should do as the Great Digital Powers fight it out, I wrote about what Israel should do as a non-player in the game (yes Israel is very technically skilled but no, Israel can’t compete with China or the US’s Digital Empires):
https://alighthouse.substack.com/p/we-cant-afford-to-ignore-the-warning
Would love to hear other thoughts as to what smaller states can do in the time we have left.
I think we agree on the substance of the comparison issue, if not the terminology. At least, I feel like I was trying to illustrate the point you seem to be making: there are some things they do great on and others that they are terrible at.
I am less convinced about the conclusion: that they cannot make novel connections. My model is that making novel connections is mainly about context + IQ + curiosity, and that specific value of IQ in this specific case is what I was referring to: the ability to abstract higher-level patterns out of noisy data and match them across different domains.
I think LLMs certainly have and will always have more of the context. I think we can easily prompt them to be curious (I do hope that they cannot be independently curious!). I expect that the pattern matching part is partly innate (that is exactly what they do!) and can partly be trained for or at least prompted for.