I predict that major disillusionment with LLMs will occur this year, and if they repeat this survey next year, all of the timelines will get MUCH longer. None of the developments in the past 12 months has moved the needle on the fundamental weaknesses of LLMs. All we have is a bunch of workarounds.
From your mouth to God’s ears
I think it is a concern that even if AI never advances beyond its 2023 state, it could already be transformative.
To me, these surveys having such inconsistent answers implies that the respondents aren’t really taking them seriously. It’s more an emotional affiliation: whether you feel like part of the “technology should be regulated to prevent oppression,” “AI is basically good,” or “AI x-risk is real” camp.
It's odd that anyone could think the x-risk of the equivalent of a nuke that can build nukes isn't real.
May depend on your definition of taking it seriously. I'd suspect most would say they were taking it seriously in terms of sentiment, but did not spend so much as 30 seconds on any given question, let alone 5 minutes of careful reflection.
Which more or less leads to me agreeing with you, that it's significantly vibes-based.
> However, respondents did not generally think that it is more valuable to work on the alignment problem today than other problems.
One possible reason: The researchers may believe that alignment is intractable, either for now, or intrinsically so.
Successful researchers routinely decide that some fundamental problems are just too big to tackle. Like, sure, the Millennium Problems are all major, well-known problems, and each one carries a million-dollar prize. But if you chase them, then you won't have anything to show 3 years from now, and nobody will renew your funding.
There are ways to fund these problems: You can throw lots of money at promising people, and expect nearly all of them to go nowhere. Or you can offer incremental prizes.
On the one hand, I love surveys. They are leagues better than the "AI people think X" claims that fly across Twitter.
On the other hand, people who take these surveys often do so quickly and without much thought. E.g., they think AI winning at poker is in the future, not the past. E.g., they think an AI that can beat all humans at all tasks would not already be vastly better than a human. Plus we know that people tend to overrate low probabilities, time and time again: https://www.tedsanders.com/do-not-expect-the-unexpected/
I think the most interesting thing to come from these surveys is to look back at them in 20 years and see how wildly wrong the typical AI researcher was back in 2023.
In retrospect, that may be inherent to the survey format. There's no incentive to answer accurately, so people just answer off their first impulse without sanity-checking.
EDIT: Wonder if it's worth doing some sort of prediction-market-style survey: literally just give all respondents $100 or whatever to allocate among the questions as they see fit.
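A minimal sketch of how that allocation-and-payout idea might be scored, assuming fixed payout odds per question and stakes lost on questions that resolve false. Everything here (names, questions, odds) is hypothetical, not from the actual survey:

```python
# Hypothetical sketch: each respondent allocates a fixed budget across questions;
# stakes pay out at preset odds if the question resolves True, and are lost otherwise.
from dataclasses import dataclass

BUDGET = 100.0  # dollars each respondent may spread across questions

@dataclass
class Respondent:
    name: str
    allocation: dict[str, float]  # question id -> dollars staked on "this happens"

def payout(r: Respondent, resolutions: dict[str, bool], odds: dict[str, float]) -> float:
    """Return unallocated budget plus winnings on questions that resolved True."""
    staked = sum(r.allocation.values())
    assert staked <= BUDGET + 1e-9, "allocations must not exceed the budget"
    winnings = sum(
        dollars * odds[q]
        for q, dollars in r.allocation.items()
        if resolutions.get(q, False)
    )
    return (BUDGET - staked) + winnings

# Example with made-up questions, odds, and resolutions:
resolutions = {"superhuman_poker_by_2025": True, "agi_by_2030": False}
odds = {"superhuman_poker_by_2025": 1.2, "agi_by_2030": 5.0}
alice = Respondent("alice", {"superhuman_poker_by_2025": 80.0, "agi_by_2030": 20.0})
print(payout(alice, resolutions, odds))  # 80 * 1.2 = 96.0
```

The point of the stake is just to give respondents some incentive to sanity-check their numbers instead of answering off the first impulse.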
>> a summary of Russell’s argument—which claims that with advanced AI, “you get exactly what you ask for, not what you want”
Isn't there a fun and sensible analogy to make, with the genie in the bottle parable, leading to the cherished "be careful what you wish for" realization?
Is this so obvious that it has already been done and I just haven't seen it because I'm a noob, or would it be worthwhile to delve deeper into it?
To me, that analogy could explain the vast variance in expectations. (And it could help trigger some sensible intuition pumps in people who are not that much into AI?)
Apart from exploring the dark regions of human motives that are cast into wishes... Just one of the considerations:
Outcomes of wishes vary a lot based on the alignment of the genie with its current master.
The genie sometimes 'plays dumb' and interprets the wish literally, making the outcome not what the wisher had hoped for.
If the genie were better aligned, instead of wanting to get through the 3 wishes in the fastest way possible while (wilfully or unwittingly) interpreting its master's wishes in such a way that the master's situation is transformed not for the better but for the worse... Wouldn't that be great?
Except we don't know how to do that?
Surely the AI would understand its master well.
But would that make it more aligned or prone to active deception?
...
It seems the parallels are very strong. And it's a fun way to explore the topic. (I even wonder if the AI analogy itself comes from the genie-story intuition?)
Oops, it seems the picture of the genie is already quite common and old, so... I'd have to check which part of the analogy isn't already commonplace if I were to ride it further...
*Monkey’s paw twitches*, surely
Hey Zvi, wondering if you'd be up to another guest post? A topic I'd be super interested in if it aligns with you is
"Is a Generative A.I. Winter Likely in 2024"
With so much of the funding for generative AI coming from Big Tech itself, and considering how OpenAI, Anthropic, Inflection, and other major startups are basically under a Cloud Fiefdom, what are the chances that all the momentum of 2023 sputters out in a less-than-stellar environment for startups, venture capital, and open source, with some sort of plateauing of what LLMs can actually do post-GPT-4.5?
https://aisupremacy.substack.com/p/guest-posts-on-ai-supreamcy
I assume you have my email?
We can talk about it. Let's assess how the first post did before anything else; send me the numbers and such.
Poker isn't the only milestone that's been achieved. Many are imprecise and so hard to evaluate, but see Patrick Levermore (https://rethinkpriorities.org/longtermism-research-notes/scoring-forecasts-from-the-2016-expert-survey-on-progress-in-ai) and Scott Alexander (https://asteriskmag.com/issues/03/through-a-glass-darkly) trying to score past predictions on the same (iirc) milestones. (Also some discussion of particular thorny milestones here: https://forum.effectivealtruism.org/posts/tCkBsT6cAw6LEKAbm/scoring-forecasts-from-the-2016-expert-survey-on-progress-in?commentId=b9hcjGXztDrraaHHH.) Milestones were included for comparability with past surveys, even if they've been achieved.
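To make "scoring" concrete, here is a generic Brier-score sketch with made-up forecasts and resolutions; the linked write-ups may use different methods, so treat this purely as illustration:

```python
# Generic Brier-score illustration with made-up numbers; not the method of the
# linked write-ups, just what "scoring resolved milestone predictions" can look like.

def brier_score(forecast_prob: float, outcome: bool) -> float:
    """Squared error between the forecast probability and the 0/1 outcome (lower is better)."""
    return (forecast_prob - (1.0 if outcome else 0.0)) ** 2

# Hypothetical milestones: (probability assigned to "achieved by now", did it happen?)
forecasts = {
    "milestone_poker": (0.30, True),      # happened, but only 30% was assigned to it
    "milestone_starcraft": (0.20, True),
    "milestone_trucking": (0.50, False),
}

for name, (p, happened) in forecasts.items():
    print(f"{name}: Brier = {brier_score(p, happened):.2f}")

average = sum(brier_score(p, h) for p, h in forecasts.values()) / len(forecasts)
print(f"average Brier score: {average:.2f}")  # 0.0 is perfect; always answering 50% gives 0.25
```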
p(doom), the precise measurement of the incomprehensible.
If AI safety is to be taken seriously, it needs to move beyond guesses.
Why p(doom) is currently useless.
https://www.mindprison.cc/p/pdoom-the-useless-ai-predictor
Seems worth noting that the FAOL question and the HLMI question were answered by entirely different sets of respondents - in a way you could see it as just an extremely stark illustration of the framing effect.