36 Comments
Jared:

You open with a very good point about Gary, except that he also makes prospective claims, which you largely sidestep here in order to focus on what he has said about AI development so far.

There are plenty of valid critiques to make about Gary’s article, but there is no need to misrepresent what he says.

Sam (Sep 9, edited):

Could you write a bit more about the prospective claims that you felt were missed?

Miles Shuman:

Minor point, but I encourage everyone to try this: take some number (N > ~8) of small objects, scatter them on your floor or table, snap a picture, and give it to GPT-5 Thinking. Ask it to count how many [whatever] there are in the picture.

I’ve tried multiple versions of this, without success. (Though Gemini is actually pretty good!)

So at least in one sense, Gary is right about it not being able to count.
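If you want to script the experiment rather than use the chat UI, here is a minimal sketch using the OpenAI Python SDK. The model name and file name are assumptions; substitute whichever model and photo you're testing.

```python
# Minimal sketch of the counting experiment, assuming the OpenAI Python SDK
# (pip install openai) and OPENAI_API_KEY set in the environment.
import base64
from openai import OpenAI

client = OpenAI()

def count_objects(image_path: str, label: str) -> str:
    # Encode the snapshot of scattered objects as a base64 data URL.
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    resp = client.chat.completions.create(
        model="gpt-5",  # assumption: swap in the model you want to test
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": f"Count the {label} in this picture. Reply with a single integer."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    )
    return resp.choices[0].message.content

# "scattered_paperclips.jpg" is a hypothetical file: your own photo of N > ~8 objects.
print(count_objects("scattered_paperclips.jpg", "paperclips"))
```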

Stefan Kelly:

If it interests people, I wrote about the psychological side of goalpost shifting this week too: https://alreadyhappened.xyz/p/how-to-move-two-goalposts-at-the

Miles Shuman:

Assume that there’s one architectural or algorithmic advance waiting to be discovered by a research team at a frontier lab, significant enough in impact to make that team believe it now has a major advantage in the race to AGI / ASI.

What’s the game theory (loosely defined), once the discovery is made, about when to make public *just the fact that an impactful discovery has been made*? (I think the answer may depend on a lot of factors, so I'm not asking for any kind of one-size-fits-all answer.)

If, under many conditions, the answer is some variant of “keep it very close to your chest until you’ve leveraged your advantage”, how should that impact everyone’s estimated timelines and uncertainties for those timelines?

Mark:

It's worth asking whether a lab *can* keep such a development close to the chest, or whether it will near-inevitably leak.

alpaca:

Turns out superforecasters are just base rate bros a little better at playing reference class tennis.

I think you're a little overly dismissive about adversarial examples. Most of the economically relevant uses involve potentially adversarial interactions, ironically with the exception of things like research: tasks which are difficult but where the universe is mostly not actively trying to work against you.

This could lead to AI becoming unexpectedly good at science, and internal tasks in large enterprises (manufacturing, internal processes, etc.), while remaining relatively limited in its economic impact wherever there is a significant surface area to the outer world, including current leading applications such as coding and marketing.

In many ways, this might be the most dangerous path things could take, as it would likely cause most people not to experience advanced AI first-hand, and the class contains many of the most dangerous vectors by which misalignment could screw us, such as nanotech, bioweapons, manufacturing, and so on.

Miles Shuman:

Interesting take … my intuition has always been that this might be the *best* shape for the universe to be in, at least in the short-to-medium term: AI could accelerate science & engineering without causing mass unemployment & the consequent social chaos.

Andrew:

I think your views are highly compatible - that it would be less socially disruptive because the consequences would be less salient, but existentially riskier because there would be far less public support for regulation.

alpaca:

The reason why it's dangerous in my model is that in possible worlds where AI doesn't kill us all in the next few decades, it will be because we solved alignment somehow. Given that we are so far from solving alignment that we basically don't even know where to start (if it is tractable at all), this requires ASI not to be developed until we have managed to do the research to solve it.

How do we pause? Given the race dynamics, the two most plausible candidates to me are public pressure (anyone putting serious resources into ASI becomes a social pariah and is stripped of resources, at the very least), or AI stalling out at about human level so that the problem takes the form ASI = aligned ASI, i.e. it cannot be developed if not aligned.

The latter doesn't seem likely as the design space for intelligent agents should be much larger than the one for intelligent agents that do well by humanity in particular.

This leaves public pressure. Public pressure will only materialize if most people are directly interacting with or displaced by AI, or if we have serious near misses like a new pathogen that kills millions of people. For a "science-only" AI, only the near misses are left.

Also, AI that is good at research could perhaps develop better AI, which would only delay displacement and then kick off a foom cycle at the end, which reduces the time available for near misses. Or it could develop better weapons for humans to use to kill each other/ourselves, without much public awareness before it happens.

Miles Shuman:

I think probably what this boils down to for me is (1) I think researchers are far more likely than the masses to take existential risks seriously, and (2) I don’t think regulation arising from anger over AI taking human jobs is likely to halt AI development overall.

alpaca:

Fair enough. (1) may be true, but that doesn't necessarily stop them from continuing to research: current researchers at leading AI labs apparently think they are on track to ASI, and while a few of them quit, mostly they then start other companies in order to race faster.

With (2) it's less about regulation and more about social dynamics. In areas where we have successfully coordinated to stay away from research, such as bioweapons and human cloning, regulation plays a part, but it seems to follow public opinion (at least among elites) rather than leading the way. I think these examples indicate there is some hope for researchers to stop each other for ethical reasons, but AI could be importantly different in that it could take over its own continued development, unless researchers stop quite early. There's also already much more vested interest in ASI development than there ever was in human cloning, so it could be harder to stop in this way.

That said, if the US and China somehow got together and outlawed ASI development, Europe/UK would probably fall in line, which only leaves weaker players like India and Russia who I think would likely also come to the table even if only out of fear US/China would restart their efforts. I wouldn't expect many groups to continue clandestinely developing ASI, especially given the high resource requirements.

Not that any of this is a realistic prospect. But you might be underestimating regulation as a retarding factor. Even such relatively boring and mostly unrelated regulations (compared to an outright ban) as GDPR have a huge negative impact on AI research in Europe.

Miles Shuman:

Honestly think the only chance of that level of social pressure against continued AI development is a very public near-miss, accidental or engineered. The perceived upside & obvious utility is just immensely higher for powerful AI than for things like human cloning, nuclear weapons, or even nuclear power.

Christian:

Progress on what? On benchmarks? On graphs? Where’s the real world utility? Where are the tangible benefits and outcomes? Yes, AI continues making progress on vanity metrics.

Andrew:

You illustrate a great recurrent theme: people love making bombastic predictions about the future, but when faced with the possibility of reputational ramifications for being wrong, they dial back the prediction to preserve a path for retreat. Forecasting motte and bailey.

Ebenezer:

"What GPT-5 and other models get wrong are, again, adversarial examples that do not exist ‘in the wild’ but are crafted to pattern match well-known other riddles while having a different answer."

I think it's a mistake to claim that these adversarial examples are irrelevant for real-world performance. I still run into common-sense reasoning failures when I use AI for everyday real-world tasks, and I haven't seen much improvement on this in the past ~year.

I also think you're underrating the significance of the "GPT-5" product label from a marketing perspective. The reason people are updating, I suspect, is that it shows this is "all OpenAI can do" for the big splashy GPT-5 release, suggesting that there's not much on the horizon which represents a notable improvement that can be mass-commercialized. The fact that GPT-5 is on trend is somewhat beside the point.

NVIDIA's stock price has declined from about $180 to about $170 over the past month, so that appears to be a straightforward falsehood or misrepresentation on your part. Previously NVIDIA shares were on a slow, steady upward trend. The trend switched from positive to negative right around the GPT-5 release. It's not a massive trend change, but it does appear to be present.

Why does this all matter? I think people in the AI safety sphere underrate the degree to which investor sentiment is self-fulfilling. If people believe AI will be big, it gets lots of investor dollars, and therefore becomes big. If people believe the opposite, and investment dries up, it could see a crash. If you're a safetyist, hyping AI has the potential to backfire massively. I see a pattern in your posts of already having written the bottom line that AI progress will be rapid, then filling in arguments to support this position and filtering out evidence against it. This seems... counterproductive.

If you want to achieve the best of both worlds, one strategy is to work to ensure that AI profits are very low (e.g. by setting market conditions for model commoditization), and announce to investors that AI will be an unattractive industry to invest in even if progress continues.

I agree that even with Gary Marcus type timelines, people should be worrying much more and doing much more to prepare.

JV:

The day before the GPT-5 release the price was $179. The day after release it was $183. Three weeks later, the day before Q2 financials, it was $182. Since then there have also been, e.g., announcements of various AI ASIC efforts. I wouldn't draw any GPT-5-related conclusions.

deusexmachina:

For a random peon like me (read: most people), what do you suggest I do to prepare?

Ebenezer:

You could buy this book to help it become a bestseller:

https://ifanyonebuildsit.com/

Donating / volunteering with AI nonprofits such as PauseAI might also help.

https://forum.effectivealtruism.org/posts/s9dyyge6uLG5ScwEp/it-looks-like-there-are-some-good-funding-opportunities-in

Honestly I haven't thought a ton about this. Probably some of these actions have potential to backfire. I imagine doing some independent thought and posting online to start discussions could be high-leverage.

I don't share the common pessimism about the value of alignment research.

deusexmachina:

Ok, I didn't realize that's what you meant by "prepare" - sounds much more like "prevent" to me.

The term "random peon" was somewhat misleading by me. I've done a little actual work wrt public awareness raising, and was close to pivoting to work for an AI safety think tank (comms and research), which didn't pan out for logistical reasons. I also pre-ordered the book you linked to a month ago.

My question was something like: "If I think there's a good chance that AI will massively disrupt the world as we know it within a decade or two, what actions can a normal person take to increase their odds of thriving/surviving?"

My current, gut-level answer is: Overall, not much one can and should do, since everything is way too unpredictable and can go in all directions. Modal outcome, as always, is: humanity will muddle through.

But generally: Work on your social/soft skills, if you can, as they will probably be more durable than office-type hard skills. Same for personal networks. Largely stay the course on financial investments, as, again, things could go in any direction, and you're not smart enough to know better. But probably increase your savings rate if you can, to hedge against economic disruption. Make sure you are in the top 20% of skill wrt using AI in your own work, which isn't hard to get to.

Sam Penrose:

Much more interesting to me than Marcus' piece is Narayanan and Kapoor’s working through, from the slow-diffusion perspective, how to communicate effectively with Scott Alexander[1]:

It is hard to understand one worldview when you’re committed to another. We wrote:

> AI as normal technology is a worldview that stands in contrast to the worldview of AI as impending superintelligence. Worldviews are constituted by their assumptions, vocabulary, interpretations of evidence, epistemic tools, predictions, and (possibly) values. These factors reinforce each other and form a tight bundle within each worldview.

This makes communication across worldviews hard. ... [examples] ... These communication difficulties are important to keep in mind when considering the response by Scott Alexander, one of the AI 2027 authors, to AI as Normal Technology. While we have no doubt that it is a good-faith effort at dialogue and we appreciate his putting in the time, unfortunately we feel that his response mostly talks past us. What he identifies as the cruxes of disagreement are quite different from what we consider the cruxes! For this reason, we won’t give a point-by-point response, since we will probably end up talking past him in turn. But we would be happy to engage in moderated conversations, a format with which we’ve had good success and have engaged in 8-10 times over the past year. The synchronous nature makes it much easier to understand each other.

...

We are glad that Alexander’s response credits us with “putting their money where their mouth is on the possibility of mutual cooperation”. The sentiment is mutual. We look forward to continuing that cooperation, which we see as more productive than Substack rebuttals and counter-rebuttals.

[1] https://www.normaltech.ai/i/173147197/it-is-hard-to-understand-one-worldview-when-youre-committed-to-another

Jeffrey Soreff:

FWIW, remember that Altman made the claim at one point that GPT-5 would be smarter than he is. Yes, this is different from saying that it would be AGI, and he didn't make that claim, but I still take Altman's words as meaning _something_, so when GPT-5 came out, and the results from my benchmark-ette (with GPT-5 Thinking) were basically indistinguishable from o3, I was disappointed, so I'm lengthening my timeline guess. Still, the hallucination benchmark improved by a factor of 2 vs o3, so it wasn't _no_ progress.

Richard Meadows:

Why do you appeal to the stock market as evidence in your favour, but at other times when the market disagrees with you (as below) you discount it as 'bonkers'? You can't have it both ways.

> The market is failing to price it in because the market is failing to price it in. But also that is a real answer. It is an explicit rejection, which I share, of the EMH in spots like this. Yes, we have enough information to say the market is being bonkers. And yes, we know why we are able to make this trade, the market is bonkers because society is asleep at the wheel on this, and the market is made up of those people.

Methos5000:

When the Venn diagram of what qualifies as evidence and what supports the conclusion you want to come to is a circle, you can. Hubris is one of many things that lead to blind spots.

Michael Lipman:

Does anyone know of good Manifold markets (or Metaculus, Polymarket, etc.) that get at this question?

Something like: will economically transformative AGI exist by the start of 2027 or 2028?

There are obviously lots of ways to define that. I have in mind something like US productivity (GDP/human labor hour) going from ~1% increase per year to like 5% or 10% increase because AI is doing substantial proportions of white collar jobs that humans used to do. (Resolving this question is not dependent on what those humans end up doing).
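For a sense of scale, here is a quick sketch of what those growth rates compound to over a decade; the numbers are plain arithmetic, not a forecast.

```python
# Compound growth in output per labor hour over ten years at the
# productivity growth rates mentioned above (~1% baseline vs 5%/10%).
for rate in (0.01, 0.05, 0.10):
    decade = (1 + rate) ** 10
    print(f"{rate:.0%}/yr -> {decade:.2f}x output per labor hour after 10 years")
# Prints roughly: 1%/yr -> 1.10x, 5%/yr -> 1.63x, 10%/yr -> 2.59x
```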

Coagulopath:

GPT-5 was a disappointment to me because it indicates the main thing driving recent AI progress (test-time reasoning) may be slowing down.

o3 was considerably better than o1 after 3 months, and people at the time (OA's Noam Brown) said that rapid gains were expected to continue. That seems to not be happening.

Regarding GPT-4 and GPT-5, it's difficult to compare them because so much has changed. But GPT-4 was important because 1) it was vastly better than every other AI model at the time of its release, not just GPT-3, and 2) it marked the point where AI was finally useful for big-boy intellectual work. GPT-3 was mainly a toy used for creative writing (where hallucinations and strangeness can be useful). GPT-4 was actually pretty capable (though a human was needed in the loop). I remember people using it to code 3D games and such. Even in 2025 I was still occasionally impressed by GPT-4's world knowledge (it was able to answer weird questions about videogames that GPT-4o failed at).

Roger Ison:

The proposition that because AIs can learn to code, they will then improve themselves at an exponential rate is rather suspect to me. It seems more likely that dramatic improvement will require either architectural improvements (e.g. single-celled organisms had to become multicellular to overcome energy distribution constraints), or a completely new kind of hardware platform. Or both. Every architecture has its limits. It seems very hard to forecast such breakthroughs, and I'm not at all sure the necessary inventions will come from AI.

JV:

I think discussing timelines and predictions in general is pointless without agreed-upon definitions and objective measurements. If A says AGI in 4 years and B in 8 years, it could be they disagree about the timeline, or about the point which counts as AGI, or more likely both. There's also quibbling about whether it should be "buildable", internal, released, or diffused through the economy by the given year.

Even seemingly objective predictions like the 90% AI-written code mentioned in the article are not well defined. If it's LOC, AI-generated code is currently much more verbose than what I write, and concentrated more in unit tests and boilerplate. Does tab completion make the line AI-generated? Or is it only about coding agents operating independently? Etc.
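To make the ambiguity concrete, here is a hedged sketch of one possible operationalization: count added lines per git commit author and treat a bot identity (a placeholder name here) as AI-written. Note that it silently misses tab completions and AI-suggested edits inside human-authored commits, which is exactly the definitional problem above.

```python
# One contestable way to measure "% AI-written code": attribute added LOC
# to commit authors and count bot-authored commits as AI-written.
import subprocess
from collections import Counter

AI_AUTHORS = {"coding-agent[bot]"}  # placeholder: your agent's commit identity

def added_lines_by_author(repo: str) -> Counter:
    # "--format=@%an" emits one "@AuthorName" line per commit; "--numstat"
    # then lists "added<TAB>deleted<TAB>path" for each file the commit touched.
    log = subprocess.run(
        ["git", "-C", repo, "log", "--numstat", "--format=@%an"],
        capture_output=True, text=True, check=True,
    ).stdout
    counts, author = Counter(), None
    for line in log.splitlines():
        if line.startswith("@"):
            author = line[1:]
        elif line.strip():
            added = line.split("\t")[0]
            if added.isdigit():  # numstat shows "-" for binary files
                counts[author] += int(added)
    return counts

counts = added_lines_by_author(".")
ai = sum(n for a, n in counts.items() if a in AI_AUTHORS)
total = sum(counts.values())
print(f"AI share of added LOC: {ai / total:.1%}" if total else "no lines counted")
```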
