39 Comments
Randomstringofcharacters

> The problem is, you can create all the software you like, they still have to use it.

I think there's also the gap between "good enough for me to use for my own stuff and vibe-patch it as I go" and "good enough to roll out to a thousand low-info users and have it reliably work." It's surprisingly large.

Eskimo1

Does Zvi support a pause yet?

Deadpan Troglodytes

I would like to read about a concrete instance of recursive self-improvement in which a model is enlisted to work as an agent towards a broad goal, like "reduce training time", "improve response speed", or "reduce hardware requirements", not as a mere tool to amplify human effort.

I suspect it's possible, but everyone telling us it's happening right now, including Dean Ball, is using phrases that make things less clear, like "x% of Opus 4.6 was written by Opus 4.5". What?

Ball gets a little closer when he writes "AI research and engineering is substantively composed of work like: finding optimizations in various complex software systems; designing and testing experiments for AI model training and posttraining; and creating software interfaces to expose AI model capabilities to users. [...] a great deal of this work is essentially reducible to the engineering of software." I don't want to move any goalposts here, but I also feel like there's a motte and bailey afoot. "Recursive self-improvement" should be scary, or at least have scary implications, which those do not.

Mark

Seems to me that many writers have discussed this point. First the AI writes code for you, then it brainstorms research ideas for you, then it examines them one by one for you, then it develops the "taste" to judge from the beginning which research ideas are worth pursuing - or some sequence like that. Yes, only one of these stages has come to pass so far, but why assume that AI will never reach the others? The first stage has come far faster than almost anyone predicted.

And even if AI is nothing more than a tool to amplify human effort - what happens if it drastically speeds up human effort? Perhaps humans were going to develop AGI on their own 50 years from now - what happens if AI lets them do this 50 weeks from now?

Deadpan Troglodytes

I don't assume that! I accept that the hypothetical progression you described is possible. In fact, I think we're already deep in stage 2.

The salient feature of recursive self-improvement, the thing that distinguishes it from using AI as a tool in pursuit of AGI, is the prospect of the models initiating changes without humans in the loop. When people say "recursive self-improvement is happening now", I expect that to mean something closer to that than to the progression you described.

Jeffrey Soreff

Ok, but "the prospect of the models initiating changes without humans in the loop" isn't a binary choice. There are degrees of oversight, and I expect the oversight to more-or-less continuously decrease as the volume of work the AIs do increases. Even now, I'd expect that AI-generated code is probably getting lightly reviewed (except for the automated checks, which aren't gated by human effort).

I tend to see "designing and testing experiments for AI model training and posttraining" as more of a watershed, since these actions would no longer be following a spec (or some interactions with humans sort-of equivalent to one) but rather adding novel innovations (which I take as implied by designing the experiments).

Overall, I think all of the transitions will be fairly fuzzy, but smoothly moving towards essentially no human oversight, and, given the pace of AI progress, possibly only taking two or three years.

Deadpan Troglodytes

"Not a binary choice" is a fair criticism, and I should have been clearer that the 𝘥𝘦𝘨𝘳𝘦𝘦 of human direction is the critical factor when asking the question "is this self-improvement?"

It's very difficult to identify the point at which "just saving time" becomes qualitatively different. I don't think anyone would argue that generative AI helping a researcher format a report counts as self-improvement, though it's on the same continuum as activities that would count.

One highly abstract way to think about it is to say that a system exhibits recursive self-improvement if improvements to its performance increase its rate of future improvement without proportional human intervention (critical piece there). Is the human input per unit improvement declining toward zero?
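One toy way to write that down (my own notation, nothing standard): let $H(t)$ be cumulative human input and $C(t)$ be the system's capability. Then the criterion is roughly

$$\frac{dH}{dC} \to 0 \quad \text{while} \quad \frac{dC}{dt} \text{ does not shrink,}$$

i.e. humans contribute a vanishing share of each marginal improvement even as improvement continues.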

But that's still too vague, and there are really at least three axes of human involvement:

1. who sets the objective?

2. who proposes the change?

3. who authorizes deployment?

Maybe an extremely simplified self-improvement ladder would look something like:

1. Saving researcher time.

2. Optimizing within a fixed spec.

3. Rewriting the spec.

4. Rewriting the process that generates the spec.

That's why I'm hungry for specifics. "Designing and testing experiments for AI model training and posttraining" covers an enormous range of possibilities. It could be at any of those levels.

Jeffrey Soreff

Many Thanks! Great points! I agree that there are multiple axes. Simplifying RSI to a scalar measure is probably an oversimplification. In some respects

2. Optimizing within a fixed spec

has, to a degree, been happening even before LLMs existed. An optimizing compiler which, for instance, chooses register allocation, or pulls some unchanging computations out of a loop, both without human oversight, was already in this category, albeit in a very limited way. Now, an _impressive_ version of this would be e.g. if an AI system came up with something like the fast Fourier transform on its own...
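(To pin down the kind of oversight-free rewrite I mean, a toy illustration of the loop-invariant case - my own example, not from any real compiler:)

```python
# Loop-invariant code motion: a classic compiler rewrite applied
# mechanically, with no human reviewing each instance.

def before(xs, a, b):
    total = 0
    for x in xs:
        k = a * b + 7  # recomputed every iteration, though it never changes
        total += x * k
    return total

def after(xs, a, b):
    k = a * b + 7  # hoisted out of the loop by the optimizer
    total = 0
    for x in xs:
        total += x * k
    return total

assert before([1, 2, 3], 4, 5) == after([1, 2, 3], 4, 5)  # same result, less work
```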

Re (3) and (4):

a) Adding a "3. Rewriting the spec" stage is itself modifying how the final spec is produced, so, in that sense, I think it bleeds into (4).

b) As you said, this is an approximate ladder, and many of the stages are wide, with lots of grey areas. I've been in situations where cleaning up old code required abandoning strict backwards compatibility where we knew the users were very unlikely to care about the change in behavior on certain rare edge cases. I view this as a mild version of rewriting the spec. This can shade all the way to massively reweighting the objectives of a system, perhaps discarding some entirely. If, e.g. an AI got to unilaterally change the weighting of honest/helpful/harmless completely on its own for the training of the next release, I'd find that disquieting...

Victualis

All of those stages have already happened in my experience, but not everywhere. I'm spending my time directing coding agents in the areas where they can't yet do a good job, because there's no point doing something that's better done by AI.

Amicus

I had Claude write itself a static analysis tool for a niche language recently: not quite a full-featured language server, but pretty close. It took about 8 hours, and my only intervention was to set up the initial loop. It's not great, but it's more than good enough to be useful - the main issue is getting Claude to remember that it isn't actually fluent in the relevant language and so should use its tools more often.
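(For concreteness, roughly the shape of that initial loop - a from-memory sketch; the file name, prompt, test command, and model are stand-ins rather than the real setup:)

```python
# Minimal build-and-retry loop: run the tests, hand the failures and the
# current source back to the model, write out its revision, repeat.
import subprocess
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

TASK = "Improve the static analysis tool in analyzer.py until the tests pass."

for _ in range(50):  # hard cap so the loop can't run forever
    result = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
    if result.returncode == 0:
        break  # tests pass; done
    source = open("analyzer.py").read()
    reply = client.messages.create(
        model="claude-sonnet-4-20250514",  # illustrative model choice
        max_tokens=8192,
        messages=[{
            "role": "user",
            "content": f"{TASK}\n\nTest output:\n{result.stdout}\n\n"
                       f"Current source:\n{source}\n\n"
                       "Reply with only the full revised file contents.",
        }],
    )
    with open("analyzer.py", "w") as f:
        f.write(reply.content[0].text)  # overwrite and go around again
```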

Deadpan Troglodytes

Thank you, that's an excellent, impressive example of unsupervised optimization.

However, I don't think this shows a model recursively self-improving. It only improves the agent framework running the model, not the model itself.

Maybe that's the wrong way to think about it. Perhaps the path to AGI involves adding layers to models that augment their abilities. But even then, this particular set of optimizations seems to be a dead end.

Grigori avramidi

The point of a political attack ad is not to make people feel better about your candidate, it is to make them feel worse about the other guy. And OpenAI is more exposed to consumers than Anthropic.

Joanny Raby

about "Language Models Don’t Offer Mundane Utility" (medical advice paper), what's interesting/funny about the study is that the humans were the failure point, mostly. I'm unsure whether new models would be better at holding their hands:

"Despite LLMs alone having high proficiency in the task, the combination of LLMs and human users was no better than the control group in assessing clinical acuity and worse at identifying relevant conditions"

Mark

Re Asimov poll: I'm surprised that as many as ~40% of your readers report "working in AI in some capacity" - that is huge.

TR02

Maybe a bunch of respondents misinterpreted it as "work *with* AI"? So they don't develop frontier generative models, but they do train machine classifiers, or they routinely use Claude Code at work, and they took that as enough to answer "yes"?

Either that or massive respondent bias.

Victualis

Would it really be surprising if the most thorough weekly general coverage of AI were to draw much of its readership from those working in AI?

Kevin

Advertising is tricky, because I personally prefer to pay for the ad-free version of pretty much any product. But it clearly provides a lot of value to the world, both in all of the free products that people who can't afford a subscription get to use, and also because new businesses need a way to contact potential customers.

I think the world would be better off if people generally had access to a free, ad-supported version of the best AI tools out there. This goes for both the chat interface and the coding interface. I feel a little bad when I can afford a $200/month subscription and someone else can't, and therefore I get to have more knowledge than they do and build software faster than they can.

So I am actually hoping that ads will work. Obviously they corrupt some of the incentives but it really is a good thing that e.g. everyone got to use the same Google search in the pre-AI era.

Imperu

> It’s quoted in full here since no one ever clicks links.

I generally click on links, because I like to read more text. I don't click when the link is to x.com or YouTube.

vectro

> AI services can mitigate this a lot by offering a robust instant deletion option

Note that Gemini already has this; it's called "temporary chats". https://support.google.com/gemini?p=temp_chats

vectro

> The article I saw this in mentioned the last tech company to do this was Motorola.

Note that Motorola is still paying coupons on its century bond! Whereas if you had bought the equity, I think you would be quite sad about that choice today.

Odd anon

> Matt Yglesias thinks out loud about AI consciousness

This piece is from 2022?

Jeffrey Soreff

"I wouldn’t call it faking continual learning. If it works it’s continual learning."

<mildSnark>

Is this the AI systems version of Searle's "Chinese Room" thought experiment? :-)

</mildSnark>

More seriously - there is a question about whether the generalization done in training neural net weights has an analog (better? worse?) in non-weight storage approaches.

Anthony Bailey

The second story seems like a bad take; you probably want to fix it.

You've blamed the academic. But they published publicly to arXiv in April 2024 and submitted to Nature the next month.

They did fine. Blame only the downstream system.

icarus91

> You are allowed to have a GPT-5.21.

Not unless you first release GPT-5.3, GPT-5.4, ..., GPT-5.19, GPT-5.20. But GPT-5.2.1 is allowed.