> This raises the question of why we are unable to, in serious AI chatbot uses, close or edit conversations the way you can with AI companion apps.
Can you clarify the difference vs. current behavior? Gemini, Claude, and ChatGPT all allow you to edit any previous prompt, which "restarts" the conversation from that prompt, and ignores anything that came after it.
I agree cloning would be nice, so you could explore multiple branches without losing state.
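Mechanically, both behaviors come down to prefix management over the message list: editing truncates the history at that turn, and cloning would just copy the prefix so the old branch survives. A minimal sketch of the idea (plain Python; the message format and function names are illustrative, not any vendor's actual API):

```python
# Toy illustration of edit-vs-clone on a chat history.
# A conversation is an ordered list of messages; "editing" message i means
# discarding everything after it, while "cloning" copies the prefix so the
# original branch is preserved. (Illustrative only, not a real client API.)

from copy import deepcopy

def edit_message(history, index, new_content):
    """Replace message `index` and drop everything after it (current UI behavior)."""
    branch = deepcopy(history[: index + 1])
    branch[index]["content"] = new_content
    return branch  # the old tail is gone

def clone_and_edit(history, index, new_content):
    """Keep the original intact and return a second branch edited at `index`."""
    original = history                       # untouched
    branch = edit_message(history, index, new_content)
    return original, branch                  # explore both without losing state

history = [
    {"role": "user", "content": "Draft a tile culling shader."},
    {"role": "assistant", "content": "Here is a first pass..."},
    {"role": "user", "content": "Now add read-back debugging."},
]
orig, alt = clone_and_edit(history, 2, "Actually, optimize for mobile instead.")
```

Since the model itself is stateless between calls, supporting cloning would mostly be a UI decision about keeping the old tail around as a sibling branch rather than discarding it.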
In Gemini (gemini.google.com) I can only edit the last one, it appears.
One of my favorite things with Claude Code is when I get a "paying rent" moment where it clearly does something that returns more than my $200 a month in value, instantly. Yesterday, it one-shot a tile culling algorithm in a compute shader, integrating it into my render graph lighting system and providing read-back debugging tools. It's fast as hell and worked the first time. I was floored.
These kinds of moments make up for the Penelope Problem, where the AI gets a decent way into some good work, suddenly asserts "I'm thinking about this all wrong!" and proceeds to delete its work; or the Hydra Problem, where in attempting to slay one bug it gives rise to two more; or Code Amnesia, when the reasons the agent made a good decision in the first place leave the context window... Or...
I’m just a small time hobbyist indie game dev but having AI write my shaders has been such a game changer.
Podcast episode for this post: https://open.substack.com/pub/dwatvpodcast/p/ai-127-continued-claude-code-complications?r=67y1h&utm_campaign=post&utm_medium=web&showWelcomeOnShare=true
Not sure I truly agree there has been little progress in 2025. The big US players have each put out one major release, with another likely in August or September for each of them. But fair enough, if those do not move the yardstick it will be a strong sign.
The more interesting part is really that the Chinese players are firing on all cylinders. They may not have entirely caught up with the frontier, but the gap is for sure small now (Qwen, GLM, and DeepSeek 0528 are all within 4 points of o3 and Gemini on Artificial Analysis).
Possibly yet more interesting are the medium-sized models - GLM 4.5 Air is seriously impressive and, as a 109B-total/12B-active MoE, can be run on $2,000 off-the-shelf hardware (either a MacBook or an AMD 395), whereas Qwen 30B-A3B can run on any halfway current 32GB RAM laptop at acceptable to very decent speed.
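For the curious, here is the back-of-the-envelope arithmetic behind those hardware claims, assuming roughly 4-bit quantization plus some runtime overhead (both numbers are my assumptions, not measurements):

```python
# Rough memory estimate for running a quantized model locally.
# Assumes 4-bit weights plus ~10% overhead for KV cache and runtime buffers;
# both figures are assumptions for illustration, not benchmarks.

def weight_memory_gb(total_params_billions, bits_per_weight=4, overhead=1.10):
    return total_params_billions * bits_per_weight / 8 * overhead

print(weight_memory_gb(109))  # GLM 4.5 Air: ~60 GB -> wants 64-128 GB of unified memory
print(weight_memory_gb(30))   # Qwen 30B-A3B: ~16.5 GB -> fits a 32 GB RAM laptop
```

Decode speed is then governed mostly by the active parameters (12B and 3B respectively), which is why these MoE models can feel fast despite their total size.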
One way in which the "liking owls" result is bad is that it implies two instances of the same model can communicate with each other steganographically.
Suppose you have a conversation with a model and publish some of the result to the Internet. Even if you check that what you’re releasing is something you’re happy to release, you don’t know what else from the conversation might be leaked by steganography.
I had always assumed this might be possible, but we now have confirmation the steganographic channel exists and is usable.
It turns out that almost no-one runs their AI in a sandbox.
But suppose you were doing that. A theoretical attack on the AI containment problem is that a superintelligent AI might use steganography to leak information out of the sandbox past the warden.
And so, the question arises: how realistic an attack is that?
The bad news: the way that LLMs work means you get steganography for free, and even models that aren’t yet superintelligent can carry out the attack.
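To make the "for free" part concrete, here is a toy of the simplest version of the channel: because both copies share the exact same next-token distribution, the sender can hide a bit per token in its choice among equally plausible continuations, and the receiver recovers the bits by re-running the same model. (The stand-in model and names below are mine, and this is the classic explicit-encoding scheme rather than the subtler mechanism in the owls result.)

```python
# Toy covert channel between two copies of the "same model".
# The shared model here is a stand-in: a deterministic function that, given a
# prefix, returns two equally plausible next words. A real LLM's shared
# next-token distribution plays the same role.

def shared_model(prefix):
    """Both parties run this identical 'model': two plausible continuations per step."""
    options = {
        0: ["The", "A"],
        1: ["results", "findings"],
        2: ["look", "seem"],
        3: ["promising", "encouraging"],
    }
    return options[len(prefix)]

def encode(bits):
    """Sender: pick continuation 0 or 1 at each step according to the hidden bit."""
    text = []
    for bit in bits:
        text.append(shared_model(text)[bit])
    return text

def decode(text):
    """Receiver: re-run the same model and read off which option was chosen."""
    bits, prefix = [], []
    for word in text:
        bits.append(shared_model(prefix).index(word))
        prefix.append(word)
    return bits

hidden = [1, 0, 1, 1]
cover_text = encode(hidden)          # e.g. ['A', 'results', 'seem', 'encouraging']
assert decode(cover_text) == hidden  # innocuous-looking text, bits recovered exactly
```

The cover text stays innocuous-looking because every choice is one the model would plausibly have made anyway, so a warden who does not hold the same weights has nothing to compare against.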
copyediting: "humans can of writing insecure terrible code on their own" should be "humans can of course write insecure terrible code on their own"
Regarding the poll, AI has made me very suspicious of anything not written by someone I was already reading pre-AI. This makes it hard to get into new authors who weren't writing pre-AI, whose style I can't tie back to anything from before.
Zvi, given that you think we're careening towards disaster, I'm always surprised when you don't support arbitrary market frictions such as breaking up Nvidia or other tech giants. Of course you don't agree with that as economic policy, but it would nevertheless slow the timeline which seems like the much more important thing? The same argument applies to energy production, shouldn't we be glad that we're not building enough of it to go full speed ahead? We'll get there eventually, at a slower pace, and that's a good thing right?
Instead of AI labs monetizing via advertising, maybe they should go the other way around. Require free tier users to generate some sort of new human content or review some RLHF results or something, CAPTCHA-style, to provide value and power their use cases.
Ads are the only way to give free users your best stuff
The wise person sees us handing over our decision-making to AIs when the AIs are visibly incompetent... and realises this is good, because the way that humans learn is by having bad experiences.
Small bad experiences now are likely to prevent, or reduce the number of, large bad experiences later.
One thing I've noticed re: AIStack virality is that...going outside my usual four ratsphere subscription blags, where a Really Good Post might net 300+ likes, one regularly encounters these fairly-random nobody writers with these long-ass tendentious yawn-posts with like 1000+ likes or whatever. And I'm like, wtf, how, why? These aren't even necessarily orange checkmark authors who actually have sizeable subscriber bases, just writers who happen to "go viral" with zeitgeist-pilled posts. Except there's no there there, and I just can't believe a majority of people clicking the heart button actually read the whole thing. (And/or they have shitty taste.) It's kind of disturbing, honestly. Undermines the entire business case for Substack, which was to get away from low-denominator writing banking on appeals to authority that dominates so much of the MSM. But I guess that's what a lot of people actually want, when the revealed preference chips are down? Victims of our own success...
(And likewise, "bro ChatGPT wrote this" is quickly turning into the new writerly insult du jour, which is kind of amusing when Real Human Writing is juxtaposed with the gigantic like counts on slop. RLHF sycophancy strikes again!)
I don't think sycophancy is necessary here. It has become clear how little effort most people put into developing writing skills, because now the comparison point is Llama 3.1 at a minimum, rather than a horde of monkeys with typewriters. Slop might be overwrought and predictable and yawn-inducing but it is now reliably better written than the mean Substack post, so it's likely to take over for those uses.
I asked Gemini and Opus: "What is an unspoken rule that everyone in AI safety knows?"
Gemini:
The unspoken rule is: Assume you only get one chance.
Opus:
The big one: everyone knows AGI is probably coming much sooner than their public timelines suggest, but calibrated doom-saying became professionally costly after the field went mainstream.
> My guess is that over a 5 year time horizon, in the worlds in which we do not see AGI or other dramatic AI progress over that time, this is mostly accurate. Those using AI now will mostly net benefit, those refusing to use AI now will mostly be harmed
I don’t think that’s necessarily the causal story. E.g., maybe people who write about literature are better than AIs for now, and AI is not yet helpful to them; but they might expect that in five years, AI will be able to do all of what they’re doing.
What is AGI? How would you know when it arrived?
As far as I can tell, there was one good theoretic test for AGI - The Turing Test.
AIs are long past the Turing Test.
This means we have AGIs right now.
Proposed red line: LLMs asking the questions and humans answering them.