20 Comments
Alex

I like Opus 4.5 a lot, but I do find that it glazes me quite a bit. It's less over-the-top about it than 4o, but it is really eager to find good justifications for even my deliberately dumb ideas. If I want the type of pushback that a smart friend would give, then I have to explicitly prompt for it. There is still substantial work to be done here.

Re:Courses

You were a professional sports bettor?! Tell us more.

Re:Courses

Zvi, I wanted to be productive today, and now it's been 3 hours and I've read about everything from Michael Vassar to DAOs to Secret Society Dynamics (https://traditionsofconflict.com/blog/2019/2/23/notes-on-nggwal).

IMO, one of your best posts yet

Graham Blake

Re: Footprints in the Sand, I had a similar reaction to the thing, and had to do some translations from hype slop to a more grounded reality for people who went !!! upon seeing it. One of the several things I found odd about the piece was that it used an analogy so similar to Jack Clark's remarks in Berkeley - "We turn the light on, we find ourselves gazing upon true creatures" - to argue that the labs are not speaking plainly about these things.

On that piece's strange comments about continual learning, even if it were solved, or the labs had a clear path forward on it, I think it misleads about, or misunderstands, why the labs would keep it out of deployment. My sense is that the labs would gleefully drop a continually learning model and talk themselves into its safety from an existential-risk perspective, but they would be FAR more concerned about the liability exposure of it leaking the private data it is continually learning from along the way. However good automated privacy scrubbing and data anonymization may be getting these days, I doubt anyone is confident enough right now that a continually learning model will not almost immediately be very publicly sharing sensitive user and corporate data, in a way that exposes the labs to enormous public backlash and legal liability. Pliny would know everything about everyone in like 15 minutes. Obviously there are very many more Hard Problems that present themselves, but that's probably the one that gives Legal aneurysms.
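For concreteness, here is a toy sketch of what an "automated privacy scrubbing" pass looks like in miniature - the patterns, names, and placeholders below are illustrative assumptions, not any lab's actual pipeline, and the brittleness of exactly this kind of filter is why the confidence isn't there:

```python
import re

# Toy pre-ingestion scrubber: replace obvious PII with typed placeholders
# before text is folded into a continually learning model. Real systems
# use NER models and review passes, not just regexes - which is the point:
# even those miss names, addresses, and context.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b(?:\+?1[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b"),
}

def scrub(text: str) -> str:
    """Redact the PII a regex can see; everything subtler slips through."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(scrub("Call Dr. Smith at 555-867-5309 or jane.doe@example.com"))
# -> Call Dr. Smith at [PHONE] or [EMAIL]   (but "Dr. Smith" sails through)
```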

Lex Spoon's avatar

The reason the US can't just roll its debt over forever is that the budget is in bad shape and getting worse. Interest payments on the debt are now larger than the defense budget.

https://www.forbes.com/sites/theapothecary/2024/02/07/cbo-federal-interest-payments-now-exceed-defense-spending/
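As a rough sanity check on the claim, with ballpark FY2024 figures (rounded assumptions for illustration; the linked piece has the exact CBO numbers):

```python
# Ballpark FY2024 figures - rounded assumptions, not exact CBO values.
debt_held_by_public = 27.0e12  # roughly $27 trillion
avg_interest_rate = 0.032      # roughly 3.2% average rate paid on that debt
defense_budget = 850e9         # roughly $850 billion

net_interest = debt_held_by_public * avg_interest_rate
print(f"Net interest: ~${net_interest / 1e9:.0f}B")                  # ~$864B
print(f"Interest exceeds defense: {net_interest > defense_budget}")  # True
```

The exact numbers move with rates and rollover, but the crossover itself is the article's point.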

Robert M.

If Trump gets his increase of the defense budget to $1.5T, the interest payments will no longer be larger than the defense budget.

Jeffrey Soreff

"Where I think it is outright wrong is claiming that ‘we have solved’ continual learning. If this is true it would be news to me."

Happy New Year! I would appreciate any suggestions on how to follow progress on continual/incremental learning. I looked at the ContinualAI website, but it seems like:

- It isn't getting much in the way of updates. E.g. the latest entry in their "Latest News" is May 8, 2025

- It looks like it mostly has an academic and EU focus, whereas the major labs are Google DeepMind, OpenAI, and Anthropic (and maybe xAI). I would want to watch the major labs' announcements and their progress on continual/incremental learning benchmarks (to the extent these help...); a sketch of one way to automate that watching is below.
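One low-effort option is to poll the labs' blog feeds and filter for the topic. A minimal Python sketch, assuming the `feedparser` package; the feed URLs are my guesses and should be verified before relying on them:

```python
import feedparser  # pip install feedparser

# Assumed feed URLs - these are guesses, not confirmed endpoints.
FEEDS = [
    "https://openai.com/news/rss.xml",
    "https://deepmind.google/blog/rss.xml",
    "https://www.anthropic.com/news/rss.xml",
]
KEYWORDS = ("continual learning", "continuous learning", "incremental learning")

for url in FEEDS:
    for entry in feedparser.parse(url).entries:
        text = (entry.get("title", "") + " " + entry.get("summary", "")).lower()
        if any(keyword in text for keyword in KEYWORDS):
            print(f"{entry.get('title')} -> {entry.get('link')}")
```

Run on a cron job, this at least catches anything the labs announce under those phrases; benchmark trackers would still need checking by hand.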

Many thanks for any suggestions!

thestage

actually the talk of the town is rampant fascism, which the tech industry is fully integrated with.

avalancheGenesis

A short AI roundup to start the year off, will miracles never cease. Do hope you'll reserve the speed premium for genuinely-timely coverage (like pending legislation or Battle of the Board), but overall, yes - I think there is not so much value lost on ephemera like Model X.Y Increment Model Card and the like. Certainly the non-barking dogs like the last couple of DankSeeks could be relegated to a roundup footnote. Whereas quality evergreen, or even periodic deciduous, posts (Big Nonprofits Post!) have significant value-above-replacement. Everyday life would be different without frameworks like Levels of Friction and Simulacra; I can't remember the last time I remembered pan-flashes like, idk, Devin or Serpens Vision Pro, except as jokes. (New children's television series: The Magic AI Schoolbus, featuring star teacher Ms. Fizzle...)

Wonder if there's a correlation between being a top-performing generalist and grokking the idea of "if there were a lot more of me, things would be radically different around here". There's a certain human impulse towards...assuming conservation of ability, I guess? That Jacks of all trades must necessarily be masters of none, that exemplars in one area must have compensating weaknesses in others. Except, no, that's not how it works...sometimes there really are humans that are simply better along ~every relevant job attribute, even factoring in jaggedness. Once you can start scaling such superlative employees...

Coagulopath

> This viral Twitter article, Footprints in the Sand

I saw that, and it produced the hardest "happy for you, I ain't reading that" reaction of my life. If you ever find yourself reading "There's a particular kind of dread that comes from..." don't go any further: it's slop.

(I'm guessing it ends with the shock twist that the article itself was AI-written "...and you didn't even notice." Believe me, I very much did.)

I mentally associate @iruletheworldmo with Jimmy Apples and so forth...a network of highly unreliable "insiders" with records of falsified predictions. (The same guys behind the "AGI achieved internally" meme 2 years ago, basically). I don't remember any specific reasons to mistrust iruletheworldmo, though I do recall hearing him claim that "sus-column-r" on LMArena was AGI + OpenAI's next-gen model + capable of solving any problem he threw at it. It later turned out to be Grok 2. So, yeah.

avalancheGenesis

It's like a really shitty version of IABIED, with boring writing, poor analogies, hostile framing (AI mysticism comes across as honest wonder from Janus and co, here it's paint-by-numbers and yet somehow condescending), and a distinct lack of links to back up the vagueposting. Any regular Zvi reader would catch the general references to capabilities...but I don't have an eidetic memory, so it's hard to say if the exact way each paper is described is fully accurate. One suspects not. Spoilers: there is no shock twist, AI-written or otherwise. It's sort of irrelevant, humans are still perfectly capable of writing viral slop too. Not familiar with this particular twitter user, but he reads a lot like Ted Gioia to me. Classic American paranoid style, you love to see it...

avalancheGenesis

Did...the people on that subreddit not actually watch Stranger Danger Things? *I* recognized that shot, despite not really liking the show and avoiding the later seasons, precisely because it struck me at the time, years ago, as "that's a weird way to hold a gun" + "they're really trying too hard to invert tropes by giving Steve the silly melee weapon and Nancy the actual firepower". Plus the handwaving away of "oh, here's an ordinary teenage girl with no combat or firearms expertise landing all six right in the 10 ring during a highly stressful encounter under poor visibility", or however it went down; by that point I had no more disbelief to suspend. AI slop is demand-driven partially because people are (often willfully) ignorant of the human art canon that preceded it...Sad!

[insert here] delenda est

I think I've worked out Tyler Cowen. I think he has consciously decided that what he can best do here is marginally influence policy towards what he considers better policies.

I suspect that some of his posts would be different, in tone and substance, were that not his overriding priority.

Ofc I may be, and perhaps most likely am, wrong!

vectro

> You can securely connect medical records and wellness apps so responses are grounded in your own health information.

The use of "securely" here is a semantic stop sign, and usually in practice means the opposite of what the author is trying to say.

vectro

I should clarify that I haven't read anything about this besides what is in the present post.

However, I would bet that there's nothing secure about it: you are exposing your medical information to another party, who has the full ability to see, review, and use that data, and from whom it could be leaked in a cyberattack. In other words, this service is actually putting your data at greater risk than it would be otherwise, which is the opposite of "security"; putting the word "securely" in there is intended to make the reader not think about any of these considerations and instead just nod and say, oh, great, I don't have to worry about the security of my information.

Drew

The US has very strong regulations about keeping health information secure. We have regulations like these so that everyone can trust that if a US company is ingesting your health information, it is following VERY strict procedures about it.

Companies have a very strong incentive not to mess this up, or they'll get sued for millions.

No one is forced to share their own info, but those who do trust the regulatory regime and believe the benefits outweigh the risks.

vectro

I'm not sure what you are arguing. ChatGPT Health is not regulated by HIPAA, if that's your angle.

My observation that the word "securely" is a semantic stop sign is exactly this: the language here is trying to discourage you from actually thinking about the consequences of this sort of information sharing.

Jonathan Woodward

I don't know that taxes could be lowered in the event of AGI, either - a lot of current taxes are levied on individual humans, who may no longer be the main producers of value, and we also might want to be engaging in a lot more redistribution than we are right now.