> r1 is good if you need it where it got better fine tuning like creative writing
R1 got better fine tuning for writing style only in the sense that it got *less* RLHF. It’s more base-model-like, less polished, less “refined”.
While Anthropic talks about racing to AGI to get a persistent advantage, it’s possible that their (and other American labs’) heavy emphasis on safety makes them perpetually behind riskier labs when it comes to style. I know there is a niche that prefers Claude’s style, but most people find it, along with ChatGPT’s and Gemini’s, stilted.
Notice how o1-mini and o3-mini are stylistic regressions from GPT-4, which itself is a regression from davinci. Notice how in the demo of OpenAI’s latest release, Deep Research, they don’t even bother reading an excerpt of the 10-page report that it generates. Certainly, nobody wants to read those reports except to extract their actionable utility.
As of February 2025, it is still a deep mystery how to improve the writing style of AI models. The persona sculptors — e.g., Amanda Askell, Roon — are doing their best, but they are on the wrong side of the Bitter Lesson.
I wonder how much of Claude's "quality without a name" is really "compliments you constantly, which you find consciously annoying but less consciously attractive".
Iesus, do you ever rest?
Podcast episode for this post:
https://open.substack.com/pub/dwatvpodcast/p/o3-mini-early-days
Such a wonderful and thorough piece, thank you!
I mean, I find it annoying, and I assume that for me it subtracts from that quality on net; perhaps others like it.
I always start with "be less obsequious" and it obeys, and I still like it a lot.
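For anyone who wants to bake that "be less obsequious" instruction in rather than retyping it each chat, a minimal sketch of assembling such a request as a plain dict (the model id and exact system-prompt wording here are illustrative assumptions, not taken from the thread):

```python
# Sketch: steering a chat model's tone via a standing system prompt.
# Model id and prompt text are illustrative, not prescriptive.

def build_request(user_message: str) -> dict:
    """Assemble a chat request whose system prompt curbs flattery."""
    return {
        "model": "claude-3-5-sonnet-latest",  # hypothetical model id
        "max_tokens": 1024,
        "system": "Be less obsequious. Skip compliments and answer directly.",
        "messages": [{"role": "user", "content": user_message}],
    }

request = build_request("Critique this paragraph honestly.")
print(request["system"])
```

Keeping the instruction in the system slot rather than the first user turn means it persists across the whole conversation instead of fading as the context grows.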
Many Thanks for all of the detailed reporting and analysis! And yes, also Many Thanks for the pointer to my benchmark-ette. :-)
Tried o3-mini in Cursor and it just stopped generating output without warning multiple times. I'm back to Claude for a few days at least.