Jesus, do you ever rest?
Podcast episode for this post:
https://open.substack.com/pub/dwatvpodcast/p/o3-mini-early-days
Such a wonderful and thorough piece, thank you!
> r1 is good if you need it where it got better fine tuning like creative writing
R1 got better fine-tuning for writing style only in the sense that it got *less* RLHF. It’s more base-model-like, less polished, less “refined”.
While Anthropic talks about racing to AGI to get a persistent advantage, it’s possible that their (and other American labs) heavy emphasis on safety makes them perpetually behind riskier labs when it comes to style. I know there is a niche that prefers Claude’s style, but most people find it, along with ChatGPT and Gemini, to be stilted.
Notice how o1-mini and o3-mini are stylistic regressions from GPT-4, which itself is a regression from DaVinci. Notice how in the demo of OpenAI’s latest release, Deep Research, they don’t even bother reading an excerpt of the 10-page report that it generates. Certainly, nobody wants to read those reports except to extract their actionable utility.
As of February 2025, it is still a deep mystery how to improve the writing style of AI models. The persona sculptors — e.g. Amanda Askell, Roon — are doing their best, but they are on the wrong side of the Bitter Lesson.
I wonder how much of Claude's "quality without a name" is really "compliments you constantly, which you find consciously annoying but less consciously attractive".
I mean, I find it annoying, and for me I assume it subtracts from that quality on net; perhaps others like it.
I always start with "be less obsequious" and it obeys, and I still like it a lot.
Many Thanks for all of the detailed reporting and analysis! And yes, also Many Thanks for the pointer to my benchmark-ette. :-)
Tried o3-mini in Cursor and it just stopped generating output without warning multiple times. I'm back to Claude for a few days, at least.
> Altman: "the most important impact [of AGI], in my opinion, will be accelerating the rate of scientific discovery"
It's interesting that Altman is currently trying to raise $500B in data center investment - presumably with the promise of a return for the investors - and simultaneously saying that the most important result of this investment will be scientific progress. Of course, scientific basic research is famously unprofitable for the researchers - once the idea is out there anyone can verify and use it, it typically cannot be patented, and using it in secret is hard to pull off. So one has to imagine that the focus on scientific research is a bit of a diversion.
> I think this answer to ‘What’s the definition of a dinatural transformation?’ is a pass? I don’t otherwise know what a dinatural transformation is.
It's almost correct, but it got c and c' mixed up in a few places. The answer it gave doesn't actually typecheck, so to speak.
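For readers who want to check this themselves, here is a sketch of the standard definition (as found in the usual categorical references; verify against your preferred source). Given functors $F, G : C^{\mathrm{op}} \times C \to D$, a dinatural transformation $\alpha : F \Rightarrow G$ assigns to each object $c$ a morphism $\alpha_c : F(c,c) \to G(c,c)$ such that for every morphism $f : c \to c'$ the hexagon commutes:

```latex
% Dinaturality hexagon: for every f : c -> c',
G(1_c, f) \circ \alpha_c \circ F(f, 1_c)
  \;=\;
G(f, 1_{c'}) \circ \alpha_{c'} \circ F(1_{c'}, f)
% Both composites are morphisms F(c', c) -> G(c, c'),
% which is exactly the "typechecking" constraint:
% swapping c and c' in the wrong slot breaks it.
```

Tracing the domains and codomains on each side is a quick way to spot where a generated answer has swapped c and c'.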