Discussion about this post

AI doom or what?

Man, I'd love to see what Dwarkesh would ask you, how you'd answer, and how he'd respond; that would be a great podcast; can we make that happen, please?

Seta Sojiro

I agree it's confusing that Ilya implies the age of research ever paused. I'm pretty sure there has been more non-LLM AI research this year than in any other (and the same was true at the end of 2024). DeepMind and many other labs are constantly churning out new architectures*, and occasionally one surfaces to public awareness as a possible replacement for the transformer. But so far none of them have performed well at scale.

Maybe Ilya and his team have such good research taste that they can see what all of the other world class researchers missed. We'll see.

Relatedly, it could be argued that Anthropic has benefited from mostly focusing on scale, training methods, and data quality rather than research into alternative architectures. They've spent far fewer resources than OpenAI or DeepMind, with a smaller team, yet their models are on par.

*For example: Titans, nested learning, and large diffusion models.

