© 2025 Zvi Mowshowitz
True. At this point most of the unsaturated benchmarks in math and coding are what I consider "light" ASI anyway, by which I mean problems that fewer than 0.1% of humans can solve. ARC is a notable exception. I think it's time to get more ambitious about the problems that we're framing, both because we need more unsaturated benchmarks and because we need to stop searching under the street lamp. We currently don't know how to create an objective function for training on, say, solving aging, but we're not going to figure it out until we actually work on it.