Discussion about this post

Cara Tall

the hyperstition element of it all just freaks me out so much... i suppose being directly integrated into twitter is what accelerates this feedback loop compared to other models? it seems like the level of influence and coordination it would take to steer this model in whatever way you want is shockingly low... i hope we don't see that kind of thing any time soon

Boogaloo

Wish you'd expand more on the abilities. Why does it seem to crush all benchmarks (even private ones) but not perform much better subjectively? There are few benchmarks Grok does not seem to top, and I've seen most of them now.

It tends not to crush the coding benchmarks, but it doesn't seem optimized for that, since they are releasing a separate coding model. It seems to do well everywhere else.

