Discussion about this post

blf

"If Dean is correct here, then the carbon cost of training is trivial." Note that Table 1 in that paper concerned models using 1e19 flops for training. If I understand correctly that's 1e6 times less than GPT-4, so that GPT-4 would have costed 1e5 tons of CO2 equivalents on the most inefficient setup listed in their Table 1 and 2e3 tons if efficient. No idea about inference costs. No idea about lifetime CO2e cost of the GPUs/TPUs.

Oleg Eterevsky

Re Anthropic hiring fewer engineers: there are benefits to developing a project with fewer, higher-performing engineers. Throwing as many engineers as you can at a project is not always the best thing you can do.

