Discussion about this post

blf:

"If Dean is correct here, then the carbon cost of training is trivial." Note that Table 1 in that paper concerned models using 1e19 flops for training. If I understand correctly that's 1e6 times less than GPT-4, so that GPT-4 would have costed 1e5 tons of CO2 equivalents on the most inefficient setup listed in their Table 1 and 2e3 tons if efficient. No idea about inference costs. No idea about lifetime CO2e cost of the GPUs/TPUs.

Oleg Eterevsky:

Re Anthropic hiring fewer engineers: there are benefits to developing a project with a smaller number of higher-performing engineers. Throwing as many engineers as you can at a project is not always the best thing you can do.

