Discussion about this post

Alexander Barry

While I think it is plausible the results would have been different if the devs had had, say, 100 more hours of experience with Cursor, it is also worth noting that:

- 14/16 of the devs rated themselves as 'average' or better Cursor users at the end of the study

- The METR staff working on the project thought the devs were qualitatively reasonable Cursor users (based on screen recordings etc.)

So I think it is unlikely the devs were using Cursor in an unusually unskilled way.

The forecasters were told that only 25% of the devs had prior Cursor experience (the actual number ended up being 44%) and still predicted a substantial speedup, so if there is a steep Cursor learning curve here, that seems to be something people didn't expect.

With all that being said, the skill ceiling for using AI tools is clearly at least *not being slowed down* (as they could simply not use the AI tools), so it would be reasonable to expect that some level of experience would eventually lead to that result.

(I consulted with METR on the stats in the paper, so I am quite familiar with it.)

Dan

> AI coding is at its best when it is helping you deal with the unfamiliar, compensate for lack of skill, and when it can be given free reign or you can see what it can do and adapt the task to the tool.

It's funny to read this as a senior/staff-level engineer because I find the complete opposite to be true. AI coding is most useful for me when I'm very familiar with the work to be done, I'm already capable of writing the code myself, and I'm basically using compute to go much, much faster than if I wrote all of the code by hand. I do this regularly now, and am indeed much more productive in this scenario!

Meanwhile, I have the _worst_ time with it when I'm working with a language/library/framework I don't understand or the prompts are too open-ended. Maybe initially it's fast and I speed through the discovery work, but every single time I do this, weird bugs start popping up that the AI struggles to fix, and then *I* struggle to fix them because I don't have a mental model of how the code works. Often I peek under the hood and discover all kinds of weird LLM-written abstractions, and it takes a while for me to get my bearings. Surely this isn't just a prompting skill issue?
