Why is it so hard to measure productivity?




The weakest part of remote / part-time work research is how we assess productivity. In most research in this area, the job being measured barely qualifies as a “salaried email job”. It’s usually stuff like tech support, sales development, etc.

Why is it 2023 and we still can’t reliably measure software engineering productivity in a useful way? We know what we want our measure to proxy for - an overall outcome or goal, say a feature that is being built.

However, every potential metric we devise appears woefully inadequate in assessing this holistic outcome. Whether it's pull requests, lines of code, user stories, story points, or ship dates, it seems that every metric can be manipulated or gamed. Ship dates may be advanced, but quality suffers; story points morph in size depending on the project, and lines of code can be bulked up with a test suite. Even pull requests can be sliced and diced to skew the numbers. It's a frustrating conundrum.

For fuzzier fields, like product management, marketing, or design, it becomes even more hand-wavy. Some of these roles depend on getting other roles to execute better, but you can’t rewind history and try things with a different PM to see if things would have been better. Same with design.

Attributing sales to marketing is a classic problem in software that no one really does particularly well beyond hand-wavy correlations (this many hits on this webinar translates to this many SDR calls, which translates to this much ARR).
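To make that chain concrete, here is a minimal sketch of the kind of funnel arithmetic people end up doing. The function name, the conversion rates, and the ARR-per-deal figure are all made up for illustration; nothing here comes from real data.

```python
def naive_funnel_arr(webinar_hits, hit_to_call_rate, call_to_deal_rate, arr_per_deal):
    """Multiply a chain of assumed conversion rates to 'attribute' ARR to a webinar.

    Every rate here is a hypothetical input. The multiplication treats
    correlations as if they were causal, which is the hand-wavy part.
    """
    sdr_calls = webinar_hits * hit_to_call_rate
    deals = sdr_calls * call_to_deal_rate
    attributed_arr = deals * arr_per_deal
    return sdr_calls, deals, attributed_arr


# Hypothetical inputs: 1,000 hits, 5% book a call, 10% of calls close,
# and each deal is worth 20,000 in ARR.
calls, deals, arr = naive_funnel_arr(1000, 0.05, 0.10, 20_000)
print(f"calls={calls:.0f} deals={deals:.0f} attributed ARR={arr:.0f}")
```

The number that comes out looks precise, but it only restates the rates that went in. It can't tell you whether those deals would have closed without the webinar.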

It’s as if we were trying to compare baseball players, but didn’t have any stats. Where is the Moneyball for employees?

It seems like this universal weakness is sort of glossed over but I think it’s actually a profound problem. This isn’t just a remote worker question - how do we know that things are getting done?

For example, how do you think about an employee who seems to be achieving stellar outcomes but doesn’t seem to work that hard? There seem to be two options:

That employee is “subpar”. They should be working 40 hours a week, but they’re working less and that is, in the generous case, a misuse of time, in the non-generous case, an abuse. They are getting paid for time that they are spending not working.

That employee is “exceptional”. How many hours they work is immaterial. The whole point of being a salaried worker is that you don’t measure your hours day over day, and if you have an idea in the shower or while your kid is watching Moana then you can put that idea to use for your company. It’s perfectly reasonable and in some ways expected that several hours during the day are “dead time”.

Do you just give them more work to do? There is always an infinite amount of work - no knowledge work is ever fully done, there are always more features to add, there’s always another campaign to organize. But at least on a relative basis it seems that knowledge workers fall on a wide distribution of outcomes, and the people that achieve the most are definitely not the ones that work the hardest as measured by hours. If anything these numbers are sometimes correlated the other way - the people working the hardest are the ones working super inefficiently or who are flailing in some way.

If you give the most productive employee more work, presumably they’d be justified in asking for higher compensation? After all, they are driving greater outcomes for you. Would you be comfortable paying it?

For example, would you pay a 3x more productive designer 3x the fully loaded cost of the average designer? If 10x engineers truly exist, why do pay scales within a company not cover a 10x spectrum?

Perhaps this explains why the interviewing process at most companies is so ridiculous. Most software companies today run engineers through a bunch of hoops where they display coding skills which are not useful for the job at hand (e.g. recursion is extremely rare in shipped code but extremely common in interviews), plus assess personality using techniques that would make most fortune tellers wince.

Interview performance (at least for those hired) definitely does not correlate with on-the-job performance. But there isn’t a better interviewing process that we know of. So we continue using the least worst option.

My suspicion is that, like in other fields where performance matters and is financially rewarded, there will be a surge in our capacity to measure and evaluate real-life work performance. Compensation will become more closely tied to tangible accomplishments rather than arbitrary levels or seniority. Interviews will shift toward real-world scenarios, perhaps within the customer's actual codebase, addressing a genuine problem the customer faces, possibly even with the interviewee being compensated for their time.

I know it’s been a while since my last post! Looking forward to writing more of these in the near future.



Source: Fractional.work
