Metrics, Incentives, and the Seduction of Clarity
In my work as a podcast producer, I get a lot of questions about metrics.
It just so happened that a couple of weeks ago, a fairly green podcaster brought me a pile of beautiful numbers arranged carefully in rows and columns. Their question was essentially, “What does this mean?”
The answer to this particular podcaster's question was "Not much." The data they had didn’t translate into much useful information for a variety of reasons: insufficient time, lack of scale, and very little to compare it to. They were very gracious about accepting my complete non-answer. But I know it's frustrating to have this data seemingly at your fingertips without having any story to go with it.
Podcasters want to know how they should feel about the number of downloads they're receiving or why their monthly download number was lower one month than another. And even our most analytical podcast hosts can have a hard time separating desired results (like new clients) from inconsequential metrics (like downloads).
Of course, this isn't unique to podcasters.
What job or activity isn't dominated by a dashboard of numbers that go up or down?
It's the first thing I see when I go to compose a new post on Substack. LinkedIn sends me a notification to let me know how many people saw my latest posts. Before I ghosted Instagram, I used to obsessively check the number of shares a post received in the first few days it was live.
My watch keeps track of the number of miles I've walked or run, the speed at which I've walked and run them, and the estimated VO2 max for each activity. The New York Times tells me how fast I did today's crossword and whether or not I'm a "genius" depending on how many Spelling Bee words I've found.
I won't waste (much) more of your time describing the ubiquity of metrics in our daily lives. I also don't want to spend (much) time talking about the consequences of obsessing over metrics. I think those subjects have been covered quite a bit. But coverage of the potentially harmful effects of the data-fication of our lives often misses a critical piece of the puzzle:
Metrics become incentives.
And those incentives warp our choices and behavior.
The reason "what gets measured gets managed" is that moving the number in the desired direction becomes a reward in itself. We get that little dopamine rush that comes from seeing the number tick up or down. And in the process, we tacitly endorse unhelpful and even harmful action for the sole purpose of seeing the number move.
Here's an example.
Very early in my tenure as a podcaster, I learned that you could "double" the downloads your show received simply by releasing twice as many episodes. And you didn't even need to make more episodes! You could just take your back catalog and release one old episode along with your new one each week!
The reason this works (or was at least plausible) is that the bulk of a podcast's downloads come from subscribers. And typically, subscribers have episodes downloaded automatically to their devices. So whether they wanted to listen to the rerun episode or not, subscribers would find it on their devices.
A podcast might receive double the downloads—but it didn't actually mean that more people were listening to the show. It just meant that subscribers were leveraged to change a number.
I'm sorry to say that I was taken by this ridiculous scheme. I released a new episode every Tuesday and a rerun episode every Thursday. Initially, yes, my downloads went up—although they never doubled. But it didn't take long before this release schedule was pissing people off.
Now, there's nothing wrong with releasing reruns periodically. I've done it several times this year. But releasing them for the purpose of juicing your download numbers? That's ridiculous. I wish I could have seen that at the time, but I badly wanted to increase my downloads even though I wasn't seeking outside advertising (the main reason you'd care about downloads as a metric in the first place). My motivation was just a downstream effect of why the scheme was invented in the first place.
Someone—I don't remember who—was so interested in boosting their downloads that they spotted this loophole: by leveraging automatic downloads, they could eke out a small advantage and double their numbers. Then they exploited the idea further to garner attention from people who, like me, also wanted to influence that magical number.
I'm sure the initial advantage didn't last long, though. And they were probably off to exploit another loophole for the purpose of juicing their numbers in short order.
This, by the way, is how much of late capitalist business growth happens.
A company identifies a small advantage—often by playing fast and loose with metrics (e.g., page views, labor costs, cost of goods, etc.). Then it jumps on that advantage and rides it for as long as it can—often just until the next earnings call. Then it finds another small advantage, and the process repeats.
This allows executives and shareholders to accumulate wealth while everyone else gets squeezed. The company doesn't work any better. It doesn't create more value outside of financial markets. It's just more successful on paper. The metrics that matter to investors become incentives to executives.
This isn't to say that making choices based on metrics always results in short-sighted behavior. But picking the wrong metric—one that isn't your real goal, or one whose meaning you misunderstand—will result in short-sighted behavior. The problem here is that most metrics, especially the highly visible ones, are the wrong metrics.
And if you operate at a scale that doesn't allow for meaningful data, well, then every quantifiable metric you have is the wrong metric. That is, unless you’re dealing with numbers in the thousands, you probably don’t have statistically significant data to make decisions with.
To use my cringy example from earlier, I allowed the number of downloads the podcast received to stand in for my goal of reaching more people. Building an audience and seeing higher download numbers aren't necessarily correlated—especially if you're looking at, say, monthly downloads rather than downloads per episode over time. I picked the wrong metric, imbued it with the character of an incentive, and then, rationally, chose the wrong tactic to move it.
I got the dopamine hit—but very little else.
Okay, so when—and how—does a metric become an incentive?
Truthfully, these questions have been on my mind for a few months. I listened to several podcasts about Sam Bankman-Fried, the criminally indicted former CEO of the crypto exchange FTX. SBF, as he's known, was also one of effective altruism's biggest supporters. And there's reason to believe that SBF's desire to do good led him to make disastrously bad financial choices.
The legal system will have to decide whether SBF was reckless, a fraudster, or “merely” way too confident in his own abilities. But one interpretation of his behavior is that he intended to make as much money as possible for the purpose of giving it away to projects that fell under the effective altruism banner. When he got way out over his skis with that scheme, everything collapsed. Innocent people lost money. And now that FTX is in bankruptcy, some of the money SBF gave away to researchers and charities may be clawed back to make investors whole.
Effective altruism is a philanthropic philosophy that essentially prescribes giving money to projects that do the most good for the most people. The "most good" and "most people" are functions of metrics. Effective altruists use platforms like GiveWell and Giving What We Can to direct their giving based on the groups’ research into the world’s biggest—and most fixable—problems. Their goal is to maximize the impact of their dollars. There are plenty of big-name supporters of the EA movement—and there are also plenty of detractors and thoughtful criticism.
I won't get into that now (maybe in the future), but the idea that philanthropy can be measured objectively for the purpose of optimization and maximization is salient to our question about incentives.
If you measure your positive impact on the world in terms of the number of mosquito nets distributed in sub-Saharan Africa, you're more likely to funnel as much money toward the purchase of mosquito nets as possible. It feels good to have a direct impact on that data point. But if you measure your positive impact on the world through a set of subjective questions about the effects of your action on the well-being of people throughout the world, well, you're going to have to do a lot of work. What's more, at the end of all that work to figure out your impact, you won't have much confidence in your result.1
Effective altruism is associated with another philanthropic scheme—earning to give. The idea is that it’s preferable to choose careers that maximize earning and avoid jobs that, while beneficial or even “noble,” don’t pay as well. For instance, they might say that it’s better for a lawyer to take a job in corporate law than to become a public defender. Sure, a public defender does work beneficial to the public. But the corporate lawyer might make orders of magnitude more money—which means they can give away more money to charities that benefit more people than a single public defender could ever help.
SBF figured out how to make a ton of money in crypto so that he could give it away. And he figured that he and his company would keep earning money to give away, which led him to rationalize giving away more money than he currently had. This is no Robin Hood story. It's a story of misplaced incentives leading where misplaced incentives lead. SBF ostensibly believed that any negative impact of his earning actions was outweighed by the positive impact of his giving actions.
The clarity effective altruism provided incentivized harmful behavior.
Philosopher C. Thi Nguyen calls the flattening of a complex system of values “value capture.” Value capture occurs when a nuanced understanding of what's important is squeezed into a simple data point.
One example Nguyen cites is GPA—grade point average. GPA is designed to provide an objective measure of a student's performance and offer a way to compare one student's performance with another. Of course, student performance—let alone student learning—is a complex set of values. Different instructors will value performance in different ways—memorization, description, analysis, application, etc. Different students will perform differently depending on the learning context and performance measures. GPA, though, turns all of that complexity into a number between zero and four.
Students may then choose different strategies to boost their GPAs, such as taking easier classes or choosing instructors known to grade generously. Nuanced, complex ways of measuring performance are much harder to manipulate. But they also require many more resources to ascertain, record, and analyze. What's more, simplified metrics like GPA provide what Nguyen calls the "seduction of clarity."
We get the feeling that we know exactly what's going on—even when we know no such thing.
As the world has gotten more and more complex, our opportunities to experience clarity have become more and more rare—that is, if we're being honest about what we understand and what we don't. Instead, we're constantly being seduced by the clarity of data: podcast downloads, inflation rate, page views, GDP, new subscribers, unemployment rate, post shares, housing prices, and so on. Scroll past whatever political or tech scandal is the top headline, and you'll see a story about data.
I talk to vanishingly few people who don't get a little squirrelly about metrics. I regularly hear about projects gone wrong, vacations ruined, and harmful choices made, all in the name of improving some data point. Critic Rebecca Solnit calls our metrics-obsessed milieu the "tyranny of the quantifiable." Others have called it managerialism. Still others might offer a slightly more holistic take and call it "life hacking."
Solnit writes:
The tyranny of the quantifiable is partly the failure of language and discourse to describe more complex, subtle, and fluid phenomena, as well as the failure of those who shape opinions and make decisions to understand and value these slipperier things. It is difficult, sometimes even impossible, to value what cannot be named or described, and so the task of naming and describing is an essential one in any revolt against the status quo of capitalism and consumerism.
It takes intention and sustained effort to avoid the managerialist worldview. "Quantify everything" is a chief tenet of our faith in growth and progress; by that creed, an effect that can't be measured isn't an effect at all. At the end of the day, we're trying to survive a system that grinds us down.
The best way forward may be one in which we refuse any quantification. I know that sounds drastic. And I’m not entirely sure that I mean it. But given our economy and culture, the siren song of data may be too seductive to deny unless we stuff our ears with wax or lash ourselves to the mast.
I have a hunch that we all know how to do good, effective work without relying on metrics to pat us on the back for a job well done.
Writing in The Atlantic, Derek Thompson wrestles with how to do the most good with his money. He writes:
Philosophically, the most difficult task facing GiveWell is putting the vast spectrum of human suffering into numbers. It is, in a way, a math problem, but one laden with value judgments, about which reasonable people can disagree.
For my money (pun intended), it's not a math problem. Nor is it a difficult task. Human suffering—and the many things that can and should be done to address it—simply isn't a quantifiable issue.