Rock is Strong

Good old rock. Nothing beats that.

Feb 14, 2022

Response to (Scott Alexander): Heuristics That Almost Always Work

Everybody wants a rock. It’s easy to see why. If all you want is an almost always right answer, there are places where they almost always work.

The Security Guard
He works in a very boring building. It basically never gets robbed. He sits in his security guard booth doing the crossword. Every so often, there’s a noise, and he checks to see if it’s robbers, or just the wind.
It’s the wind. It is always the wind. It’s never robbers. Nobody wants to rob the Pillow Mart in Topeka, Ohio. If a building on average gets robbed once every decade or two, he might go his entire career without ever encountering a real robber.
At some point, he develops a useful heuristic: if he hears a noise, he might as well ignore it and keep on crossing words: it’s just the wind, bro.
This heuristic is right 99.9% of the time, which is pretty good as heuristics go. It saves him a lot of trouble.
The only problem is: he now provides literally no value. He’s excluded by fiat the possibility of ever being useful in any way. He could be losslessly replaced by a rock with the words “THERE ARE NO ROBBERS” on it.

That last line is making a bunch of hidden assumptions, in addition to the non-hidden assumption that there are (almost) never any robbers. All of them boil down to ‘the purpose of the security guard is to know whether there is a robbery and then act appropriately depending on the answer’ where the act if a robbery is detected could be ‘call 911’ and/or something more active.

That is one good reason to hire a security guard. Yet something tells me that wasn’t the best description of why this man is doing crossword puzzles in the Pillow Mart in Topeka, Ohio.

My guess is that the security guard’s primary purpose is to make sure everyone knows that there exists a security guard. This has many nice properties, such as these Five Good Reasons.

Potential robbers know there is a security guard.
Employees know there is a security guard.
Insurance company knows there is a security guard.
In case of robbery you can say you had a security guard.
Anyone who tries to walk in faces mild social awkwardness.

When you have a security guard you get to tell the story that you have a security guard. Without one, you can’t tell that story. Dead rocks tell no tales, and when they try (such as writing “THIS IS A SECURITY GUARD” with or without adding “ALSO THERE ARE NO ROBBERS”) it is not very effective. Everyone knows it is a rock.

Dress up the rock to look like a person, put a security guard uniform on it, put a badge on it that says “THIS IS A SECURITY GUARD” without being quite that conspicuously trying too hard and prop up a crossword puzzle, plug in a Google Home you’ve programmed to respond to any voice activation with “THERE ARE NO ROBBERS” and now maybe you’re starting to get somewhere.

Some security guard jobs are bullshit jobs, while others are not. It is always important to know to what extent one has a bullshit job, especially when deciding how to do it. If your job is a bullshit job, your job is to shovel bullshit. If your job is something else, your job is to do something else.

If you have a bullshit security guard job, such as guarding the Pillow Mart in Topeka, Ohio, your job is to be the rock with a uniform and badge that can, when prompted, mumble about there being no robbers. That’s sufficient to satisfy the insurance company and make potential robbers think they’d have to expend some amount of additional effort. And sure, if someone asks for directions and/or wants to come in, you can check their ID and/or give them directions.

If you have a non-bullshit security guard job guarding something people might actually rob or mess with, then your job is to not only look like a security guard who might be paying attention, although that’s still important too, but to also actually pay attention in case there is a robbery.

Even if you are the second guard, you can still have a rock with the words “THERE ARE NO ROBBERS” and it will still be 99.9% right when you look at it. Even if you only look at it when you hear something and wonder a bit and think maybe you’ll check it out, the rock could easily still be 99.9% effective. For the right security guard that knows how to use that information but hasn’t fully internalized it yet, that’s a super valuable rock. If the security guard had been given no information and had a prior that it was 50% to be a robbery every time they heard the wind, probably good to have a rock around. Until there’s some super strong evidence of a robbery they can then check it out but know it’s almost certainly nothing.

Yes, technically the rock should say “THERE ARE ALMOST CERTAINLY NO ROBBERS” or “IT IS 99.9% TO NOT BE ROBBERS” and that would be even more useful to the right guard, but on the margin the basic rock is pretty good because when someone yells “STOP! THIEF!” or it is otherwise super obvious then the guard is still going to ignore the rock. The key is that you are not actually a rock. When you hear a bunch of stuff being smashed in a way that does not sound like the wind you know to adjust, that you don’t actually treat it as 100%.

The Doctor
She is a primary care doctor. Every day, patients come to her and say “My back hurts” or “My stomach feels weird”. She inspects, palpates, percusses and auscultates various body parts, does some tests, and says “It’s nothing, take two aspirin and call me in a week if it doesn’t improve”. It always improves; no one ever calls her.
Eventually, she gets sloppy. She inspects but does not palpate. She does not do the tests. She just says “It’s nothing, it’ll get better on its own”. And she is always right.
She will do this for her entire career. If she is very lucky, nothing bad will happen. More likely, two or three of her patients will have cancer or something else terrible, and she will miss it. But those people will die, and everyone else will remember that she was such a nice doctor, such a caring doctor. Always so reassuring, never poked and prodded them with needles like everyone else.
Her heuristic is right 99.9% of the time, but she provides literally no value. There is no point to her existence. She could be profitably replaced with a rock saying “IT’S NOTHING, TAKE TWO ASPIRIN AND WAIT FOR IT TO GO AWAY”.

This makes it more obvious that no, she doesn’t provide zero value and she couldn’t be replaced with a rock.

Let’s take this all at literal face value. She is right, not 99.9% of the time, but over 99.99% of the time.

A bunch of people enter her office nervous and scared. They leave knowing they have a nice, caring doctor, and with a story they can tell themselves and others about how much they care and that they did the responsible thing. None of that works if they look at a rock and take two aspirin. Did you think they were coming to her office to improve their health?

This doctor sounds way way better than the average doctor. Every day, patients come to her office, so let’s say eight hours a day, two patients per hour, five days a week, fifty weeks a year for forty years. One hundred sixty thousand patients who don’t have to be poked or prodded. Let’s say three of them die of cancer that she missed, one of which couldn’t have been prevented. The number needed to treat is eighty thousand. Meanwhile, how many false positives did she not send for tests and even unnecessary treatments? How many tumors that were mostly harmless got ignored?
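The arithmetic here is easy to check. A minimal sketch in Python, using only the figures stated in the paragraph above (the variable names are mine, purely for illustration):

```python
# Figures taken from the paragraph above; names are illustrative.
hours_per_day = 8
patients_per_hour = 2
days_per_week = 5
weeks_per_year = 50
years = 40

total_patients = (
    hours_per_day * patients_per_hour * days_per_week * weeks_per_year * years
)

missed_cancers = 3
unpreventable = 1
preventable_deaths = missed_cancers - unpreventable

# Patients she would have had to poke and prod per preventable death caught.
number_needed_to_treat = total_patients // preventable_deaths

print(total_patients)          # 160000
print(number_needed_to_treat)  # 80000
```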

If I told you there was something that had a one in eighty thousand chance of being a dangerous cancer that required treatment, and no one was looking, would you run off to your doctor and make sure they checked? If the doctor ordered it checked out further, would you suspect the reason was to avoid potential liability?

She is an excellent rock. Thanks to her extensive medical training, she knows how to reassure her patients and reduce their stress, which is likely more important for their health than catching three extra cancers over forty years.

And of course, she’s not an actual rock. If there was a giant thing on someone’s nose, she would think ‘oh I had better ensure someone actually examines that.’

The problem is if it becomes common knowledge that all she is doing is telling everyone to take two aspirin no matter what. She needs to seem to be a doctor practicing medicine. Otherwise the trick will stop working, the same way that the security guard rock needs to be man-shaped and put in a security guard suit that people don’t suspect is a man-shaped rock in a security guard suit.

The Futurist
He comments on the latest breathless press releases from tech companies. This will change everything! say the press releases. “No it won’t”, he comments. This is the greatest invention ever to exist! say the press releases. “It’s a scam,” he says.
Whatever upheaval is predicted, he denies it. Soon we’ll all have flying cars! “Our cars will remain earthbound as always”. Soon we’ll all use cryptocurrency! “We’ll continue using dollars and Visa cards, just like before.” We’re collapsing into dictatorship! “No, we’ll be the same boring oligarchic pseudo-democracy we are now” A new utopian age of citizen governance will flourish. “You’re drunk, go back to bed.”
When all the Brier scores are calculated and all the Bayes points added up, he is the best futurist of all. Everyone else occasionally gets bamboozled by some scam or hype train, but he never does. His heuristic is truly superb.
But - say it with me - he could be profitably replaced with a rock. “NOTHING EVER CHANGES OR IS INTERESTING”, says the rock, in letters chiseled into its surface. Why hire a squishy drooling human being, when this beautiful glittering rock is right there?

So I notice this does not work. He is not the best futurist. This time, we’re keeping score.

Occasionally one of those crazy weird things does happen. If all that The Rock is cooking is setting the probability of every possible change to epsilon, then when the first of those events happens his Brier score is suddenly going to explode and he is going to lose all his Bayes points.
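A rough sketch of what one miss costs, assuming “Bayes points” means a logarithmic score (that is my reading, not something spelled out here). The function names and the epsilon values are hypothetical. Note that the Brier penalty on a single question is capped at 1, so the unbounded blow-up as epsilon shrinks shows up on the log-score side:

```python
import math

def brier_penalty(p_event: float) -> float:
    """Squared error on a question where the event actually happened."""
    return (1.0 - p_event) ** 2

def log_penalty(p_event: float) -> float:
    """Negative log of the probability assigned to an event that happened."""
    return -math.log(p_event)

# Hypothetical epsilons The Rock might assign to "something changes".
for eps in (1e-2, 1e-4, 1e-8):
    print(eps, brier_penalty(eps), log_penalty(eps))
```

The smaller the epsilon, the closer the Brier penalty gets to its maximum of 1, while the log penalty keeps growing without limit.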

It’s not even obvious he is the most likely person to be on the right side of 50% on a given question, because it’s often not rock-level to figure out what ‘NOTHING EVER CHANGES’ cashes out to in practice. Plenty of countries have collapsed into dictatorships and it has happened several times quite recently, so Rock hasn’t lost its bets on the USA yet on that one, but what were its odds in Hungary a few years back? Where would it set odds on ‘no country with at least five million people will fall into dictatorship in the next 10 years?’ What were its odds on whether a sitting president would refuse to accept the results of an election, or that a mob might try to attack the congress? Those seem like interesting things.

Not as interesting as a full American fall into dictatorship, but that seems like cherry picking. And if you asked him every year to give probabilities for cryptocurrency to get to where it is today, I hear it’s pretty hot in Bayes hell. I’m sure his prediction for big volume in NFTs looks rather ugly. And that’s what already happened, quite recently, using Scott’s examples.

(As for flying cars, yeah it’s been a good business, but it lost a ton of points back at Kitty Hawk and also when they started making cars, so how many distinct questions are running around at once and how long can this trick hope to last?)

The security guard has an easy to interpret rock because all it has to do is say “NO ROBBERY.” The doctor’s rock is easy too, “YOU’RE FINE, GO HOME.” This one is different, and doesn’t win the competitions even if we agree it’s cheating on tail risks. It’s not a coherent world model.

Still, on the desk of the best superforecaster is a rock that says “NOTHING EVER CHANGES OR IS INTERESTING” as a reminder not to get overexcited, and to not assign super high probabilities to weird things that seem right to them.

Good rock. Not as good as the next one.

The Skeptic
She debunks everything. Telepathy? She has a debunking for it. Bigfoot? A debunking. Anti-vaxxers? Five debunkings, plus an extra, just for you.
When she started out, she researched each phenomenon carefully, found it smoke and mirrors, and then viciously insulted the rubes who believed it and the con men who spread it. After doing this a hundred times, she skipped steps one and two. Now her algorithm is “if anyone says something that sounds weird, or that contradicts popular wisdom, insult them viciously.”
She’s always right! When the hydroxychloroquine people came along, she was the first person to condemn them, while everyone else was busy researching stuff. When the ivermectin people came along, she was the first person to condemn them too! A flawless record
(shame about the time she condemned fluvoxamine equally viciously, though)
Fast, fun to read, and a 99.9% success rate. Pretty good, especially compared to everyone who “does their own research” and sometimes gets it wrong. Still, she takes up lots of oxygen and water and food. You know what doesn’t need oxygen or water or food? A rock with the phrase “YOUR RIDICULOUS-SOUNDING CONTRARIAN IDEA IS WRONG” written on it.
This is a great rock. You should cherish this rock. If you are often tempted to believe ridiculous-sounding contrarian ideas, the rock is your god. But it is a Protestant god. It does not need priests. If someone sets themselves up as a priest of the rock, you should politely tell them that they are not adding any value, and you prefer your rocks un-intermediated. If they make a bid to be some sort of thought leader, tell them you want your thought led by the rock directly.

Notice she got one ‘wrong’ on fluvoxamine. So if that counts as wrong, to have a 99.9% success rate she needs to have written a thousand columns. I don’t come across a sufficiently known ridiculous-sounding contrarian idea all that often, so I’m guessing this is at best a twice-a-week column, so that’s ten years of mistake time. Also there’s that time in January when someone said this virus from Wuhan is going to send us all into lockdowns in a few months and she wrote a column calling them racist.
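A quick check of that estimate, as a sketch. The twice-a-week pace is the guess from the paragraph above, and 52 publishing weeks a year is my own assumption:

```python
mistakes = 1
target_accuracy = 0.999
columns_per_week = 2   # the guess from the paragraph above
weeks_per_year = 52    # assumption: she never takes a week off

# Columns needed for one mistake to be consistent with 99.9% accuracy.
columns_needed = round(mistakes / (1 - target_accuracy))

# Years of columns that have to go by per allowed mistake.
years_per_mistake = columns_needed / (columns_per_week * weeks_per_year)

print(columns_needed)               # 1000
print(round(years_per_mistake, 1))  # 9.6
```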

I really do not think she is going to have a 99.9% accuracy rate if she is going after this kind of reference class of idea. Will she be right 90% of the time? If she either draws a wide enough net and/or has a high enough bar for what counts as ridiculous, she will. But it’s not obvious to me she bats even that high. I very much doubt this is a 98% or 99% heuristic if she’s going after things like Covid treatments.

There is of course a version of this that does bat 99.9%. Or as Penn Jillette puts it on his and Teller’s quite fun version of this service, Penn and Teller: Bullshit, “I’ll debunk Ouija boards, and Teller here will shoot fish in a barrel.” So yes, you can go around debunking telepathy claims and bigfoot sightings all day, because they’re physically impossible.

But there’s nothing physically impossible about Ivermectin or Hydroxychloroquine working, and it was way too early to know which way it was going to go with this kind of confidence. Even today, you can be confident, but are you 99.9% confident Ivermectin is a bad idea? I’m not, I predict I never will be and if you think you are that confident I believe you’re making a mistake. As the first person to condemn such proposals, how often do you think it blows up in her face?

A rock that says “YOUR PERPETUAL MOTION MACHINE DOES NOT WORK” is fully 100% accurate. A rock that says “YOUR PHYSICALLY IMPOSSIBLE CONTRARIAN PROPOSAL DOES NOT WORK” is almost as strong. But a lot of true things are both contrarian and sound ridiculous. Not a lot compared to the number of such things that are false, but the ratio depends on the category boundaries, including one’s ability to determine what does and does not sound ridiculous. It does not sound like our skeptic is doing a good job drawing a boundary around this category even as a human.

Thus, when someone claims to be Priest of the Skeptic Rock, perhaps pay her a little bit more attention and potentially respect. The Rock’s ways are not super mysterious, but neither are they trivial to interpret even if one does not have to provide theology.

Yet the theology is where much of the value lies. If you read the Weekly Skeptic, yes it’s all going to be ‘this is not real’ and ‘these people are charlatans’ but James Randi didn’t become one of the high priests of the rock by quoting the rock. James Randi did it by being very very good at figuring out exactly how and why things that weren’t real weren’t real.

The Rock on its own is a terrible skeptic. Worse, it is boring. It writes a one-sentence column with space to fill in with today’s target, and it’s fast and usually right but it isn’t fun.

That does not mean this need be a Catholic rock. You too can decide these questions for yourself, but a good skeptic ‘does their own research’ to decide what is and is not ridiculous-sounding in the right ways, and how skeptical to be in a given spot. If you don’t want to do your own work, then yes you not only need a priest, you need a better priest than the writer of this Weekly Skeptic column. Because frankly she’s terrible at her job.

Except no, she isn’t. Her job is not to be right. Her job is to get clicks.

This is where we notice that her actual rock does not say “YOUR RIDICULOUS-SOUNDING CONTRARIAN IDEA IS WRONG.” It actually says “VICIOUSLY INSULTING WEIRD-SOUNDING IDEAS THAT CONTRADICT THE NARRATIVE IS GOOD FOR BUSINESS.” Seems worth noticing the difference. When you defend the narrative and are wrong, there is implicit coordination to memory hole the whole thing, so our ‘skeptic’ is safe and everyone who doesn’t forget outright says ‘well of course she insulted fluvoxamine, look at how ridiculous-sounding it was at the time.’ So in the end, how often is she wrong?

The Interviewer
He assesses candidates for a big company. He chooses whoever went to the best college and has the longest experience.
Other interviewers will sometimes choose a diamond in the rough, or take a chance on someone with a less-polished resume who seems like a good culture fit. Not him. Anyone who went to an Ivy is better than anyone who went to State U is better than anyone who went to community college. Anyone with ten years’ experience is better than anyone with five is better than anyone with one. You can tell him about all your cool extracurricular projects and out-of-the-box accomplishments, and he will remain unswayed.
It cannot be denied that the employees he hires are very good. But when he dies, the coroner discovers that his head has a rock saying “HIRE PEOPLE FROM GOOD COLLEGES WITH LOTS OF EXPERIENCE” where his brain should be.

By assumption, The Interviewer hires very well. He is the best interviewer. He asks only two questions, so he can interview lots of people quickly in a low-stress way, and his hires work out.

Huge if true!

There are versions of the world where this would be true. Everyone else is focusing on good culture fits and growth mindsets but they’re bad at identifying such people. The colleges do a better job with their admissions process, and then they give skills and connections, and then experience gives more skills and connections. It’s not that the other information necessarily provides zero value in this scenario, but everyone who tries to use it ends up making worse decisions, the same way that we have studies where physicians do worse than AI systems at some forms of diagnosis even when told the AI’s opinion, because they overvalue their own ability to judge, except now they’re also competing against each other to hire the same fakers.

If that is true, then I totally want to use The Rock to make hiring decisions to the fullest extent possible under employment law. Sounds like this will work out great.

In our world, of course, the pure version won’t work out great because of the problem of adverse selection. The good Harvard graduates will get jobs elsewhere. The ones that this guy hires will be the ones that have been drifting from job to job on the strength of their college degree, not caring or learning or providing much value, because they’re the ones that need this job and he’s willing to hire them. People notice that his hires often don’t do much work and they seem more than a little creepy and it’s weird how things keep disappearing all the time.

But there’s a version of this that isn’t as stupid, and looks out for actual disasters, or takes the ones with experience from good schools and hires the top half of them on other metrics, or whatever. And maybe that system does work so long as not too many people know you are using it. If such folks realized they had a job here if they wanted it, then the adverse selection gets extreme. If there were a lot of such rocks going around there are those that would focus on getting the credential and nothing else, then spend their lives going from rock to rock, getting paid and accumulating status and never working a day in their lives.

(And indeed, there exist such people.)

There’s room for some such people using this rule. The more people use that rule, and the more obvious they are about using the rule, the worse the rule will work. If you’re the first person to realize that some colleges are better than others and people from them do better jobs, then that’s a huge leg up. If everyone knows and is rating it appropriately, you’re going to have a bad time.

Thus this is self-balancing. The right Rock will work in a given time and place, but it actually does work and is a good algorithm at that time and place, so it seems fine.

The other danger is if such folks learn (either in their colleges or elsewhere) that their job is to implicitly coordinate with others from prestigious colleges to hire each other, take credit for everything and otherwise play corporate politics against everyone else, or against everyone else who doesn’t buy into their game, to the extent that others one could cooperate with are buying into this game. Thus, many of Rock’s hires look like great hires because they focus on how things look, and the more of them Rock hires, the more other similar people are hired and the better they all look because they control appearances more. The extent to which something like this is happening with a large portion of such hires is an important question, and points back towards the Moral Mazes sequence.

The Queen
She rules over a volcanic island. Everyone worries when the volcano would erupt. The wisest men of the kingdom research the problem and decide that the volcano has a straight 1/1000 chance of erupting any given year, uncorrelated with whether it erupted the year before. There are some telltale signs legible to the wise - a slight change in the color of the lava, an imperceptible shift in the smell of the sulfur - but nothing obvious until it’s too late.
The queen founded a Learned Society Of Vulcanologists and charged them with predicting when the volcano will erupt. Unbeknownst to her, there were two kinds of vulcanologists. Honest vulcanologists, who genuinely tried to read the signs as best they can. And The Cult Of The Rock, an evil sect who gained diabolical knowledge by communing in secret with a rock containing the words “THE VOLCANO IS NOT ERUPTING”.
Every so often an honest vulcanologist felt like the lava was starting to look a little weird and told the Queen. The Queen panicked and asked everyone for advice. The Honest vulcanologists said “look, it’s a hard question, the lava seems kind of weird today but it’s always weird in some way or other, this volcano rarely erupts but for all we know this time might be the exception”. The rock cultists secretly checked their rock and said “No, don’t worry, the volcano is not erupting”. Then the volcano didn’t erupt. The Queen punished the trigger-happy vulcanologist who sounded the false alarm, grumbled at the useless vulcanologists who weren’t sure either way, and promoted the confident cultists who correctly predicted everything was okay.
Time passed. With each passing year, the cultists and the institutions and methods of thought that produced them gained more and more status relative to the honest vulcanologists and their institutions and methods. The Queen died, her successor succeeded, and the island kept going along the same lines for let’s say five hundred years.
After five hundred years, the lava looked a bit weird, and the new Queen consulted her advisors. By this time they were 100% cultists, so they all consulted the rock and said “No, the volcano is not erupting”. The sulfur started to smell different, and the Queen asked “Are you sure?” and they double-checked the rock and said “Yeah, we’re sure”. The earth started to shake, and the Queen asked them one last time, so they got tiny magnifying glasses and looked at the rock as closely as they could, but it still said “THE VOLCANO IS NOT ERUPTING”. Then the volcano erupted and everyone died. The end.

So this is straight out of Meditations on Moloch and the Moral Mazes sequence.

Most centrally, this sounds like the Queen’s problem. The Queen said she wanted to be alerted when the volcano might be about to erupt and have people who could evaluate details, but what she actually rewarded was telling her the volcano was never going to erupt and did not look at details at all.

Or: Help me tune my machine learning algorithm, I punished it for approving bad drugs and then it stopped ever approving any drugs, easily curable diseases are running rampant and my family is dying.

I do not have a lot of sympathy, it is not obvious the Queen considered the potential eruption the problem rather than considering everyone being nervous to be the problem, and also The Rock has three more words on it.

It actually says “TELL THE QUEEN THE VOLCANO IS NOT ERUPTING.”

It says that because the volcanologists were wise and noticed that the Queen may have created the institute with the task of detecting eruptions but mostly wanted to be able to tell herself and her subjects that she had created the institute, the same way that the Pillow Mart in Topeka, Ohio wanted to tell the insurance company they had hired a security guard.

At first, even after the pattern became clear and The Rock was commissioned, the wise volcanologists were split, and some of them decided it was their sacred duty to tell the Queen other things anyway sometimes. There was also a faction that said “all right, sure, we stop telling The Queen whenever there’s a 1% chance, but we should still keep studying the art of volcanology and when it gets to 10% or at least when it gets to 25% or 50% we still tell The Queen about the situation, right?” But after a while, those volcanologists were too busy studying the volcano and running experiments while the others were engaging in political battles and throwing parties, and also every time they said anything to the Queen she punished them so no one wanted to help them if they were doomed to eventually get yelled at, and they lost out on the good jobs and control of the hiring, and more and more of the budget got devoted to the parties, and that was that.

A highly unoriginal part of the Moral Mazes thesis is that this happens in the long run to every such organization barring an extraordinary effort. The people whose primary goal is to advance within the organization are the ones who end up in control of the organization. They do not care about the original mission, so the original mission is increasingly neglected until this threatens the organization’s ability to exist.

In this case, it only threatens the organization’s ability to exist when the volcano erupts and kills everyone. Until then, it’s an active advantage. There was never a chance this would last 500 years.

If you want this to last 500 years and have any chance of detecting the next eruption that far out, you’ll need to do a great deal better. The Queen needs to not instinctively punish anyone who warns about a possible eruption, and instead let others evaluate whether or not it was a justified warning, and reward it if it was, while punishing failure to notice.

The Queen needs to keep track of how often alarms should go off, and get very suspicious if they go too many years (or generations) without a warning, and at some point assume everyone has stopped caring, fire or hang everyone involved and re-found the institute with new people.
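One way the Queen could make “too many years without a warning” concrete, as a sketch only. The yearly alarm rate and the suspicion threshold below are hypothetical numbers I made up, and the function name is mine. The idea: if honest observers would raise an alarm with some probability in any given year, a long enough silence becomes too unlikely to be honest.

```python
import math

def years_until_suspicious(alarm_rate_per_year: float, threshold: float) -> int:
    """Smallest number of silent years whose probability, under honest
    reporting at the given yearly alarm rate, falls below the threshold."""
    # P(no alarm for n years) = (1 - rate) ** n; solve for n.
    return math.ceil(math.log(threshold) / math.log(1 - alarm_rate_per_year))

# Hypothetical: honest vulcanologists flag something in 5% of years, and the
# Queen gets suspicious once the silence had under a 1% chance of occurring.
print(years_until_suspicious(0.05, 0.01))  # 90
```

Under those made-up numbers, ninety straight years of “all clear” should already prompt a hard look at the institute, long before five hundred.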

And/or have three institute departments that don’t talk to each other but each check the volcano, and when two of them warn her but not the third, punish the third. And also probably have them do a bunch of other prediction and science tasks that keep everyone involved trained. And every so often maybe dump some strange-smelling stuff next to the volcano and have someone claim to have noticed something weird and then see what reports come back on the situation. Or something, anything, preferably a lot of different things in unpredictable fashion.

Otherwise, I can only conclude The Volcanology Institute Is Not About Volcanos.

The Weatherman
He lives in a port town and predicts hurricanes. Hurricanes are very rare, but whenever they happen all the ships sink, so weathermen get paid very well.
If you’ve read your Lovecraft, you know that various sinister death cults survived the fall of Atlantis, and none are more sinister than the Cult Of The Rock. This weatherman was an adept among them and secretly communed with a rock that said “THERE WON’T BE A HURRICANE”.
For many years, there was no hurricane, and he gained great renown. Other, lesser weathermen would sometimes worry about hurricanes, but he never did. The businessmen loved him because he never told them to cancel their sea voyages. The journalists loved him because he always gave a clear and confident answer to their inquiries. The politicians loved him because he brought their town fame and prosperity.
Then one month, a hurricane came. It was totally unexpected and lots of people died. The weatherman hastily said “Well, yes, sometimes there are outliers that even I can’t predict, I don’t think this detracts from my vast expertise and many years of success, and have you noticed some of the people criticizing me have business connections with foreign towns that probably plot our ruin?” An investigation was launched, but the businessmen and journalists and politicians all took his side, and he was exonerated and restored to his former place of honor.

Let’s say The Rock is right 99.9% of the time, since that seems to be the Rule of Rocks these days, and let’s say he checks it once a week. Thus, there is a hurricane roughly once every twenty years.
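The “roughly once every twenty years” figure follows directly from those two assumptions. A minimal sketch (variable names are mine):

```python
# Assumptions from the paragraph above: The Rock is wrong 0.1% of the time,
# and the weatherman consults it once a week.
p_hurricane_per_week = 0.001
weeks_per_year = 52

# Expected wait between hurricanes, in years.
years_between_hurricanes = (1 / p_hurricane_per_week) / weeks_per_year

# Chance of getting through a full 20 years with no hurricane at all.
p_quiet_20_years = (1 - p_hurricane_per_week) ** (20 * weeks_per_year)

print(round(years_between_hurricanes, 1))  # 19.2
print(round(p_quiet_20_years, 2))          # 0.35
```

So under these numbers a weatherman has about a one in three chance of a clean twenty-year record from the rock alone.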

It sounds like worrying about hurricanes was rather expensive. Sea voyages often get cancelled. Without such worries, the town gained fame and prosperity.

This is a rather large effect. When the hurricane finally did come, yes the ships sank and a lot of people died, but all the businessmen and journalists and politicians were cool with it. So presumably that meant that even after the hurricane the town was doing pretty well. Otherwise the politicians and businessmen are very much not going to be down with this.

There’s also another possible explanation, which is that weathermen mostly suck at predicting hurricanes.

Have you ever seen real weathermen predict hurricanes? No. That’s not a thing. Sure, they say ‘it’s hurricane season so there are going to be some hurricanes’ and they’re usually right but that very much does not count. Once a tropical storm exists and they can see it they predict how big it will get and where it is going, but that’s a high-percentage play. There isn’t a lot of ‘three weeks from now we think there’s a 50% chance a hurricane is going to hit Miami.’

Whereas it sounds like other weather forecasters were essentially making those kinds of predictions often enough to seriously hurt business, and it sounds a lot like they were mostly wrong. We have no evidence they were better than random, or that the precautions taken were net useful.

Instead of a weatherman who could plausibly have useful information, imagine an ancient tribe trying to predict a hurricane. The shaman throws bones every month, and if they land in the wrong configuration she warns of a terrible hurricane and then everyone does the ‘please don’t kill us’ prayers and ties down their stuff and then usually nothing happens because all she’s doing is throwing bones and what someone needs to do is go get a rock and carve on it “THERE WON’T BE A HURRICANE” because even if there is a hurricane it’s not like the bones were going to predict it.

Thus I applaud this Weatherman. Like in The Phantom Tollbooth, he’s not actually a Weatherman, he’s more of a Whetherman. He is asked whether people should worry, and he does the right calculation and says no.

There are plenty of things like this, where the value of information of warnings is negative. Knowing you have cancer is highly useful if it can be usefully treated. If it can be wastefully treated in ways that will make you miserable and cost lots of money without doing much for your lifespan, and your doctors and family are going to push you to do that and you’ll feel guilty if you don’t, then you really don’t want to know. If a Covid test showing your four year old has an asymptomatic case would force lots of people to quarantine in stupid fashion because the rules are so over-the-top as to be counterproductive, maybe don’t test when you don’t have symptoms, and maybe don’t worry if the test is being done in a way that generates a bunch of false negatives. Your mother checks the weather channel, sees a 20% chance of light rain and tells you that you have to take an umbrella. One shoe bomber and we all take off our shoes at airports for a decade. And so on.

Thus, the Whetherman in question has been revealed to be doing a good job rather than a bad job. The alerts simply aren’t very specific and there are a lot of false positives, so when others warn of a hurricane there’s still only a 2% chance that it will happen at all, and it’s not worth changing your behavior for that. Voyages should continue, life should go on. But for various social reasons, we can’t do that explicitly. We can’t simply say we’re going to ignore the signs. So instead we hire someone and pay them a lot of money to give warnings, while hoping they notice that their job is bullshit and their real job is to consult the rock.

There is of course a version of this where it’s a 20% chance rather than 2%, and he got lucky for a while, and actually there were quite a lot of deaths and sunken ships and everyone is a lot worse off and he should have been fired or worse. But I am guessing we are instead in a world caught in a safety trap and he did the town a favor. In those other worlds, I’m guessing that if he could actually be expected to predict hurricanes he would get to walk off into the sunset with his 20 years of fame and generous pay, but he does not get his old job back.
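The difference between the 2% world and the 20% world can be made concrete with a toy expected-cost comparison. This is only a sketch under loud assumptions: the loss and precaution costs below are invented numbers in arbitrary units, and the model pretends that taking precautions averts the entire loss, which real precautions would not.

```python
def expected_cost(p_hurricane: float, loss_if_unprepared: float,
                  cost_of_precaution: float, take_precaution: bool) -> float:
    """Expected cost of responding (or not) to a single warning.

    Simplifying assumption: a precaution costs a fixed amount and fully
    averts the loss; ignoring the warning costs nothing unless the
    hurricane actually arrives.
    """
    if take_precaution:
        return cost_of_precaution
    return p_hurricane * loss_if_unprepared


def should_heed_warning(p_hurricane: float, loss_if_unprepared: float,
                        cost_of_precaution: float) -> bool:
    """True when acting on the warning is cheaper in expectation."""
    act = expected_cost(p_hurricane, loss_if_unprepared, cost_of_precaution, True)
    ignore = expected_cost(p_hurricane, loss_if_unprepared, cost_of_precaution, False)
    return act < ignore


# Hypothetical numbers: a hurricane costs 100 units, cancelling voyages costs 10.
LOSS, COST = 100.0, 10.0
for p in (0.02, 0.20):
    print(f"p={p:.2f}: heed the warning? {should_heed_warning(p, LOSS, COST)}")
```

With these made-up costs the break-even probability is 10%, so a 2% warning is not worth acting on and a 20% warning is. Change the ratio of loss to precaution cost and the threshold moves accordingly, which is why the severity of the hurricane matters so much.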

This raises the question of the bankers who in 2008 were caught holding a rock that said “HOUSING PRICES ALWAYS GO UP.” They had to get bailed out, which indicates the rock wasn’t being socially responsible, but look more closely at the rock and it says “IF HOUSING PRICES GO DOWN YOU WOULD GET BAILED OUT.” Which is a different, perhaps smarter, rock.

This also all depends on how bad hurricanes are. A bunch of ships sinking and people dying is not great, but if the metaphorical hurricane is an existential risk like an unsafe artificial general intelligence or engineered plague that kills everyone, the calculation looks very different. If the ‘hurricane’ is mostly harmless but everyone insists on freaking out about hurricanes, or is dangerous and causes freak outs but those freak outs do nothing useful, the Whetherman in question is a superstar, and should be paid very well indeed.

Rock is Strong

Scott worries that experts who are charged with spotting rare events will, instead of doing the real work, end up relying on the 99.9% accurate heuristic, and that this will be bad. Here’s his explanation.

Maybe this is because the experts are stupid and lazy. Or maybe it’s social pressure: failure because you didn’t follow a well-known heuristic that even a rock can get right is more humiliating than failure because you didn’t predict a subtle phenomenon that nobody else predicted either. Or maybe it’s because false positives are more common (albeit less important) than false negatives, and so over any “reasonable” timescale the people who never give false positives look more accurate and get selected for.

You say ‘stupid and lazy’ and I say ‘respond to incentives’ and ‘cannot actually do much better than the heuristic if they aren’t allowed to give probabilistic answers.’ Or ‘the heuristic isn’t that easy to implement, you try it.’ You say ‘didn’t follow a well-known heuristic even a rock could follow’ and I say ‘didn’t do the job they were actually hired to do.’ You say ‘look more accurate and get selected for’ and I say ‘evaluate on results rather than process.’

(Also, Scott says ‘99.9% accurate’ and I say ‘I very much doubt that.’)

More than all that, Scott worries that And That’s Terrible.

This is bad for several reasons.
First, because it means everyone is wasting their time and money having experts at all.
But second, because it builds false confidence. Maybe the heuristic produces a prior of 99.9% that the thing won’t happen in general. But then you consult a bunch of experts, who all claim they have additional evidence that the thing won’t happen, and you raise your probability to 99.999%. But actually the experts were just using the same heuristic you were, and you should have stayed at 99.9%. False consensus via information cascade!
This new invention won’t change everything. This emerging disease won’t become a global pandemic. This conspiracy theory is dumb. This outsider hasn’t disproven the experts. This new drug won’t work. This dark horse candidate won’t win the election. This potential threat won’t destroy the world.
All these things are almost always true. But Heuristics That Almost Always Work tempt us to be more certain than we should of each.

You say ‘wasting their time and money on experts,’ I say ‘giving people peace of mind and some combination of social and legal cover to do what they want to do anyway without blame’ and ‘the heuristic has non-obvious detail the experts are evaluating’ and ‘think of all the money we save by having the experts work so quickly.’

You say ‘building false confidence’ and I say that’s mostly on us not the ‘experts.’

Once I notice that there is a heuristic this accurate, I should also notice that the experts correlate highly with such accurate heuristics, and that they are probably not that much more accurate than the heuristics, so I mostly should not update much. At this kind of extreme it’s not even an issue. If an ‘expert’ tells me X is false but the basic heuristics say X is 99.9% false, what’s my new probability? If the expert told me this without my asking directly, probably lower than 99.9%, because the fact that they felt the need to tell me is more important than their opinion here.

A more scary situation is if the heuristic clocks in at 90% and the experts copy it, and now I’m going to plausibly end up substantially higher than 90%, but I definitely noticed both so I should notice that there’s some causation there in terms of the heuristic being known to the experts. But yes, I should totally update a substantial amount, because the experts are at least confirming that I have correctly assessed the reference classes involved. Couldn’t be sure about that.
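The updating arithmetic in the last few paragraphs can be sketched in odds form. The likelihood ratios here are illustrative assumptions, not measurements: a ratio of 1 stands for experts who are only repeating the heuristic, and larger ratios stand for experts who bring some independent evidence.

```python
def update(prior: float, likelihood_ratio: float) -> float:
    """Bayesian update in odds form.

    prior: probability that the thing will NOT happen, before consulting experts.
    likelihood_ratio: how much more likely the experts' 'it won't happen' is if
        it truly won't than if it will. A ratio of 1.0 means their statement
        carries no information beyond what the prior already contains.
    """
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Experts who merely copy the heuristic: no update, you stay at 99.9%.
print(f"copying experts:      {update(0.999, 1.0):.5f}")

# Treating the same experts as strong independent evidence (hypothetical
# ratio of 100) is what moves 99.9% to roughly 99.999%.
print(f"independent experts:  {update(0.999, 100.0):.5f}")

# The 90% case with a modest, hypothetical ratio of 3 for experts who at
# least confirm the reference class was assessed correctly.
print(f"90% heuristic, LR=3:  {update(0.90, 3.0):.3f}")
```

The false confidence comes entirely from plugging in the wrong likelihood ratio. If you notice the experts are consulting the same rock you are, the ratio should sit close to 1 and the posterior barely moves.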

What the heuristic is actually for here is to push the experts towards it, because those who don’t consider the heuristics end up not aligning with them enough, and they need to update and keep in mind that one should not go against it lightly, instead considering other possibilities first. And also because forgetting or neglecting the heuristic will waste a lot of valuable time. Dr. House often reminds us that it’s never lupus, but one time it is lupus, and he does eventually figure this out in the end.

There’s also the strong possibility, which Scott does not consider but that seems prominent in most of Scott’s examples, that the expert is actually being paid mostly to use the heuristic so it can be seen as coming from an expert and people can be assured there is a ‘human in the loop’ who is checking for the obvious exceptions. Such jobs are mostly or entirely bullshit jobs.

Often you don’t care about knowing, but you don’t want others to know you don’t care about knowing. If they knew you didn’t care (and didn’t care about others learning you didn’t care and so on) then that could mean you don’t care, didn’t take precautions or are otherwise blameworthy if things go wrong. Other times, your ignorance can be exploited, and they can rob the place.

Mostly this is all a case of People Respond to Incentives. I notice that many of the plays here don’t require that the heuristic be accurate, certainly not at anything like the 99.9% level. They only require that payoffs not properly correspond to outcomes. I don’t need an accurate heuristic to know that if warning about the volcano (or cancer, or housing prices, or a robbery, or anything else) only gets me in trouble, it doesn’t matter what I suspect with what probability, there’s a right answer to ‘what should I say in this spot?’ and that is what you are probably going to hear. And over time, people figure out that sort of thing quite well.

I also notice Scott’s examples are full of vast overconfidence. Who are these people who have 99.9% accurate heuristics while constantly making important mistakes? That’s quite a neat trick.

That leads into the comparison to black swans and tail risks. Sometimes the issue here is the assumption of tail risk and the possibility of a black swan. When the volcano erupts, the hurricane comes or the market crashes, or the invention changes the world, the impact of that is huge, so always saying ‘no’ is almost always right and also a bad trade that gets super expensive, or is effectively counting on a bailout where the person in question will not end up having to pay the bill that comes due.

I agree with Scott that this is a subset of what is happening here. Even in the cases where there is a tail risk that people are laying off on others, that problem alone is only some of what has gone wrong, and would not alone cause the same level of pickle. In other cases it is entirely distinct from the issue. Giving everyone a mysterious impossibly high 99.9% accuracy made this seem more central than it was.

When the problem is the failure of such heuristics to price in the costs of tail risks, that can be a rather large mistake, but also sometimes the tail risk is being overpriced rather than underpriced. In those cases where there are real and contextually large tail risks, such as artificial general intelligence or more typically with a trading algorithm, there is a big problem. Other times, the big problem is the perception of a potential big problem, and the actual problem is small.

I also wanted to ensure I pointed to what I see as several conceptual errors on Scott’s part, that I worry are indicative of fundamental model errors - the idea that simple-sounding heuristics are actually simple to execute in practice, the assumption that jobs are not bullshit and that people want those doing those jobs to be accurate rather than do something else, especially the symbolic and signaling values involved, and the failure to think about the value of information and the cost of getting that information. There is a kind of ‘naïve mistake theory’ here that is sufficiently naïve that it does not seem like a mistake.

I have split off the postscript to this, entitled Paper is True, into its own post.

Cole Terlesky

Feb 14, 2022 (edited)

I was thinking about this when Scott's article came out. I used to be a lifeguard. Most of my job consisted of watching a pool of people where no one drowned.

In some ways it gets far worse odds than the volcanologist problem. How often you need to check makes a difference. As a lifeguard those checking intervals are every 15-30 seconds. How often does the volcano need to be checked? Once a day? Once a week?

To some extent I was definitely serving the same purpose as a security guard, to make people feel safe, to lower insurance costs, etc. But people did occasionally start drowning, and I did have to jump in and save them.

I did some back of the envelope math based on how often I saved people and how often I had to check the pool. The odds were much lower than a 1 in 1,000 chance that someone was drowning. It was more like a 1 in 100,000 chance that someone was drowning while I was checking the pool. I was a teenager while I was a lifeguard, and many other lifeguards I worked with were also teenagers. No one drowned at any of the pools I worked at, and we made a few saves each summer.

I can't help but think that being right 99.9% of the time when being wrong is catastrophic is actually a really crappy record. If I had been a lifeguard that was only right 99.999% of the time, there would have been at least one dead kid.


Ben Hoffman

Feb 16

>A highly unoriginal part of the Moral Mazes thesis is that this happens in the long run to every such organization barring an extraordinary effort. The people whose primary goal is to advance within the organization are the ones who end up in control of the organization. They do not care about the original mission, so the original mission is increasingly neglected until this threatens the organization’s ability to exist.

We happen to live at a time when there are massive subsidies for make-work, can't really generalize that well from that about what happens in more pronormative circumstances, and we have records from the recent past of institutions often behaving quite a bit better.


Don't Worry About the Vase
