Statistics, Probability, and Inference

Long ago I wrote very briefly about ‘Words of Estimative Probability’, a useful tool of the intelligence field (and others!) to codify otherwise potentially ambiguous and weaselly expressions of chance, like “probably” and “most likely” and “highly improbable”. It’s worth noting that codifying WEPs doesn’t tie them down to specific percentages; there’s still usually a good degree of “weasel room” inherent in the system. (“Probable” == “more than a fifty percent chance.” Um, yeah, helpful, huh?)

I got to talking about this stuff with some people I know, and the question came up as to why, if you’re going to bother to codify what WEPs mean in terms of percentages, you don’t just skip the WEPs and use percentages?

I’m not sure there are any good answers to that. But I think there are some interesting aspects of the issue that deserve discussion. So…

The “real answer”, by the way, is probably simply that using lots of percentages in written works is annoying and user-unfriendly. No real magic or mystery, I know. Sorry.

(A possible extension of this is that it’s not uncommon in the intelligence community to see statements like “The authors have moderately high confidence that the scheduled elections are likely to take place”, which looks like good, conscientious analysis at first glance, and probably is. If you replaced the qualifiers with percentages, though – “The authors are 60% sure that the scheduled elections have a 60-70% chance of taking place” – things can get very confusing, because not everyone will interpret a 60% confidence in a 60-70% chance the same way, and the whole point is to be understood clearly and without ambiguity. Does your head hurt, yet?)
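To make the confusion concrete, here is a rough sketch of two ways a reader might turn “60% sure of a 60-70% chance” into a single range. Both readings, the function names, and the 50% fallback are my own assumptions for illustration; neither is an official intelligence-community method.

```python
def multiply_reading(confidence, low, high):
    """Reader treats the confidence as a straight discount on the estimate."""
    return confidence * low, confidence * high


def mixture_reading(confidence, low, high, fallback=0.5):
    """Reader assumes that if the analysts are wrong, the real chance
    drops back to `fallback` (a hypothetical 'no idea' value)."""
    miss = 1.0 - confidence
    return (confidence * low + miss * fallback,
            confidence * high + miss * fallback)


# The example from the paragraph above: 60% sure of a 60-70% chance.
for name, reading in [("multiply", multiply_reading),
                      ("mixture", mixture_reading)]:
    lo, hi = reading(0.60, 0.60, 0.70)
    print(f"{name}: {lo:.0%} to {hi:.0%}")
```

Under these assumptions the first reader walks away with roughly 36-42% and the second with roughly 56-62%, from exactly the same sentence.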

A more complicated answer – and one just as real, for a given value of “real” – is that humans probably don’t treat all percentages equally. What I mean by this is that we (or at least I) tend to infer certain things about some numbers that may not be intended – or, indeed, intuitive.

“Even odds” means things could go any of a number of possible ways, pretty much by definition, and none is more likely than another. Will a new domestic Islamist insurgency rise in Sri Lanka from the ashes of the LTTE? Will surviving extremists from the war era come under the influence of foreign militants? If I tell you there are even odds of these things happening (which may or may not be true; these and all other examples in this post are things I’ve pulled out of my posterior orifice), you should hopefully infer and understand that the situation in question is hard to predict, potentially volatile, and with no really obvious long-term indicators.

On the other hand, what do you infer if I explicitly say there’s a 50% chance of these things happening? I sound indecisive, don’t I? And that indecisiveness hurts my credibility, right?

That’s how I and a lot of people I know tend to look at it, even if we don’t consciously realize we’re doing it.

A 100% chance = absolute confidence, a guarantee. But a 99% – or even 95% – chance kind of looks like someone is saying something they think you want to hear, but that they don’t really believe is true. (“There’s a 99% chance she’ll agree to go out with you, of course!” probably really means “Look, I like, respect, or work for you, and am telling you what you obviously, desperately want to hear, but that remaining 1% is my way of saving face when she laughs at your proposal, you fat slob.”) 50% = indecisiveness; 51% = uncertainty. Nobody likes a perceived weasel. Don’t ask me why.

A 70% chance sounds pretty good. A 75% chance sounds vaguely suspicious, for some reason. A 66% chance sounds even more suspicious. Non-round numbers, like a 73% chance, inspire a probably unwarranted degree of confidence. Adding decimal points – a 73.651% chance – suggests that science was involved somewhere; the interpretation of this varies from person to person.

Moderately low percent probability – 25-40% – instills distrust in everyone I know. (“Things with a thirty percent chance of occurrence seem to happen about three times in five,” one friend said.) I blame meteorologists for this. Actually, I blame meteorologists for a lot of public distrust of statistics; never before in the annals of human history have so few been so wrong about so many things so often with such precision. When it’s the middle of a drought and it doesn’t rain for a fortnight, but the forecast every day has a ten percent chance of rain, even relatively thick people start to notice that the numbers aren’t adding up as expected.
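For what it’s worth, the drought example is easy to check with a little arithmetic. This is a minimal sketch, assuming each day’s forecast is independent of the others and that the ten percent figure is well calibrated (both are simplifications, and the function names are mine):

```python
def prob_no_rain(daily_chance: float, days: int) -> float:
    """Chance of zero rainy days over `days` days.

    Assumes every day has the same, independent chance of rain.
    """
    return (1.0 - daily_chance) ** days


def expected_rainy_days(daily_chance: float, days: int) -> float:
    """Average number of rainy days you'd expect over the same stretch."""
    return daily_chance * days


# A fortnight of "ten percent chance of rain" forecasts.
print(f"Chance of a completely dry fortnight: {prob_no_rain(0.10, 14):.1%}")
print(f"Expected rainy days: {expected_rainy_days(0.10, 14):.1f}")
```

Under those assumptions a completely dry fortnight comes up a bit more than one time in five, with about 1.4 rainy days expected on average. So the dry spell feels like a broken forecast, but on its own it isn’t strong evidence of one.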

Call it a triumph of emotion over reason, I guess.

It’s now noon. Well, eleven fifty-eight, to be pedantic. Assuming the clock’s right. Either way, I infer that my lunch break is now over, so that’s it for today…

Published in: Geekiness, General | on July 28th, 2010
