Three days before the 2016 U.S. presidential election, two forecasters got into a statistics fight on Twitter. One of them was Nate Silver, founder of the prediction website FiveThirtyEight and author of an award-winning book on probability. The other was Ryan Grim, Washington bureau chief for The Huffington Post.

At the time, the FiveThirtyEight models put the odds of Hillary Clinton winning at about 65%, while The Huffington Post put the odds at about 98%.

Shortly after the election, I was discussing this argument with a friend. She asked me: what were they arguing about, exactly? Both people thought Hillary would probably win, so what do those different numbers really *mean*?

Here’s a first try at what those numbers mean. Instead of talking about elections, let’s talk about coin flips.

What does it mean to say that, when you flip a quarter, it has a 50% chance of turning up heads?

You might have an answer in mind: it means that *if you flip the coin a lot of times, you would expect it to turn up heads on about 50% of those flips*.

So, for example, if you flip the coin 100 times, about 50 would turn up heads. If you flip it 6000 times, you would get about 3000 heads. That’s one explanation for what “50% chance” means.
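If you'd like to see that "about 50%" for yourself without wearing out a quarter, here is a minimal sketch that simulates the flips. The function name `heads_fraction` and the fixed `seed` are my own choices for illustration, and the "coin" is just Python's pseudorandom number generator, which I'm assuming is a good enough stand-in for a fair quarter.

```python
import random


def heads_fraction(num_flips, seed=0):
    """Simulate num_flips fair coin flips and return the fraction that were heads."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    # rng.random() is uniform on [0, 1), so "< 0.5" is heads half the time
    heads = sum(rng.random() < 0.5 for _ in range(num_flips))
    return heads / num_flips


# The fraction of heads should hover near 0.5, and usually gets
# closer as the number of flips grows.
for n in (100, 6000, 100_000):
    print(n, heads_fraction(n))
```

The fractions won't be exactly 0.5, which is why the explanation above says "about" 50%.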

This answer works well for some kinds of situations: probabilities for things like rolling dice and playing cards make sense with this interpretation. In an earlier post, Lessons from Go Fish, I talked a lot about how you can use this kind of probability when you’re drawing cards from a deck.

Let’s try putting the election into this interpretation. In that case, Silver is saying that if you held the election 100 times, you would expect Hillary to win about 65 of them. And Grim is saying that Hillary would win about 98 of them.

But this interpretation has issues.* In real life, you never have the same election twice, let alone 100 times. If you could magically have the same election happen a second time, with every detail exactly the same as it was the first time, people would also vote the same way that they did the first time. You wouldn’t end up with 65 of one result and 35 of the other.

In other words, the problem with explaining the election this way – that Hillary would win about 65 out of every 100 times – is that elections are things that inherently *don’t* happen 100 times. Each election happens just once. Every factor that influences the election, from a candidate’s posture at a debate to a hailstorm that stops your grandpa from going out to vote, can only happen once.

That point, that elections inherently only happen once, goes together with another important point. The factors that influence an election aren’t really random in the way we think of coin flips as random. Of course there is causality in the way a coin flips (the air pressure, the Earth’s gravity, the tension in my hand, and so on), but those factors are too complicated to get a handle on. Coin flips seem truly random. Elections do not.

So why use probability to discuss elections at all?

My understanding is essentially this. There are causative factors that affect an election, and we feel that we roughly understand some of them. Yet when it comes to predicting something like a vote, it’s impossible to know or keep track of every single relevant detail, or know exactly what effects those little things will have. And *that’s* where probability comes in.

Which brings us to another way to understand probability. I’ll discuss that in the next post.

*\*Thanks for pointing that out, SB. Some of this discussion is drawn from http://www.deeplearningbook.org/contents/prob.html around page 55.*