Saturday, February 22, 2014

Intuition Behind the Birthday Bets

The "birthday bets" are a standard example in statistics classes. How many people must be in a room before it is more likely than not that two of them were born during the same month? Or in a more complex form, how many people must be in a room to make it more likely than not that two of them share the same birthday?

The misguided intro-student logic usually goes something like this. There are 12 months in a year. So to have more than a 50% chance of two people sharing a birth month, I need 7 people in the room (that is, 50% of 12 plus one more). Or there are 365 days in a year. So to have more than a 50% chance of two people sharing a specific birthdate, we need 183 people in the room. In a short article in Scientific American, David Hand explains the math behind the 365-day birthday bets.

Hand argues that the common fallacy in thinking about these bets is that people think about how many people it would take to share the same birth month or birthday with them. Thus, I think about how many people would need to be in the room to share my birth month, or my birth date. But that's not the actual question being asked. The question is about whether any two people in the room share the same birth month or the same birth date.
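To see how different the "match me" question is, here is a minimal Python sketch. It assumes every month or day is equally likely and ignores leap years; the function name `others_needed_to_match_me` is made up for this illustration and does not come from Hand's article.

```python
def others_needed_to_match_me(slots, threshold=0.5):
    """Smallest number of OTHER people for which the chance that at least
    one of them shares my particular month/day exceeds the threshold.
    Assumes each of the `slots` months or days is equally likely."""
    others = 0
    p_nobody_matches_me = 1.0
    while 1.0 - p_nobody_matches_me <= threshold:
        others += 1
        # Each additional person misses my slot with chance (slots - 1) / slots.
        p_nobody_matches_me *= (slots - 1) / slots
    return others

print(others_needed_to_match_me(12))   # birth months
print(others_needed_to_match_me(365))  # birthdays
```

Under these assumptions the answers are 8 other people for months and 253 for birthdays, which is far more than the number needed for any two people in the room to match.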

The math for the birth month problem looks like this. The first person is born in a certain month. For the second person added to the room, the chances are 11/12 that the two people do not share a birth month. For the third person added to the room, the chances are 11/12 x 10/12 that no two of the three people share a birth month. For the fourth person added to the room, the chances are 11/12 x 10/12 x 9/12 that no two of the four people share a birth month. And for the fifth person added to the room, the chances are 11/12 x 10/12 x 9/12 x 8/12 that no two of the five share a birth month. This multiplies to about 38%, which means that in a room with five people, there is about a 62% chance that at least two of them will share a birth month.
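The multiplication above can be checked with a short Python sketch. It assumes all 12 months are equally likely, and the helper name `prob_all_distinct` is just a label chosen for this example.

```python
from fractions import Fraction

def prob_all_distinct(people, slots):
    """Chance that no two of `people` share a slot (month or day),
    assuming every slot is equally likely."""
    p = Fraction(1)
    for already_in_room in range(people):
        # The next person must avoid every slot already taken.
        p *= Fraction(slots - already_in_room, slots)
    return p

p_no_match = prob_all_distinct(5, 12)
print(p_no_match, float(p_no_match))  # 55/144, about 0.38
print(float(1 - p_no_match))          # about 0.62
```

Using exact fractions keeps the arithmetic identical to the 11/12 x 10/12 x 9/12 x 8/12 product in the text.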

Applying the same logic to the birthday problem, it turns out that when you have a room with 23 people, the probability is greater than 50% that two of them will share a birthday.
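As a rough check of the 23-person figure, the same running product can be extended until the chance of a match passes 50%. This sketch assumes 365 equally likely birthdays with no leap years; `smallest_group` is a hypothetical name used only here.

```python
def smallest_group(slots, threshold=0.5):
    """Smallest number of people for which the chance that at least two
    share a slot exceeds the threshold (equally likely slots assumed)."""
    p_all_distinct = 1.0
    people = 0
    while 1.0 - p_all_distinct <= threshold:
        # Add one person, who must avoid the `people` slots already taken.
        p_all_distinct *= (slots - people) / slots
        people += 1
    return people

print(smallest_group(365))  # 23
print(smallest_group(12))   # 5
```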

I've come up with a mental image or metaphor that seems to help in explaining the intuition behind this result. Think of the birth months, or the birthdays, as written on squares on a wall. Now blindfold a person with very bad aim, and have them randomly throw a ball dipped in paint at the wall, so that it marks where it hits. The question becomes: If a wall has 12 squares, how many random throws will be needed before there is a greater than 50% chance of hitting the same square twice?

The point here is that after you have hit the wall once, there is one chance in 12 of hitting the same square with a second throw. If that second throw hits a previously untouched square, then the third throw has one chance in six (that is, 2/12) of hitting a marked square. If the third throw hits a previously untouched square, then the fourth throw has one chance in four (that is, 3/12) of hitting a marked square. And if the fourth throw hits a previously untouched square, then the fifth throw has one chance in three (4/12) of hitting a previously touched square.
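The paint-ball picture also lends itself to a quick simulation. The sketch below is only an illustration: it assumes every square is equally likely to be hit, and the function names, the number of trials, and the random seed are all arbitrary choices made for this example.

```python
import random

def throws_until_repeat(squares, rng):
    """Throw at random squares until one is hit a second time;
    return how many throws that took."""
    marked = set()
    throws = 0
    while True:
        throws += 1
        square = rng.randrange(squares)
        if square in marked:
            return throws
        marked.add(square)

def share_with_repeat_within(squares, max_throws, trials=100_000, seed=0):
    """Fraction of simulated walls on which a repeat hit occurs
    within `max_throws` throws."""
    rng = random.Random(seed)
    hits = sum(throws_until_repeat(squares, rng) <= max_throws
               for _ in range(trials))
    return hits / trials

# Should land near the 62% figure for five people and twelve months.
print(share_with_repeat_within(12, 5))
```

Because the result is a simulation, it will wobble slightly around the exact value rather than match it digit for digit.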

The metaphor helps in understanding the problem as a sequence of events. It also clarifies that the question is not how many additions it takes to match where the first throw landed (or the birthday of the first person entering the room), but whether any two match. It also helps in understanding that if you have a reasonably long sequence of events, even if none of the events individually has a greater than 50% chance of happening, it can still be likely that during the sequence the event will actually happen.

For example, when randomly throwing paint-dipped balls at a wall with 365 squares, think about a situation where you have thrown 18 balls without a match, so that approximately 5% of the wall is now covered. The next throw has about a 5% chance of matching a previous hit, as does the next throw, as does the next throw, as does the next throw. Taken together, all those roughly 5% chances one after another mean that you have a greater than 50% chance of matching a previous hit fairly soon--certainly well before you get up to 183 throws!
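How soon is "fairly soon"? A small Python sketch can count the extra throws needed, starting from a wall that already has 18 different squares marked. It assumes equally likely squares; `extra_throws_for_even_odds` is a name invented for this illustration.

```python
def extra_throws_for_even_odds(squares, already_marked, threshold=0.5):
    """Starting with `already_marked` distinct squares hit, count the
    further throws needed before the chance of at least one repeat
    among those further throws exceeds the threshold."""
    p_still_no_match = 1.0
    extra = 0
    while 1.0 - p_still_no_match <= threshold:
        # Each miss marks a new square, so the next throw faces one more.
        p_still_no_match *= 1.0 - (already_marked + extra) / squares
        extra += 1
    return extra

print(extra_throws_for_even_odds(365, 18))
```

Under these assumptions the count comes out to about 11 more throws, or roughly 29 in total, since each miss nudges the chance for the following throw a little above 5%.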