Tuesday, December 13, 2011

The Statistical Anomaly

“There are lies, damned lies, and statistics.” How is that for starting off with a cliché one-liner?! However, it is true that statistics can be quite misleading or difficult to interpret at times. Enter Simpson's Paradox. I'm going to walk through two interesting real-world examples of it.

         Applicants   Admitted
Men      8442         44%
Women    4321         35%

The above numbers are graduate school admissions at UC Berkeley from the fall of 1973. It sure seems that men were more likely to be admitted than women. Looking at these figures, would you accuse the university of gender bias? Well, some people did, and they sued!

So Berkeley decided to take a closer look at the numbers. Admissions are per department, so they wanted to find out which specific departments were guilty of a significant bias against women. Guess what... none of them were.


Here's a snippet with the admission rates of the top six departments so you can see for yourself:

Department   Men                     Women
             Applicants   Admitted   Applicants   Admitted
A            825          62%        108          82%
B            560          63%        25           68%
C            325          37%        593          34%
D            417          33%        375          35%
E            191          28%        393          24%
F            272          6%         341          7%

Look at that! If anything, what we see here is a “small but statistically significant bias in favor of women”. That's a quote from the research paper that analyzed the data (Bickel, Hammel & O'Connell, “Sex Bias in Graduate Admissions: Data from Berkeley”, Science, 1975).
It turns out that women applied more often to competitive departments that had low admission rates in general, so pooling all the departments together skews the numbers. See how easy it is to be misdirected?
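If you want to check the arithmetic yourself, here's a minimal sketch in Python that recomputes the aggregate rates from the six-department table above (so the pooled numbers won't exactly match the campus-wide 44% and 35%, which include every department):

# Per-department (applicants, admission rate) pairs copied from the
# table above; index 0 = men, 1 = women.
departments = {
    "A": ((825, 0.62), (108, 0.82)),
    "B": ((560, 0.63), (25, 0.68)),
    "C": ((325, 0.37), (593, 0.34)),
    "D": ((417, 0.33), (375, 0.35)),
    "E": ((191, 0.28), (393, 0.24)),
    "F": ((272, 0.06), (341, 0.07)),
}

def pooled_rate(sex):
    """Aggregate admission rate: total admitted / total applicants."""
    applicants = sum(d[sex][0] for d in departments.values())
    admitted = sum(d[sex][0] * d[sex][1] for d in departments.values())
    return admitted / applicants

print(f"men:   {pooled_rate(0):.0%}")   # ~46% across these six departments
print(f"women: {pooled_rate(1):.0%}")   # ~30%, despite leading in four of six

Even restricted to these six departments, the men pool to roughly 46% and the women to roughly 30%: the heavy traffic of women into the tough departments drags their aggregate down, while no single department treats them worse.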

Another real-life example, then. This one is from a medical study comparing two treatments for kidney stones. It shows the success rate of each treatment overall, and also broken down into small and large stones.
So tell me, which treatment is more effective?

              Small stones      Large stones      Both (total)
Treatment A   81 / 87 = 93%     192 / 263 = 73%   273 / 350 = 78%
Treatment B   234 / 270 = 87%   55 / 80 = 69%     289 / 350 = 83%

For small stones, the table says treatment A is best. For large stones, it says the same. But without making this distinction in stone size, it says treatment B is best. Huh?

For each stone size, the group sizes of the two treatments being compared are very different (87 versus 270 patients for small stones, 263 versus 80 for large stones). This causes problems when combining the data. See, doctors tend to give the traditional treatment A to the more severe cases with large stones, while milder cases with small stones are more often treated with the less invasive treatment B. So treatment A's total is dominated by hard cases and treatment B's total by easy cases. Since cases with small stones have a better success rate in general, this makes treatment B look better than it actually is.
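To see the reversal fall out of the arithmetic, here's a minimal sketch in Python using the raw fractions from the table above; nothing more than pooling the two stone sizes is needed to flip the comparison:

# (successes, patients) per stone size, copied from the table above.
treatments = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

for name, strata in treatments.items():
    for size, (successes, patients) in strata.items():
        print(f"{name} {size}: {successes / patients:.0%}")
    # Pooling the strata is where treatment B sneaks ahead.
    total_successes = sum(s for s, _ in strata.values())
    total_patients = sum(p for _, p in strata.values())
    print(f"{name} overall: {total_successes / total_patients:.0%}")

Treatment A wins both strata (93% versus 87%, and 73% versus 69%), yet loses overall (78% versus 83%), purely because of how the patients are distributed over the groups.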

With group sizes such as these you shouldn't combine (‘aggregate’) the data. The proper conclusion, therefore, is that treatment A is best!
