City | Population |
---|---|
A Coruña | 243,870 |
Ávila | 58,358 |
Barcelona | 1,604,555 |
Bilbao | 345,141 |
Cáceres | 95,617 |
Ciudad Real | 74,427 |
Córdoba | 327,362 |
Cuenca | 55,428 |
Donostia | 186,095 |
Girona | 97,586 |
Granada | 235,800 |
Guadalajara | 83,391 |
Huelva | 146,318 |
Huesca | 52,239 |
Lugo | 98,134 |
Madrid | 3,141,991 |
Málaga | 569,130 |
Murcia | 439,889 |
Ourense | 106,231 |
Oviedo | 221,870 |

For example, take a list of population counts for each of the more than 8000 cities in Spain, like the excerpt above. Without actually seeing that whole list, approximately what percentage of those numbers do you expect to start with the digit **one**?

You could reason that since the first digit could be any from 1 to 9, a number will start with a specific digit about one in nine times, or about 11%. You know, on average.

That seems a reasonable and logical guess, and yet it turns out that almost **one third** of the numbers on this list start with the digit ‘1’!

Check out that link. That site shows something else too. It doesn't just apply to this particular list of numbers. Nor does it only apply to lists of population counts. There are many, *many* numerical data sets out there where this Law (also called Benford's Law) applies.

Socio-economic data. Stock prices. The lengths of rivers in miles. The lengths of rivers in kilometers! Street addresses. Constants in physics. Birth rates. Death rates. The sizes of the files on your computer. It doesn't apply to everything, but it sure applies to a lot.
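That bit about miles and kilometers is worth a quick check: converting units multiplies every number by the same factor, yet the share of leading 1s should barely move. Here is a minimal sketch of that idea. Note that the "river lengths" are synthetic, not real measurements: they are values spread evenly on a logarithmic scale over six orders of magnitude, an assumption chosen because such data follows Benford's Law closely. The names `leading_digit` and `share_of_ones` are made up for this illustration.

```python
import math

def leading_digit(x):
    """First digit of a positive number, read from its scientific notation."""
    return int(f"{x:.17e}"[0])

def share_of_ones(values):
    """Fraction of the values whose leading digit is 1."""
    digits = [leading_digit(v) for v in values]
    return digits.count(1) / len(digits)

# Synthetic data (an assumption, not real rivers): 10,000 values spread
# evenly on a log scale from 1 up to (but not including) 1,000,000.
N = 10_000
lengths_miles = [10 ** (6 * k / N) for k in range(N)]

# Converting units just multiplies every value by a constant.
KM_PER_MILE = 1.609344
lengths_km = [miles * KM_PER_MILE for miles in lengths_miles]

share_miles = share_of_ones(lengths_miles)
share_km = share_of_ones(lengths_km)

print(f"miles:      {share_miles:.1%} start with 1")
print(f"kilometers: {share_km:.1%} start with 1")
print(f"Benford:    {math.log10(2):.1%}")
```

Individual numbers change their first digit when you convert them, but the overall distribution stays put.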

Benford's Law actually says something about the occurrence of *all* digits in such data sets, not just the 1:

Distribution of leading digits in data sets:

Leading digit | Share of numbers |
---|---|
1 | 30.1% |
2 | 17.6% |
3 | 12.5% |
4 | 9.7% |
5 | 7.9% |
6 | 6.7% |
7 | 5.8% |
8 | 5.1% |
9 | 4.6% |
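These percentages are not arbitrary. The usual statement of Benford's Law is that a leading digit d shows up with probability log10(1 + 1/d). The short sketch below recomputes the table from that formula; the function name `benford_share` is just a label picked for this example.

```python
import math

def benford_share(digit):
    """Expected share of numbers whose leading digit is `digit` (1 to 9)."""
    return math.log10(1 + 1 / digit)

# Rebuild the table above, one row per leading digit.
benford_table = {d: benford_share(d) for d in range(1, 10)}

for d, share in benford_table.items():
    print(f"{d} | {share:.1%} |")
```

The nine shares add up to 100%, because the sum of the logarithms collapses to log10(10) = 1.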

Realizing all this is actually good for something too.

Like, if you sell a lot of house numbers such as these on the right, I guess you might want to stock up on the lower digits? Ehm...


A more interesting application is that Benford's Law can be used for fraud detection, for example in accounting figures. People tampering with numbers tend to try to give them a nice, even distribution. That looks the most innocuous, right? Well, not when put beside Benford's Law. So it is being used in forensic accountancy too, where it is admissible as evidence in court. The math must really check out!
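To get a feel for how such a check could work, here is a minimal sketch that measures how far a set of figures strays from Benford's Law using a chi-square statistic. This is an illustration under assumptions, not the procedure forensic accountants actually follow: the function name `benford_chi_square` is invented here, the inputs are assumed to be positive whole numbers, and both sample data sets are artificial.

```python
import math
from collections import Counter

def benford_chi_square(amounts):
    """Chi-square distance between observed leading digits and Benford's Law.

    Assumes `amounts` are positive integers. Larger results mean a worse fit.
    """
    digits = [int(str(a)[0]) for a in amounts if a > 0]
    counts = Counter(digits)
    total = len(digits)
    statistic = 0.0
    for d in range(1, 10):
        expected = total * math.log10(1 + 1 / d)
        statistic += (counts[d] - expected) ** 2 / expected
    return statistic

# Artificial "too even" figures: every leading digit appears equally often.
evenly_spread = list(range(100, 1000))

# Artificial figures known to fit Benford's Law well: powers of 2.
benford_like = [2 ** k for k in range(300)]

even_score = benford_chi_square(evenly_spread)
benford_score = benford_chi_square(benford_like)

print(f"evenly spread figures: {even_score:.1f}")
print(f"Benford-like figures:  {benford_score:.1f}")
```

With nine digits there are 8 degrees of freedom, so a statistic above roughly 15.5 would be flagged at the usual 5% level. A high score is only a reason to look closer, not proof of tampering.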

Stepping it up: Can the Law be used for predicting a financial crisis?

Well... one can surely speculate:

*“Greece's public accounts deviated significantly from the distribution of values indicated by Benford's Law just before joining the Euro. It has been suggested that Greece modified their numbers in order to remain compliant with the Maastricht Treaty.”*

^{source}

Huh.

**Bonus:** Do you know the following sequence of numbers?

1 | 1 | 2 | 3 | 5 | 8 | 13 | 21 | 34 | 55 | 89 | 144 | 233 | 377 | 610 | 987 | 1597 | 2584 | 4181 | 6765 | … |

Another sequence is the factorials. The factorial of a number is the product of that number and all smaller positive whole numbers. It is denoted with an exclamation mark. So 4! = 4 × 3 × 2 × 1 = 24.

The list of factorials is as follows, and it too goes on forever:

1 | 2 | 6 | 24 | 120 | 720 | 5040 | 40320 | 362880 | 3628800 | 39916800 | 479001600 | 6227020800 | … |

And then there are the powers of 2:

1 | 2 | 4 | 8 | 16 | 32 | 64 | 128 | 256 | 512 | 1024 | 2048 | 4096 | 8192 | 16384 | 32768 | 65536 | … |

The sequences mentioned above have something in common.

Their leading digits follow the distribution given by Benford's Law **exactly**, at least in the limit as you take more and more terms.
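You don't have to take that on faith. The sketch below tallies the leading digits of the first 1000 terms of each of the three sequences and prints them next to Benford's prediction. The cut-off of 1000 terms and all the function names are choices made for this example. With a finite number of terms the match is only approximate, and the factorials settle down more slowly than the other two.

```python
import math
from collections import Counter
from itertools import islice

def fibonacci():
    """Yield 1, 1, 2, 3, 5, 8, ... forever."""
    a, b = 1, 1
    while True:
        yield a
        a, b = b, a + b

def factorials():
    """Yield 1!, 2!, 3!, ... forever."""
    n, product = 1, 1
    while True:
        yield product
        n += 1
        product *= n

def powers_of_two():
    """Yield 1, 2, 4, 8, ... forever."""
    power = 1
    while True:
        yield power
        power *= 2

def leading_digit_shares(numbers):
    """Map each digit 1 to 9 to the share of numbers starting with it."""
    counts = Counter(int(str(n)[0]) for n in numbers)
    total = sum(counts.values())
    return {d: counts[d] / total for d in range(1, 10)}

# 1000 terms keeps every number short enough to convert to text in Python.
TERMS = 1000
sequences = {
    "Fibonacci": fibonacci,
    "factorials": factorials,
    "powers of 2": powers_of_two,
}
shares = {
    name: leading_digit_shares(islice(generator(), TERMS))
    for name, generator in sequences.items()
}

print("digit | Benford | " + " | ".join(shares))
for d in range(1, 10):
    row = " | ".join(f"{shares[name][d]:.1%}" for name in shares)
    print(f"{d} | {math.log10(1 + 1 / d):.1%} | {row}")
```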

So once again... Huh.
