Ian Barber's Blog: Benford's Law

Note: This article was originally published at PHPDeveloper on 5 April 2011.

In a recent post to his blog, Ian Barber looks at applying Benford's Law in PHP to determine whether the dataset you're working with is "real" or not.

Benford's Law is not an exciting new John Nettles based detective show, but an interesting observation about the distribution of the first digit in sets of numbers originating from various processes. It says, roughly, that in a big collection of data you should expect to see a number starting with 1 about 30% of the time, but starting with 9 only about 5% of the time.
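For reference, those rough percentages follow from the usual statement of Benford's Law, which gives the expected share of values whose first digit is d:

```latex
P(d) = \log_{10}\!\left(1 + \frac{1}{d}\right), \qquad d \in \{1, \dots, 9\}
```

This gives P(1) ≈ 0.301 and P(9) ≈ 0.046, in line with the "about 30%" and "about 5%" figures above.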

He pulls data from the site to illustrate and includes a simple PHP script that runs through the data, scoring it with a "Benford" rating. He plots these on a graph alongside the data to show the (almost exact) match between the data and the Benford numbers. You can find more details on the law on Wikipedia.
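As a rough idea of what such a check might look like, here is a minimal PHP sketch. It is not Ian Barber's script: the function names (benfordExpected, leadingDigit, leadingDigitShares, benfordDistance) are hypothetical, and the "rating" used here, the sum of absolute differences between observed and expected first-digit shares, is an assumption chosen for simplicity. The expected shares use the standard Benford formula log10(1 + 1/d).

```php
<?php
// Illustrative sketch only; names and scoring method are assumptions,
// not taken from the original post.

/** Expected share of values with leading digit $d (1-9) under Benford's Law. */
function benfordExpected(int $d): float
{
    return log10(1 + 1 / $d);
}

/** First non-zero digit of a numeric value, or null if there is none (e.g. 0). */
function leadingDigit($value): ?int
{
    // Works on the string form, so 0.0042 yields 4 and 1.5E+20 yields 1.
    if (preg_match('/[1-9]/', (string) abs($value), $m)) {
        return (int) $m[0];
    }
    return null;
}

/** Observed share of each leading digit 1-9 in $values, keyed by digit. */
function leadingDigitShares(array $values): array
{
    $counts = array_fill(1, 9, 0);
    $total = 0;
    foreach ($values as $v) {
        $d = leadingDigit($v);
        if ($d !== null) {
            $counts[$d]++;
            $total++;
        }
    }
    $shares = [];
    foreach ($counts as $d => $n) {
        $shares[$d] = $total > 0 ? $n / $total : 0.0;
    }
    return $shares;
}

/** Sum of |observed - expected| over digits 1-9; 0 would be a perfect match. */
function benfordDistance(array $values): float
{
    $distance = 0.0;
    foreach (leadingDigitShares($values) as $d => $share) {
        $distance += abs($share - benfordExpected($d));
    }
    return $distance;
}

// Usage: powers of two are a classic Benford-like sequence, while the
// integers 100..999 have evenly spread first digits.
$powersOfTwo = [];
for ($i = 0; $i < 200; $i++) {
    $powersOfTwo[] = 2 ** $i; // becomes a float once it exceeds PHP_INT_MAX
}
printf("powers of two: %.3f\n", benfordDistance($powersOfTwo));
printf("100..999:      %.3f\n", benfordDistance(range(100, 999)));
```

The powers-of-two series should score much closer to 0 than the evenly spread range. Where to draw the line between "real" and suspicious data is a judgment call that this sketch does not attempt to settle.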