I just came across this nifty applet that lets you look at the popularity of every number between 0 and 100,000 - based on the results of a prominent search engine.
Not surprisingly, we live most of our lives happily ensconced between 0 and 100.
I don't think there is anything earth-shattering about this - it's pretty much common sense. But it's nice to see it displayed - that 'aesthetic' value of information again.
It also reminded me of a use of this type of distribution that is not commonly known. Apparently you can tell if someone is cheating on their tax forms by looking at the distribution of the digits in the numbers they report. If the digits are more evenly spread than they should be (too many digits turning up with the same frequency), they might be cheating. Normal tax returns should have skewed distributions - more numbers starting with 1 than with 7, for instance. When we cheat, we tend to be a bit too uniform - some important advice there.
I can't find a link for this, but I know this type of phenomenon has a name... need to do some digging.
PS: I don't condone tax fraud. I love the IRS. No, really, I do.
PPS: Isabel came to the rescue and just commented that the phenomenon is called Benford's Law - thanks Isabel, I needed that. Was driving me crazy.
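For the curious, here is a rough sketch of what a Benford's Law check might look like. Benford's Law says the leading digit d should turn up with probability log10(1 + 1/d), so about 30% of numbers start with 1 and under 5% start with 9. The function names and the sample data (powers of 2, which are known to follow the law closely) are my own choices for illustration, not taken from any real auditing tool, and a real check would need a proper statistical test rather than just eyeballing the two columns.

```python
import math
from collections import Counter


def benford_expected():
    """Expected share of each leading digit 1-9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(x):
    """First significant digit of a nonzero number."""
    # Strip the sign, then any leading zeros and decimal point.
    return int(str(abs(x)).lstrip("0.")[0])


def observed_shares(values):
    """Share of each leading digit 1-9 among the nonzero values."""
    digits = [leading_digit(v) for v in values if v != 0]
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


if __name__ == "__main__":
    # Illustrative data only: powers of 2 stand in for real figures.
    sample = [2 ** k for k in range(1, 200)]
    expected = benford_expected()
    observed = observed_shares(sample)
    for d in range(1, 10):
        print(f"{d}: expected {expected[d]:.3f}, observed {observed[d]:.3f}")
```

If the observed column looks much flatter than the expected one, that is the "too random" pattern described above.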
Tuesday, December 4, 2007
We live most of our lives between 0 and 100
Posted by Paul Soldera at 2:53 PM
Labels: data, design, information, numbers
4 comments:
didn't levitt talk about that in freakonomics...? the chapter about cheating teachers and sumo wrestlers.
Just gave it a short skim - hate to think I would have read it in something so recent and forgotten!
His examples are similar, but I don't think he references exactly what I am trying to remember. I had it demonstrated to me about 5 years ago and there was a definite name for it.
Driving me crazy now!
The phenomenon you're thinking of is called "Benford's Law".
OMG, thank you Isabel! That was driving me crazy. Benford's Law is indeed what it is called.