Wednesday, November 02, 2005
Three-Toed Sloth on Darwin and Einstein and Email
[T]his is not true; the apparent power law is merely an artifact of a bad analysis of the data, which is immensely better described by a log-normal distribution. . .
As every school-child knows (at least, these school-children do!), adding together many independent random variables, each of which makes a small contribution to the over-all result, generally gives you a Gaussian or normal distribution (unless the contributing variables are, themselves, kind of pathological). This fact is the central limit theorem.
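(An aside, not part of the quoted post: the claim is easy to check by simulation. The sketch below adds up 50 independent uniform draws many times over and asks how much of the result falls within one standard deviation of the mean; for a normal distribution that share is about 68%. The function names, the choice of uniform inputs, the sample sizes and the seed are all assumptions made for illustration.)

```python
import random
import statistics


def simulate_sums(n_samples=20000, n_terms=50, seed=0):
    """Return n_samples sums, each of n_terms independent Uniform(0, 1) draws."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [sum(rng.random() for _ in range(n_terms)) for _ in range(n_samples)]


def fraction_within_one_sd(samples):
    """Share of samples lying within one standard deviation of their mean."""
    mu = statistics.fmean(samples)
    sd = statistics.pstdev(samples)
    inside = sum(1 for x in samples if abs(x - mu) <= sd)
    return inside / len(samples)


if __name__ == "__main__":
    sums = simulate_sums()
    # Each uniform term has mean 1/2, so the sums should centre near 25.
    print("mean of sums:", round(statistics.fmean(sums), 2))
    # A normal distribution puts roughly 68% of its mass within one sd.
    print("share within one sd:", round(fraction_within_one_sd(sums), 3))
```

No single uniform draw looks remotely bell-shaped, but the sums do, which is the point of the theorem.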
What happens if the inputs are multiplied together, rather than added? Well, take the logarithm: log(XY) = log(X) + log(Y). The logarithm of the product will be the sum of the logarithms of the inputs. The latter will still be independent, so the logarithm of the output will be normally distributed. Undoing the log gives what's imaginatively called the log-normal distribution. Log-normals are very common, for the same reasons that normals are. Unlike normals, they are very easy to mistake for power law distributions, especially if your knowledge of statistics is as limited as most theoretical physicists'. (The distribution of links to weblogs, for instance, is much better fit by a log-normal than a power law, as we've seen.)
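(Another aside, not from the quoted post: the multiplicative case can be sketched the same way. Below, each sample is the product of 50 independent positive factors; the raw products come out heavily right-skewed, while their logarithms are close to symmetric, as a normal distribution would be. The factor range Uniform(0.5, 1.5), the helper names, the sample sizes and the seed are assumptions chosen for illustration, and skewness is only a rough symptom of normality, not a proper test.)

```python
import math
import random
import statistics


def skewness(samples):
    """Sample skewness: the mean cubed z-score (0 for a symmetric distribution)."""
    mu = statistics.fmean(samples)
    sd = statistics.pstdev(samples)
    return statistics.fmean(((x - mu) / sd) ** 3 for x in samples)


def simulate_products(n_samples=20000, n_factors=50, seed=0):
    """Return n_samples products, each of n_factors independent Uniform(0.5, 1.5) draws."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        math.prod(rng.uniform(0.5, 1.5) for _ in range(n_factors))
        for _ in range(n_samples)
    ]


if __name__ == "__main__":
    products = simulate_products()
    logs = [math.log(p) for p in products]  # log of a product = sum of logs
    print("skewness of raw products:", round(skewness(products), 2))
    print("skewness of log-products:", round(skewness(logs), 2))
    # For a log-normal, the long right tail drags the mean well above the median.
    print("mean / median of products:",
          round(statistics.fmean(products) / statistics.median(products), 2))
```

That long right tail is what makes a log-normal easy to mistake for a power law when the data are only eyeballed on a log-log plot.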
Update: Cosma points to the original paper in which Stouffer, Malmgren and Amaral properly reanalyzed the data. His post largely reports on their work.
Posted by Robin Varghese at 12:27 PM | Permalink