November 02, 2005
Three-Toed Sloth on Darwin and Einstein and Email
Cosma Shalizi weighs in on Albert-László Barabási's claim that correspondence, in email and other forms, follows a power law distribution because of a queuing process:
[T]his is not true; the apparent power law is merely an artifact of a bad analysis of the data, which is immensely better described by a log-normal distribution. . .
As every school-child knows (at least, these school-children do!), adding together many independent random variables, each of which makes a small contribution to the over-all result, generally gives you a Gaussian or normal distribution (unless the contributing variables are, themselves, kind of pathological). This fact is the central limit theorem.
What happens if the inputs are multiplied together, rather than added? Well, take the logarithm: log(XY) = log(X) + log(Y). The logarithm of the product will be the sum of the logarithms of the inputs. The latter will still be independent, so the logarithm of the output will be normally distributed. Undoing the log gives what's imaginatively called the log-normal distribution. Log-normals are very common, for the same reasons that normals are. Unlike normals, they are very easy to mistake for power law distributions, especially if your knowledge of statistics is as limited as most theoretical physicists'. (The distribution of links to weblogs, for instance, is much better fit by a log-normal than a power law, as we've seen.)
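The multiplicative argument in the quote can be checked with a small simulation. This is only an illustrative sketch, not anything from Shalizi's post or the reanalysis he reports: the choice of uniform factors on (0.5, 1.5), the 50 factors per product, the 20,000 samples, and the helper names are all assumptions made for the example. It multiplies independent positive factors together and compares the skewness of the products with the skewness of their logarithms; a roughly symmetric (near-zero skew) log is what the central limit theorem would lead one to expect.

```python
import math
import random
import statistics


def product_of_factors(n_factors, rng):
    """Multiply n_factors independent positive random factors together.

    The uniform(0.5, 1.5) factor distribution is an arbitrary assumption.
    """
    product = 1.0
    for _ in range(n_factors):
        product *= rng.uniform(0.5, 1.5)
    return product


def skewness(values):
    """Sample skewness: the mean cubed z-score (about 0 for a symmetric sample)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


rng = random.Random(0)  # fixed seed so the run is repeatable
products = [product_of_factors(50, rng) for _ in range(20000)]
logs = [math.log(p) for p in products]  # log of a product = sum of logs

# The raw products have a long right tail; their logs should look close to normal.
print("skewness of products:     ", round(skewness(products), 2))
print("skewness of log(products):", round(skewness(logs), 2))
```

The sketch only illustrates why products of many small independent factors tend toward a log-normal shape. It does not test whether any real data set is better fit by a log-normal than by a power law.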
Update: Cosma points to the original paper in which Stouffer, Malmgren and Amaral properly reanalyzed the data. His post largely reports on their work.
Posted by Robin Varghese at 12:27 PM
Comments
While I appreciate the link, it's only fair to point out that my post just reports the work done by Stouffer, Malmgren and Amaral, who went to the trouble of actually analyzing the data properly.
Posted by: Cosma | Nov 2, 2005 1:07:51 PM