## December 26, 2008

### (Effect) size matters

Nathan Yau is not Edward Tufte. Yau is a doctoral student in statistics. Tufte is a Yale professor emeritus. Four of the projects on Yau's list of his five best data visualization projects of 2008 share a missing element that E.T. would pull tufts of hair out over: the images have no quantification. To Tufte, that is a cardinal sin, along with the "chartjunk" that infects so many graphs in *USA Today* and other newspapers.

I am generally on the side of Tufte on this issue: unless you're a topologist, quantity matters and units matter. A common fallacy in manuscripts (and sometimes published articles and books) is to confuse statistical significance with practical meaning. If you are working with a sample size of 50,000 or more (common with a large epidemiological study or census microdata extracts), it is hard for many relationships **not** to be statistically significant. Whether a relationship is meaningful depends on its size.
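To make the sample-size point concrete, here is a rough sketch in Python. The correlation of 0.02 is a made-up illustrative value, the function name is hypothetical, and the p-value uses a normal approximation to the t distribution (reasonable at these sample sizes), so treat it as a back-of-the-envelope check rather than a proper analysis.

```python
import math


def correlation_p_value(r: float, n: int) -> float:
    """Approximate two-sided p-value for H0: true correlation is zero.

    Uses t = r * sqrt((n - 2) / (1 - r^2)) and treats t as standard
    normal, which is an approximation that works well for large n.
    """
    t = r * math.sqrt((n - 2) / (1 - r * r))
    return math.erfc(abs(t) / math.sqrt(2))


r = 0.02  # hypothetical, very weak correlation
for n in (500, 50_000):
    p = correlation_p_value(r, n)
    print(f"n={n:>6}  r={r}  variance explained={r * r:.2%}  p={p:.2g}")
```

The same tiny correlation, explaining 0.04% of the variance, is nowhere near significant with 500 cases and comfortably significant with 50,000. The p-value changed; the size of the relationship did not.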

And here, the units matter! If you know that the multiple-regression coefficient between income and achievement is 1.5, that may or may not be notable. If you're measuring income in thousands of dollars and achievement in scale score points when the range is 0-1000 and the standard deviation is 150, that's a small relationship (going up 15 points, or 0.10 of a standard deviation, when income increases by $10,000). If you're measuring income by natural log and achievement in standard-deviation units, that's a substantial relationship (essentially moving a standard deviation up or down when the income doubles or is halved).
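The arithmetic behind those two readings of the same coefficient can be sketched in a few lines of Python. The function names are hypothetical, and the sketch assumes the coefficient can be read as a simple slope with everything else held constant.

```python
import math


def sd_change_linear(coef: float, income_change_thousands: float,
                     outcome_sd: float) -> float:
    """Predicted change in achievement, in SD units, when income is
    measured in thousands of dollars and achievement in scale points."""
    return coef * income_change_thousands / outcome_sd


def sd_change_log(coef: float, income_ratio: float) -> float:
    """Predicted change in achievement, in SD units, when income is
    measured as a natural log and achievement is already in SD units."""
    return coef * math.log(income_ratio)


coef = 1.5

# Income in $1,000s, achievement SD of 150 scale points.
print(sd_change_linear(coef, 10, 150))   # $10,000 more income
print(sd_change_linear(coef, 100, 150))  # $100,000 more income

# Log income, achievement in SD units.
print(sd_change_log(coef, 2.0))  # income doubles
print(sd_change_log(coef, 0.5))  # income is halved
```

With income in thousands of dollars, a $10,000 increase predicts 15 scale points, or 0.10 of a standard deviation, and it takes a full $100,000 to move a whole standard deviation. With logged income, doubling income predicts about 1.04 standard deviations, because 1.5 times ln 2 is roughly 1.04.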

In part stemming from the literature on meta-analysis, it is becoming more common for individual studies to report effect sizes. While I still want to have a sense of concrete relationships, pushing authors to look at quantified relationships in perspective is always good. The same should be true for "data visualization." Quantify, folks!

(For the record, I don't think Tufte is infallible. Far from it.)

Tags: effect size, statistics, visualization

Posted in The academic life on December 26, 2008 11:03 AM