Lying with Statistics in Football
In the aftermath of the Super Bowl, some of you fans may be dreading the next six months. To kick off this football drought, I'd like to highlight this article, which was featured on Yahoo yesterday. The article says that Saints quarterback Drew Brees should hope to lose the coin toss at the start of the game, because in the past 43 Super Bowls, the team that won the coin toss had only won 20 times.
An unlucky coin? Unlikely.
Um...what? Who cares? While 20/43 is slightly less than the expected 50%, this difference is not even close to being statistically significant. In fact, 20 wins is only 1 1/2 games shy of the expected 21.5, which is about as close to an even split as you could ask for. Matt Springer has posted an article that discusses why we shouldn't really care about this difference.
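To put a number on "not even close," here is a minimal sketch of an exact two-sided binomial test, assuming the coin-toss winner has a 50% chance of winning the game under the null hypothesis. The function name `two_sided_binomial_p` is mine, not something from the Yahoo article or from Springer's post.

```python
from math import comb

def two_sided_binomial_p(successes: int, trials: int) -> float:
    """Exact two-sided p-value under the assumption of a fair 50/50 split."""
    # With p = 0.5 the distribution is symmetric, so we count every outcome
    # that lands at least as far from trials / 2 as the observed one.
    observed_gap = abs(successes - trials / 2)
    extreme = sum(
        comb(trials, k)
        for k in range(trials + 1)
        if abs(k - trials / 2) >= observed_gap
    )
    return extreme / 2 ** trials

# 20 wins for the coin-toss winner in 43 Super Bowls (figures from the article)
p_value = two_sided_binomial_p(20, 43)
print(f"p-value: {p_value:.2f}")  # prints "p-value: 0.76"
```

In other words, if the toss had nothing to do with the outcome, you would see a split at least this lopsided about three times out of four.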
Of course, the sample size is naturally restricted by the small number of Super Bowls, but if the author (Mark Pesavento) had really been interested in the question of whether or not the coin toss is correlated with the winner in a football game, he could've easily collected data over a couple of seasons and obtained an answer to the question. At the very least, he could've owned up to the fact that his analysis is worthless, but instead, to the critics he offers only the following rebuttal: "because of the small sample size, some statisticians argue that the win-loss record of coin-toss winners is statistically insignificant."
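For a rough sense of what a couple of seasons would buy you, here is a sketch comparing the approximate 95% margin of error on a win fraction at two sample sizes. The figure of 512 games is my assumption (two 256-game regular seasons), and the normal approximation and the helper name `margin_of_error` are mine as well.

```python
from math import sqrt

def margin_of_error(n_games: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a proportion near 0.5."""
    # Normal approximation: z * sqrt(p * (1 - p) / n) with p = 0.5
    return z * sqrt(0.25 / n_games)

super_bowl_margin = margin_of_error(43)   # the 43 Super Bowls in the article
two_season_margin = margin_of_error(512)  # assumed: two 256-game seasons

print(f"43 games:  +/- {super_bowl_margin:.1%}")
print(f"512 games: +/- {two_season_margin:.1%}")
```

With 43 games the uncertainty is on the order of 15 percentage points in either direction, which swamps the gap between 20/43 and 50%. With a few hundred games it shrinks to around 4 points, enough to say something meaningful about any sizable effect.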
This is completely disingenuous, because it suggests that statisticians are debating the significance of the data Pesavento uses, when no such debate exists. Anyone with even a rudimentary background in statistics would understand that the sample size here is too small to support the conclusion he draws.
Moreover, Pesavento falls for one of the most common traps in statistics: mistaking correlation for causation. Even if the data were much stronger in indicating that the coin toss winner is at a disadvantage, this would not imply that Brees should hope to lose the toss. A correlation between winning the toss and losing the game does not imply that one causes the other. I feel like I've discussed this before, but just in case, here's a thorough discussion of this misconception.
Here this point is moot, since we don't even have a correlation. I thought no one would need to point out that "No correlation does not imply causation," but apparently we do.
Thankfully, most of the comments on Pesavento's post are scathing with regard to his methods. But that's cold comfort given that the article was deemed fit for the front page of Yahoo.