Jackson Curtis

Can you interpret this simple linear regression?



I've been thinking about this blog post all day. I highly recommend reading it. What I find so fascinating about this example is that I am convinced I could find a similar dataset, walk into a boardroom and present either analysis, and leave with the executives convinced I was right. I think that's a scary thing for analytics. If two analysts can look at a two-variable dataset, run a simple linear regression, and come away with polar opposite recommendations, there's been some kind of breakdown between 'the statistics' (the formulas/methods) and the 'decision science' (how we act on data).


I'll be very interested to see Andrew's follow-up when he shares his conclusion, but for what it's worth, here are my thoughts so far:

  1. The first graph tells you that a person with a market value of less than $25 million will have an expected EPA of less than 11.

  2. The second graph tells us that, on average, people with EPAs of 11 are making $10 million a year.

The fact that 1 & 2 can be simultaneously true is counterintuitive, but if you go through the logic you'll find that they are. In my opinion, the real problem is that neither of these interpretations naturally maps to "Is someone overpaid or underpaid?" I think to address the "overpaid/underpaid" question, you have to calculate a person's expected future performance. The problem seems to want you to assume that Rodgers's future EPA is 11 with no uncertainty. However, the other players' data from which the regressions are built either (1) assume uncertainty in the future EPA or (2) use other variables to value the players. This would be the only reason for the low correlation between money and EPA. Thus, it seems inconsistent to use this data at all to inform Rodgers's market value. If you had to twist my arm and make me choose one, I would choose B: A's argument that players with a past EPA of 11 are worth $25 million is wrong, because we can predictably expect regression to the mean in Aaron Rodgers's future performance.
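If it helps to see how 1 & 2 can both hold, here is a minimal sketch using simulated data. Everything in it is an assumption for illustration: the numbers are made up (they are not the NFL data from the original post), and names like `salary`, `epa`, and `fit_line` are hypothetical. It fits the regression in both directions and compares the salary you get by reading the first line "backwards" with the salary the second regression predicts directly.

```python
import numpy as np

def fit_line(x, y):
    """Ordinary least squares fit of y on x; returns (slope, intercept)."""
    slope, intercept = np.polyfit(x, y, 1)
    return slope, intercept

# Simulated, made-up data (NOT the real salaries or EPAs).
rng = np.random.default_rng(0)
n = 500
salary = rng.gamma(shape=2.0, scale=5.0, size=n)        # market value, $ millions
epa = 5.0 + 0.2 * salary + rng.normal(0.0, 3.0, size=n)  # noisy performance measure

# Graph 1: regress EPA on salary (expected EPA given market value).
b1, a1 = fit_line(salary, epa)
# Graph 2: regress salary on EPA (average market value given EPA).
b2, a2 = fit_line(epa, salary)

target_epa = 11.0
# Analyst A: invert the first line to find the salary whose expected EPA is 11.
salary_from_inverting = (target_epa - a1) / b1
# Analyst B: read the average salary of players with an EPA of 11 off the second line.
salary_from_reverse = a2 + b2 * target_epa

# The two slopes multiply to r^2, so the lines only coincide when |r| = 1.
r = np.corrcoef(salary, epa)[0, 1]

print(f"Salary implied by inverting graph 1: {salary_from_inverting:.1f}")
print(f"Salary predicted by graph 2:         {salary_from_reverse:.1f}")
print(f"b1 * b2 = {b1 * b2:.3f}, r^2 = {r ** 2:.3f}")
```

Because the correlation is weak, the two fitted lines are far apart, and for an above-average EPA the inverted first line gives a much higher salary than the second regression does. That gap is the regression-to-the-mean effect, not a mistake in either fit.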


What do you think? Are these graphs useful at all in deciding if a player is undervalued or overvalued? Let me know in the comments.

1 Comment


Jackson Curtis
Dec 13, 2021

Andrew Gelman finally got around to posting his thoughts. I was comforted that he didn't find a strong, straightforward answer to this either: https://statmodeling.stat.columbia.edu/2021/12/13/the-nfl-regression-puzzle-and-my-discussion-of-possible-solutions/
