No Fancy Stats argument is actually about stats. A statistic is a number that (theoretically) represents reality, and you can’t argue about reality1. Mark Borowiecki has a 5-on-5 score-adjusted CF% of 44% this season. Can’t argue with that. Patrick Wiercioch has 3 points this season and Erik Karlsson has 51. Can’t argue with that.
Analytics arguments are actually (#ACTUALLY) about dissecting what these numbers mean in terms of player evaluation and future performance. These are questions without easy answers2, and they’re really more about one’s philosophy and biases than about actual mathematics. This is why participants in a #fancystats argument mostly end up sounding like third-year undergraduates trying to nail the 5% class participation mark. This isn’t math class, it’s philosophy class.3
All this to say I’m not here to talk about stats. I’m here to talk about models, which are like stats only worse. Let me explain…
Ed note: This post is about to get wild nerdy. I don’t know how to prevent this. Turns out one can’t talk about their personal philosophy of phenomenological modelling without sounding like a huge dork. So it goes.
In a perfect world, you would describe and make predictions about all physical phenomena by applying the prescribed laws. Physicists love doing this. Physicists write down some laws, solve some differential equations, and boom, there’s General Relativity. Very few things ever work out this nicely. Most things worth studying have too many moving parts for their behaviour to be pinned down by well-defined laws. The real world is messy. This is where models come in.
A model takes inputs, does math to the inputs (to use the technical term), and spits out an output that hopefully looks like reality. A good model should help us make inferences about the relationship between Things. However, and this is very important to always bear in mind, just because a model looks like reality does not mean that it is a good stand-in for reality. This is the origin of the statistician George Box’s famous line: “All models are wrong, but some are useful.”
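To make the Box line concrete, here’s a toy sketch. Everything in it is invented and has nothing to do with hockey: we fit a straight line to data that actually follows a curve. Over the range we sampled, the model “looks like reality,” but extrapolate even a little and it falls apart.

```python
import numpy as np

# Toy illustration (invented data): the real relationship is y = x^2,
# but our "model" is a straight line fit over x in [0, 1].
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
y = x**2 + rng.normal(0, 0.02, x.size)   # reality, plus a little noise

slope, intercept = np.polyfit(x, y, 1)   # the model: y ≈ slope*x + intercept
residuals = y - (slope * x + intercept)

# The fit looks great where we have data...
rms_error = np.sqrt(np.mean(residuals**2))

# ...but extrapolate to x = 2: the line says roughly slope*2 + intercept,
# while reality says 2**2 = 4. The model looked like reality; it wasn't reality.
extrapolation_miss = abs((slope * 2 + intercept) - 4)
```

The tell, if you bother to look, is that the residuals aren’t random noise: they curve systematically, which is the model quietly confessing that its functional form is wrong.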
Which brings me back to the intentionally inflammatory title of this post: your model sucks. It does. It is in the very nature of modelling that a model is an imperfect representation of reality. Therefore, if a model is to be taken seriously, I believe that the ways in which it is imperfect must be both qualitatively and quantitatively stated (and if your default position is to say “It’s just because of variance”, I will personally wish for you to be haunted until the end of your days by the ghost of Ludwig Boltzmann.)
There is another philosophical question that must be answered, which is “What is this model for?” Is it meant to be a descriptive model (and if so, why is it better than examining the raw inputs?), or is it a predictive/evaluative model (and if so, just how predictive is it?). There are a couple of models floating around out there, and it’s not always clear what they’re supposed to be for.
Let’s look at the much-ballyhooed dCorsi. From Stephen Burtch’s post, “dCorsi represents the unexplained residual portion of Corsi results observed for a given skater in a given season,” which is to say it’s the difference between The Fancy Model and Reality. Even if dCorsi is repeatable (its year-over-year R-squared is about 0.15), all that would really mean is that the model is wrong in some consistent ways, which I would find worrying if it were my job to apply the model. I would rather just use dCorsi as a way to quantify the error bars on the model outputs. I think it’s difficult to properly use something like dCorsi as an evaluative tool when it is literally just an expression of what you don’t know.
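Here’s a back-of-the-envelope sketch of what a “repeatable residual” means. To be clear, these are invented numbers and an invented setup, not Burtch’s actual inputs: suppose each player’s residual is a small persistent piece the model misses, plus a lot of noise. Simulate two seasons and you get a year-over-year R-squared in the neighbourhood of the quoted 0.15, purely because the model is missing the same thing both years.

```python
import numpy as np

# Invented setup: residual = persistent signal the model misses + noise.
# With signal sd 0.8 and noise sd 1, the expected year-over-year correlation
# is 0.64 / (0.64 + 1) ≈ 0.39, so the expected R² is ≈ 0.15.
rng = np.random.default_rng(1)
n_players = 200
missed_signal = rng.normal(0, 0.8, n_players)  # what the model gets wrong, consistently

def season_residual():
    return missed_signal + rng.normal(0, 1, n_players)

res_year1 = season_residual()
res_year2 = season_residual()
r = np.corrcoef(res_year1, res_year2)[0, 1]
r_squared = r**2   # lands in the rough neighbourhood of 0.15
```

The repeatability here isn’t telling you the residual is a skill; it’s telling you the model whiffs on the same players the same way every year.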
Then there is this:
In general I feel like weird results such as this, where Brad Marchand has a Goals Above Replacement per 60 that is 50% higher than Patrick Kane’s, or where John Tavares and Jack Eichel have worse dCorsis than Zac Rinaldo, really say much more about the model in question than they do about the player being modeled. It’s tough for me to read this post without coming away with the impression that the values from this Expected Goals model should have some big goddamn error bars on them. Merely posting something that basically says “Aaaaaaay, look how much better Brad Marchand is than Patrick Kane” is slightly absurd, because Brad Marchand is not a better hockey player than Patrick Kane. If anything, this tweet is best understood as an illustration of how much work on these types of models still needs to be done4.
Not every deep truth about sports has to be couched in some sort of Gladwellian counter-intuition. Sometimes your model just sucks. I need to know why and by how much if I’m ever going to use it.
1. Ok just work with me on this one.↩
2. Don’t @ me.↩
3. Good example: Do secondary assists matter? Answer: it depends! Speaking of secondary assists, we here at Welcome to your Karlsson Years dot com would like to bestow the Lifetime Achievement Award in Petty Hating to Tyler Dellow for his 2012 piece (which sadly no longer exists on the internet) in which he examined every single one of Erik Karlsson’s assists in an attempt to de-legitimize EK’s point totals. You did it, boo! (You can read Travis Yost’s response here.) ↩
4. I believe Zach Lowe’s amazing piece on the Toronto Raptors’ player tracking department is an excellent indication of how much more data (i.e. a ludicrous amount of data) modelers will need before useful models can be created for hockey. Until then, I’ll settle for some big ol’ error bars on this stuff.↩