If my father had tailored his parenting style with the goal of giving me useful anecdotes later in life, he would have said “you can’t make a diamond by piling shit on it.” Instead I have almost no useful anecdotes about him, though he did say shit a lot.
I recently did a small examination of sophomore slumps in QBs, and one of the more persistent criticisms was that I didn’t use enough stats to paint a full picture of the seasons. I guess if he had really wanted to be useful, my dad would have said “If you make a painting out of one hundred kinds of shit it’s still just shit on canvas.” He didn’t. But I’m here to say it. Shit on canvas probably smells and it probably doesn’t look good.
When we evaluate this game we should endeavour to do it with best practices and good intentions. I’m not writing this to prove one stat is better than another but to get people thinking about how less can be more (it can’t, but whatever) and to encourage more introspection in terms of the use of stats.
There are two ways of using stats: prospectively and retrospectively. If I say Russell Wilson was amazing last year because he was number one in my heart that is a retrospective use of the heart ranking stat. If I say Andy Dalton is going to suck next year because he is over 6’ that is a prospective use of the very stable height stat.
In turn, there are two kinds of stats: descriptive and predictive. Descriptive stats attempt to describe a player’s performance whereas predictive stats attempt to describe a player’s talent level. As you might imagine, all stats are actually used in both contexts, but some are better tailored to one use or the other. For example, any time a stat that includes a clutch or leverage factor is used predictively, Tony Romo cries.
All of this is pretty straightforward, easy even. Except for the little hitch that there is no perfect or even good predictive stat in football. If there were even a half-good one we wouldn’t be making fun of the Jets so much, because Mark Sanchez would have been a Cardinal last season. Also, there wouldn’t be people arguing that Christian Ponder was even okay last season.
So, what do we do when we want to determine a player’s talent level as opposed to performance level?
First, discard all counting stats. Counting stats are good for records and trivia but that’s all a stat that isn’t per snap, attempt, drive, or game is good for. I’m all for reducing needless complexity in sporting stats but if division is needlessly complex for you then it’s probably not okay for you to surf a website with swearing. Go tell your parent(s) the bad writer said “shit” instead.
As a first-glance stat for QBs I like yards per attempt for predictive purposes and ANY/A (adjusted net yards per attempt, which includes sacks, touchdowns [+20 yards], and interceptions [-45 yards]) for performance purposes. After that I’ll look at completion percentage and sack rate. If the QB has been starting in the league for five or so years I may also look at interception rate and TD rate, but it’s unlikely.
(Quick aside: people tend to blame lines too much for the sack stat. Sack rates actually stay pretty stable for QBs outside of exceptional circumstances.)
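For anyone who wants to see the division spelled out, here is a minimal Python sketch of the rate stats mentioned above. The +20 and -45 weights are the ones quoted for ANY/A; the sack handling (subtract sack yards, add sacks to the denominator) is the usual definition as I understand it. The `PassingLine` class, the function names, and the stat line at the bottom are all invented for illustration and do not belong to any real player.

```python
from dataclasses import dataclass


@dataclass
class PassingLine:
    """One QB's raw counting stats over some sample (hypothetical container)."""
    attempts: int
    completions: int
    yards: int
    touchdowns: int
    interceptions: int
    sacks: int
    sack_yards: int


def yards_per_attempt(p: PassingLine) -> float:
    # Plain Y/A: passing yards divided by pass attempts.
    return p.yards / p.attempts


def net_yards_per_attempt(p: PassingLine) -> float:
    # NY/A: sack yardage comes off the top, sacks count as dropbacks.
    return (p.yards - p.sack_yards) / (p.attempts + p.sacks)


def adjusted_net_yards_per_attempt(p: PassingLine) -> float:
    # ANY/A: NY/A plus a 20-yard bonus per TD and a 45-yard penalty per INT.
    adjusted = p.yards + 20 * p.touchdowns - 45 * p.interceptions - p.sack_yards
    return adjusted / (p.attempts + p.sacks)


def completion_pct(p: PassingLine) -> float:
    return p.completions / p.attempts


def sack_rate(p: PassingLine) -> float:
    # Sacks per dropback (attempts plus sacks).
    return p.sacks / (p.attempts + p.sacks)


# Made-up season line, purely for illustration.
example = PassingLine(attempts=500, completions=320, yards=3800,
                      touchdowns=25, interceptions=12, sacks=30, sack_yards=200)

print(f"Y/A   {yards_per_attempt(example):.2f}")
print(f"NY/A  {net_yards_per_attempt(example):.2f}")
print(f"ANY/A {adjusted_net_yards_per_attempt(example):.2f}")
print(f"Comp% {completion_pct(example):.1%}")
print(f"Sack% {sack_rate(example):.1%}")
```

Note that every function divides by attempts or dropbacks, which is the whole point: none of these are counting stats.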
I don’t use ANY/A for predictive purposes because it adds noise rather than subtracting it. Neither passing touchdowns nor interceptions are stable enough to reflect skill across even a full season of data. In particular, interceptions are a problem. This is why when you see someone using single season interception data without going to film you should run, not walk, away.
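To get a rough feel for how noisy a single season of interceptions can be, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that every attempt is an independent coin flip with a fixed interception probability (a simple binomial model, which real football certainly is not), and the 500 attempts and 2.5% rate are made-up inputs rather than anyone's actual numbers.

```python
import math


def interception_spread(attempts: int, true_rate: float) -> tuple[float, float]:
    """Expected interceptions and their standard deviation under a
    binomial model (independent attempts, constant rate: an assumption)."""
    expected = attempts * true_rate
    std_dev = math.sqrt(attempts * true_rate * (1 - true_rate))
    return expected, std_dev


# Hypothetical QB: 500 attempts with a "true" 2.5% interception rate.
expected, std_dev = interception_spread(attempts=500, true_rate=0.025)

# A rough two-standard-deviation band around the expected total.
low, high = expected - 2 * std_dev, expected + 2 * std_dev
print(f"expected {expected:.1f} INTs, std dev {std_dev:.1f}")
print(f"rough range: {low:.1f} to {high:.1f}")
```

Under those assumptions the same QB, with exactly the same skill, expects about 12.5 picks but could plausibly finish anywhere from roughly 6 to 19 on luck alone.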
This is all to say that if I listed QBs by their NY/A (net yards per attempt) and then put a box next to that with each QB’s interception totals I would be making the table less reflective of the individual QBs’ talent levels. This is true for passer rating, QBR, total yards, yards per game, and TDs. Each one is too noisy, already accounted for better by NY/A, or both. It would all be fine if people were good at taking the drawbacks of each stat into account but we’re not.
We’re great at finding patterns and meaning where none exist. Letting ourselves see all the available data when most of it is crap is a great way to make sure our predictive analysis is crap too.
So, go forth and use good stats.