A lot of the guys around these parts use Football Outsiders' DVOA as their standard statistic for talking about football teams. I'm suspicious of DVOA, and this post will talk about why I have that suspicion, my preferred alternatives, and what DVOA does well.
First, a slight digression: What do we use statistics for? We use them to answer questions. In fact, we should use statistics to answer specific questions. Which team has a better passing game? Who's good at stopping the run? When should we go for it on fourth down? These are three important football-related questions, and each of them should call for a different statistic (or multiple stats).
And this doesn't mean that stats can answer everything. Statistical information can point us in the right direction but football is a complicated, chaotic undertaking. We should always take statistical information with a grain of salt. One reason I like Football Outsiders is that they have a healthy appreciation for the limitations of what their statistics can tell us. Sometimes I don't think they're skeptical enough of DVOA, but that's a different story.
So why don't I trust DVOA? Two reasons. One, I can't derive it myself. Two, I don't know what DVOA is trying to measure. Take a detour and go read Football Outsiders' Methods page. Did you find that illuminating or confusing?
FO doesn't try to explain what DVOA is trying to measure. They tell us that it's a per-play measure of efficiency, but let's look back at the derivation. There's a lot of talk about success points, and how they're earned, but how do those success points correlate with real, on-field football victory? That's absent from FO's discussion of their number one metric.
Moreover, is DVOA a retrospective or a predictive statistic? This is a key distinction in the world of sports stats. Jet on over to baseball, and you can see the clear difference between ERA, FIP, and xFIP. ERA tells you what happened on the field. FIP tells you the pitcher's part of that contribution. xFIP tells you something about how a pitcher is likely to perform in the future. Football Outsiders bandies DVOA about like it's an answer to all of these questions. They use it like it explains the past and predicts future success. We should be skeptical of it on this basis alone.
And can you, an outsider (no pun intended), re-create what the FO guys have built? If I can't compute it myself, then I don't know what its weaknesses and strengths are.
My objections come back to one question: "Why?" Why is DVOA built this way, and how does it purport to measure football success? That explanation is the key thing missing from DVOA, and the lack of transparency in its computation doesn't give me much faith in the system.
So where can we go for alternatives? A good place to start is Brian Burke's Advanced NFL Stats. Burke uses four main statistics to discuss the game of football: WPA (win probability added), EPA (expected points added), ANYA (adjusted net yards per attempt) and SR% (success rate).
Now look at Burke's statistical glossary. Could you derive those statistics? (Okay, could you derive them with a win probability chart and an EPA chart?) Yes. Absolutely. There's a transparency to Burke's derivations that DVOA lacks. You can get into the underlying math. This gives you an appreciation of what a statistic is trying to do, and what its limitations are.
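To make the "could you derive it?" point concrete, here's a minimal Python sketch of how EPA and WPA fall out of a chart lookup. The chart values are made-up placeholders, not Burke's actual numbers, and the names (EP_CHART, WP_CHART, value_added) are hypothetical. It also ignores scoring plays and changes of possession, which a real calculation has to handle.

```python
# Hypothetical charts keyed by (down, yards to go, yards from own goal line).
# All values below are illustrative placeholders, not real chart entries.
EP_CHART = {
    (1, 10, 20): 0.50,   # expected points before the play
    (2, 4, 26): 1.25,    # expected points after a 6-yard gain
}
WP_CHART = {
    (1, 10, 20): 0.50,   # win probability before the play
    (2, 4, 26): 0.56,    # win probability after the play
}

def value_added(chart, state_before, state_after):
    """Value added by one play: chart value after minus chart value before."""
    return chart[state_after] - chart[state_before]

before, after = (1, 10, 20), (2, 4, 26)
epa = value_added(EP_CHART, before, after)  # expected points added
wpa = value_added(WP_CHART, before, after)  # win probability added
print(f"EPA: {epa:+.2f}, WPA: {wpa:+.2f}")
```

Both statistics are the same subtraction applied to different charts. That's the whole trick, and it's why an outsider with the charts in hand can check the work.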
Moreover, when you look into Burke's descriptions of the uses of WPA and EPA you see something that FO lacks: a clear appreciation of what each statistic is useful for, and what it doesn't tell us. WPA is obviously retrospective, and tells us what happened in a game. It's also useful for calculating game decisions like when to go for it on fourth down. EPA is a superior measure of in-game efficiency. ANYA is an excellent shorthand for the efficiency of the passing game. These things have clear meanings that immediately connect back to the game played on the field.
And, to take one example, it's easy to talk to people about adjusted net yards per attempt. "Take a passer's yardage, subtract sack yards, give a penalty for interceptions and then divide by attempts plus sacks." Any meathead with a passing connection to football will be able to grasp what ANYA tells us, and why it's superior to regular yards per attempt. EPA and WPA are a bit more complicated, but also connect back to the game in obvious ways. DVOA does not do that.
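That one-sentence recipe translates almost directly into code. Here's a rough Python sketch. The function name and the sample stat line are hypothetical, and the 45-yard interception penalty is an assumption on my part (it's a commonly quoted figure, but the recipe above doesn't specify one). Some versions of the stat also add a bonus for touchdowns, which this sketch leaves out.

```python
def adjusted_net_yards_per_attempt(pass_yards, sack_yards, interceptions,
                                   attempts, sacks, int_penalty=45.0):
    """Net passing yards, minus an interception penalty, per dropback.

    int_penalty is an assumed value (45 yards per interception);
    swap in whatever figure your preferred source uses.
    """
    dropbacks = attempts + sacks
    if dropbacks == 0:
        raise ValueError("need at least one attempt or sack")
    net_yards = pass_yards - sack_yards - int_penalty * interceptions
    return net_yards / dropbacks

# Made-up season line for illustration only.
anya = adjusted_net_yards_per_attempt(
    pass_yards=4000, sack_yards=200, interceptions=10,
    attempts=500, sacks=30,
)
plain_ypa = 4000 / 500  # regular yards per attempt ignores sacks and picks
print(f"ANYA: {anya:.2f}, plain Y/A: {plain_ypa:.2f}")
```

The same quarterback looks like an 8.0 passer by plain yards per attempt and roughly a 6.3 passer once the sacks and interceptions are charged to him, which is exactly the gap the statistic is meant to expose.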
So now that I've spent 750 words running DVOA down, what do the Football Outsiders guys do well? There are two things I appreciate from the FO guys that Burke doesn't provide. (Statistically. FO's got ANS beat on the scouting side all to heck. If you aren't reading Word of Muth every week you're missing out.)
First, FO opponent-adjusts their numbers. ANS does not. The football season is short and the FO guys are right about one thing: quality of opponent matters. Now, that adds a lot of complexity to their system and makes it impossible to derive. But it's also information that ANS' numbers lack.
Second, FO understands and applies the concept of "replacement level." (To be fair, I think Burke understands replacement level just fine, but doesn't choose to try to apply it.) Replacement level is a useful theoretical fiction in sports, and I'm glad that FO is at least making a stab at it. I want someone to talk about this, and I appreciate the work that FO does in this space. I think DYAR is kind of goofy (again, derivation and relational problems), but it does give positive credit for average performance.
And one final bit: Don't lock in on one statistic. One fun thing to do on Tuesday mornings is compare FO's Quick Reads column to Burke's lists of the best quarterbacks, receivers, and running backs of the week. For one, it's been fun to see Russell Wilson's name riding high on the QB lists for the last few weeks. For another, the differences between those tables tell us as much as the similarities. DVOA and DYAR make a nice check on Burke's work and vice versa.
But at the end of the day, when I want to talk about the quality of a football team in objective terms I head back to Advanced NFL Stats and see the work that Brian Burke is doing. I read FO religiously, and I take them seriously, but I understand what Burke is trying to accomplish with his statistics. That gives me more faith in his numbers.