What Good is the Eye Test?

philosophical santa - Joe Nicholson-USA TODAY Sports

"There are various eyes. Even the Sphinx has eyes: and as a result there are various truths, and as a result there is no truth." - Friedrich Nietzsche

The eye test is a crucial tool for football analysis. First, our eyes are the primary means by which we acquire information about football. We watch the damn games. I can tell you that Russ had a 147.9 passer rating in the win over Pittsburgh, or that he completed 70% of his passes, or that he had 11.5 yards/attempt, and you can be impressed by those numbers, certainly. But it doesn't compare to the feeling of being astonished when actually watching it. When ETIII streaks across the field to snag an interception, nobody thinks, "Damn, that's an X.YZ sigma athlete." They just know, "Damn, that was an athletic play." Numbers will always struggle to convey the same awe as the actual performance. But that is exactly why they are so important.

This is a politics, religion, chatspeak, no wait, a football blog (sometimes I forget). Discussing football might not be our raison d'être, but it's our reason to be here. But the genius of Field Gulls is the commenting system that allows us little folk to discuss the game, rather than just read and enjoy the tidbits sent down from Kenneth and company. Some might disagree with me, but I think the purpose of such discussions is to elucidate truths, or if not truths, at least information, about the game of football. That doesn't mean every discussion is going to end with the discovery of some interesting facet of football, but discussions should move towards understanding, otherwise they are just a football-focused form of intellectual masturbation (and C&C has enough of that ;)).

So, with that in mind, what good is the eye test? Well, the eye test is good for hypothesizing. A great example of this is Hazbro's work on our redzone defense. He saw that our defense gave up yards, but few points, explained this qualitatively (when he described the calculus of speed vs length vs overall size), and then backed up his argument with some stats (redzone scoring statistics). A short aside: anybody thinking about writing a fanpost should read that piece once or twice, as it's a nearly perfect example of a fanpost. It's got a hypothesis, it describes a theory, and then it uses data to back it up. It takes information that anybody can find and synthesizes it. (Come back Hazbro). This piece would be much diminished without the eye test to tell Hazbro about the speed and length of the Seahawks defense.

The eye test is also great for describing small collections of plays. For example, this is the information that PFR has on Wilson's two successful two-point conversion plays.


These plays go down nearly identically in the PFR play by play, but, uh, they were... not the same, to say the least. The eye test is great at describing the differences between the plays. The stats we have at our disposal (for free, at least) don't do a great job at that kind of discerning. So, if one were to have a discussion about the relative merits of a playcall (for example), the eye test would have to be a part of the analysis. No stat, freely available or not, is going to have even 30% of the information necessary to make even a preliminary judgement. You have to see the formations, see the personnel, see the blocking schemes and routes run, see where the failures are. Stats can't tell you this. They distill the motion of 22 bodies, the planning and prep of dozens of coaches, to a single line.


It lacks the grandeur of "Beast Quake I", doesn't it?

"There are lies, damn lies, and statistics... er, the eye test" - Me

But the very thing that makes the eye test great for looking at small collections of plays makes it pretty bad at looking at whole seasons (or more) and coming to sweeping conclusions. Take a gander at this study from the 1970s, "Reconstruction of Automobile Destruction: An Example of the Interaction Between Language and Memory". The researchers showed 5 groups of people the same video of a traffic accident and then asked each group, "About how fast were the cars going when they () into each other?" For each group, the '()' was replaced by either "hit", "smashed", "collided", "bumped", or "contacted". The group that had 'smashed' in the questionnaire estimated the speed at nearly 41 MPH, while the group that had 'contacted' in the questionnaire estimated the speed at nearly 32 MPH. This is a significant gap, but it's not the only issue. The researchers also showed a car accident video to 150 students, divided up into 3 groups. One group was asked "About how fast were the cars going when they smashed into each other?", another group was asked "About how fast were the cars going when they hit each other?", and the third group was not asked about speed. A week later, the subjects were asked if they saw any broken glass in the accident (there was no broken glass in the accident). Of the students who were asked the 'smashed' question, 16/50 reported seeing broken glass, compared to 7/50 students who had 'hit' and 6/50 who weren't asked about speed. This illustrates a couple of points.

First, with just a week between the viewing and the 'broken glass' question, over 10 percent of the non-primed subjects (the 6/50 who were never asked about speed) misremembered. Therefore, using the eye test to draw conclusions from games last seen several months ago seems (and this is an assumption) like it would have even more error. Second, language has an effect on how we reconstruct our memories, both in the short term and the longer term. If you take a generic playcall and ask somebody, "How stupid was this playcall?" vs "How interesting was this playcall?", the responses you get could very well be different. Another assumption, but it doesn't seem like much of a stretch to think that the emotions we feel at the time of a play impact how we remember said play.
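For anybody who wants to see the broken-glass numbers as rates instead of fractions, here is a quick sketch in Python. The counts are the ones quoted above from the study; the variable and function names are just mine, made up for illustration.

```python
# Counts quoted above for the broken-glass question (50 students per group).
# The names here are hypothetical; only the counts come from the study.
GROUP_SIZE = 50
saw_glass = {"smashed": 16, "hit": 7, "no speed question": 6}


def false_memory_rates(counts, group_size):
    """Return the share of each group that reported glass that wasn't there."""
    return {group: n / group_size for group, n in counts.items()}


rates = false_memory_rates(saw_glass, GROUP_SIZE)
for group, rate in rates.items():
    print(f"{group}: {rate:.0%}")
```

That works out to 32% for 'smashed', 14% for 'hit', and 12% for the group that was never asked about speed, which is where the "over 10 percent" figure comes from.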

"It is the mark of a truly intelligent person to be moved by statistics" - George Bernard Shaw

This quote is less about insulting people who disagree with me about statistics, and more about it being really hard to find a positive quote about statistics. Statistics cannot tell us very much about individual plays, but they are useful in discussing larger samples. The broad brush of statistics undoubtedly obscures some of the fine brushwork provided by the eye test, but unless you're Seurat, using a fine brush to cover an entire canvas is an exercise in futility. I'm going to go deeper into this pointillism analogy (because I'm nothing if not pretentious). This is Seurat's A Sunday on La Grande Jatte: kJLHhju.0.jpg

The red box is my addition, and it contains light green and dark green, right?

WRONG! RgxDHQt.0.jpg

It certainly does contain light green and dark green, but there are also browns, reds, and blues. The difference between looking at a Seurat and watching football is that those contrasting flecks of red can often be magnified by context. We remember some of those red flecks more than others because they mattered more. But when you back up, the masses of blues and greens drown out the reds. The power of statistics is that it allows us to back up. It allows us to see the patterns in the whole canvas, even if some of the finer details are lost.

"Welcome to Lake Wobegon, where all the women are strong, all the men are good-looking, and all the children are above average." - Garrison Keillor

How many average teams were there in the NFL last year? Or, rather, how many fans would've described their team as 'average' last year? I'm not working with any data, but I'd guess that we'd see a bimodal distribution, where people are more likely to call their team bad or good than average or mediocre. bimodal1344437652326.0.gif

I don't know if team quality is distributed on a Gaussian bell curve, but there were 9 teams within 1 win of .500, nearly a third of the league, and 18 teams within 2 wins of .500, which kinda sorta suggests a bell curve. But the point is that people don't seem likely to call their team average, because they don't know what average looks like. Statistics can tell us this. They can tell us the average (mean or median) 3rd down conversion rate, points/game, yards/attempt, TD/INT ratio and much more. This gives us a point of reference against which we can compare our team to see if it is as good or bad as we thought. It is exceedingly difficult to do this with the eye test. While the eye test can probably tell the difference between how good the Seahawks are compared to say... the 49ers, how well does the eye test do in determining the magnitude of that difference? Or in comparing the Seahawks to the rest of the league? One might not have to watch every snap of every other team, but it probably takes a lot of watching to determine a baseline for average performance.

"Wisdom begins at the end" - Daniel Webster

Ok guys and gals, here comes the wisdom. The eye test and statistics are both valuable. They each have their uses and they are complementary tools. The eye test is great for inspiring hypotheses and describing the actual action of individual plays. Statistics are great for looking at large and long term trends and for creating points of reference that make meaningful comparisons possible. Football discussion requires both. But we must be careful when applying them. We must be aware of the blind spot that statistics have for the actual action of individual plays and aware that our memories often mislead us in our recollections of the past. When the eye test and the statistical record disagree, it doesn't mean that one is wrong. It probably means that both are 'wrong' and that the answer lies in the middle.

Go Hawks.