Friends don't let friends play sequence. Unless they're robots.

An analysis of NFL play sequencing attempting to detect predictability. Or at least lack of randomness.

No, I'm not saying a computer could do this.

I wanted to describe this analysis in terms of Bletchley Park, but it's too easy, almost lazy, to use war metaphors describing football.

Quarterbacks lob bombs, blocking players engage opponents in the trenches, running backs are tanks, defensive backs blitz, and middle linebackers are field generals. Punters, um, kick stuff. I take it back, it's definitely lazy to describe football as war. Want a challenge? Describe football as a middle school dance.

Football is like a middle school dance.

Each boy wants to go to the dance with a particular girl. Each girl wants to go to the dance with some boy. (For the sake of simplicity, and hopefully without perpetuating the myth of rampant heteronormativity in sports blogging, I've ignored those who secretly want to go to the dance with a kid of the same gender but haven't really managed to embrace that yet.)

If you ask a date too early they'll blow you off because someone better could ask. If you ask someone too late they've probably already been asked. The more people who are interested in going to the dance with someone the earlier it makes sense to ask them.

That means that when someone asks if you want to go to the dance with Kelly (who you are, like, totally in love with) you should lie. If your competitors for Kelly's affection don't know about you they'll ask too late.

Similarly, you should use all of your friends and wiles (which I assume are a kind of speaking parrot) to figure out who else wants to ask Kelly out. That way your timing is optimal.

If you know what the opposing army is doing and they don't have any idea what you're doing then you're totally going to the dance with Kelly

-Sun Tzu

In that sense, play calling is the asking someone out to a middle school dance of the NFL. The less an opponent knows about your intentions and the more you know about his the better.

At this point a lot has been written about optimal play calling. It's the middle school dance punch table of football analysis. It's where all the nerds are hanging out. The general idea is that if you get more value out of running/passing in a situation you should call rushes/passes more in that situation. When the values are equal, play calling is optimal. The reason for this is that a defense must divide its attention between the pass and the run. If you call a lot of passes it will be good for your rush yards per attempt as the defense shifts its attention to the pass, but you won't be taking advantage of the increased yards per carry because - as we discussed - you're throwing the ball.

There's a huge unstated assumption in that paragraph. If Jaime is reading my diary then it doesn't matter when I'm planning on asking Kelly to the dance. Jaime will just do it the day before me.

If the defense knows the offense's play call pre-snap it doesn't matter what proportion of plays are called rushes or passes. They're going to get stuffed by the right defense regardless. The scenario assumed that the play calls were unpredictable.

The way I see it there are two ways to make a play call predictable: interception and prediction. Interception is the New England method and it's cheating. It's the selling oregano as weed at a middle school dance of the NFL. You're going to be punished by the teachers and all the other students will hate you.

I don't think prediction is cheating though. If a coach calls run-run-pass without fail on every set of downs you don't have to keep pretending he might pass on first down. If a pitcher always throws first pitch fastballs you don't have to look out for a change up. If an organization can't make their play calling unpredictable that's a coaching failure that should be exploited.

Let's make the use of the term unpredictable clear. I can predict that a team facing third and long with 10 seconds left in a game, three points down, and outside of field goal range will throw. Last season I could predict that Detroit was much more likely to throw than run.

That's just acting on knowledge about the optimal play call mix in a given state. If a team were using a random number generator to pick their play calls what we're discussing is the weights they assign to the plays in different game states. Those are public knowledge.

What I'm saying is, what if the number generator itself is broken (or non-existent)? Or what if the people who are in between the generator and the play - head coach, offensive coordinator, and QB - are predictably overriding it?

This isn't a scouting exercise about formations or personnel groupings. All the information I'll use in this analysis is game state and previous play calls. If this data were applied in the scary uncertain real world it could be used as soon as the last play was finished - before substitutions or huddles.

But surely NFL teams don't leave themselves open to these sorts of exploits? Well, it looks like some of them do.


I used the full 2012 regular season play-by-play set. Each play was defined as a pass, run, kick, or no play. Kicks and no-plays were removed from the data set.

I defined passes as a success (1) and rushes as a failure (0) and assessed the number of single outcome runs of any given length for each team. Note that any ambiguous use of run in this article refers to a string of plays of the same type. Ball carries will be referred to as rushes.
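A minimal sketch of that counting step, assuming each team's season is already a list of 1s (passes) and 0s (rushes) in play order, and assuming pass runs and rush runs are pooled into one tally. The function name and the example sequence are made up for illustration:

```python
from collections import Counter
from itertools import groupby


def run_length_counts(plays):
    """Tally how many runs (strings of the same play type) of each length occur.

    `plays` is a sequence of 1s (passes) and 0s (rushes) in play order.
    Pass runs and rush runs are pooled together (an assumption).
    """
    counts = Counter()
    for _, group in groupby(plays):
        counts[len(list(group))] += 1
    return counts


# Hypothetical sequence: pass, pass, rush, pass, rush, rush, rush
example = [1, 1, 0, 1, 0, 0, 0]
counts = run_length_counts(example)
for length in sorted(counts):
    print(f"runs of length {length}: {counts[length]}")
```

Here the pass-pass start is one run of length two, the lone rush and lone pass are two runs of length one, and the three closing rushes are one run of length three.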

Here's a graph of the NFC West data to let you visualize that (note the vertical axis is on a log scale for ease of viewing).


Again, remember that run is a string of same play types. Not a rush. I just can't emphasize that enough.

The expected number of runs is expressible by the following equation:


  • y is the expected number of runs
  • n is the size of the sample
  • p is the probability of a throw
  • x is the length of the run
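For readers who want to play along, here is one standard way to get such an expectation, assuming every play is an independent pass with the same probability p. It is a sketch using the n, p, and x defined above, not necessarily the exact expression used in this analysis, and the function name is hypothetical:

```python
def expected_runs(n, p, x):
    """Expected number of runs of exactly length x in n independent plays.

    Each play is a pass with probability p and a rush with probability 1 - p.
    Pass runs and rush runs are counted together (an assumption).
    """
    if x < 1 or x > n:
        return 0.0
    q = 1.0 - p
    if x == n:
        # The whole sample is a single run of one play type.
        return p ** n + q ** n

    def one_type(a, b):
        # a: probability of the play type inside the run
        # b: probability of the other play type, which must bound the run
        edge = 2 * a ** x * b                    # run starts at play 1 or ends at play n
        interior = (n - x - 1) * a ** x * b ** 2  # run bounded on both sides
        return edge + interior

    return one_type(p, q) + one_type(q, p)


# Hypothetical inputs: 1000 plays, 58% passes
for x in range(1, 5):
    print(x, round(expected_runs(1000, 0.58, x), 1))
```

A quick sanity check on the formula: every play belongs to exactly one run, so the expected run counts weighted by their lengths have to add back up to n.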

As a rough first pass I compared the actual runs to expected runs for each team and pulled out teams that differed at the .99 level of significance from expected run totals for runs ranging from 1-4 in length. I ended up with this table:

Teams\Run Length 1 2 3 4
Cleveland SIG
Dallas SIG (-)
Denver SIG SIG
Detroit SIG SIG
New England SIG SIG
New Orleans SIG SIG
NY Giants SIG
Philadelphia SIG
St. Louis SIG (-)

The minus signs after Dallas and St. Louis indicate that their totals are lower than expected.

I then took these nine teams and ran multiple regressions against their play calling (using score diff, yards to go, down, and time left) to get a better estimate of the actual probability of passing in any given situation. Since the resulting string of probabilities can't be expressed as a function to my knowledge I went ahead and Monte Carlo'd the shit out of each team's season - ran many simulations of each - to get a reasonable approximation for the expected run totals.
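The simulation step can be sketched roughly like this, assuming the regression has already produced a pass probability for every play of a team's season (the `pass_probs` list below is a placeholder, not real output). The function names, simulation count, and seed are all made up for illustration:

```python
import random
from collections import Counter
from itertools import groupby
from statistics import mean, pstdev


def run_length_counts(plays):
    """Tally runs (strings of the same play type) by length."""
    counts = Counter()
    for _, group in groupby(plays):
        counts[len(list(group))] += 1
    return counts


def simulate_run_counts(pass_probs, n_sims=2000, max_len=4, seed=0):
    """Simulate seasons play by play; return {run length: (mean, sd)} of run counts.

    pass_probs[i] is the modeled chance that play i is a pass (assumed given).
    """
    rng = random.Random(seed)
    samples = {x: [] for x in range(1, max_len + 1)}
    for _ in range(n_sims):
        season = [1 if rng.random() < p else 0 for p in pass_probs]
        counts = run_length_counts(season)
        for x in samples:
            samples[x].append(counts[x])
    return {x: (mean(v), pstdev(v)) for x, v in samples.items()}


def z_score(actual, expected_mean, expected_sd):
    """Standard deviations between the actual count and the simulated mean."""
    return (actual - expected_mean) / expected_sd


# Placeholder probabilities: a flat 58% pass rate over 1000 plays.
pass_probs = [0.58] * 1000
for x, (mu, sd) in simulate_run_counts(pass_probs).items():
    print(f"length {x}: expected {mu:.1f} runs, sd {sd:.1f}")
```

Feeding a team's actual run count, the simulated mean, and the simulated standard deviation into `z_score` gives the kind of figure a significance test would use. A two-sided cutoff at the .99 level corresponds to roughly 2.58 standard deviations.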

The new significance figures follow:

Teams\Run Length 1 2 3 4
Cleveland SIG
Dallas SIG (-)
Denver SIG SIG
New England
New Orleans SIG SIG
NY Giants SIG
St. Louis SIG (-)

Just for funsies here are the actual z-scores (or standard deviations from mean in the expected model)


Note that the sample size for the actual data gets wonky after run lengths of about six.


I'll start by explaining why I did what I did.

If you make up a random sequence of fifty coin flips, then actually flip a coin fifty times and record the results, odds are (haha) that I'll be able to tell the difference at a glance. Humans have a deep-seated tendency to avoid long runs. Random chance does not.
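If you'd like to see how long the runs in genuinely random flips get, here is a small sketch that simulates fifty fair flips many times and reports the average longest run. The trial count and seed are arbitrary choices for illustration:

```python
import random
from itertools import groupby


def longest_run(flips):
    """Length of the longest string of identical outcomes in a sequence."""
    return max(len(list(group)) for _, group in groupby(flips))


def average_longest_run(n_flips=50, n_trials=5000, seed=1):
    """Average longest run (heads or tails) across many simulated flip sequences."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_trials):
        flips = [rng.randint(0, 1) for _ in range(n_flips)]
        total += longest_run(flips)
    return total / n_trials


print(f"average longest run in 50 flips: {average_longest_run():.2f}")
```

Compare that number with the longest run in a sequence you wrote down by hand while trying to look random.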

There are many ways I could have checked the play-by-play for signs of non-random selection but I believed that the principal weakness would be in people reducing the number of long runs. That's what we do. Of course, if there are too few long runs then there will be too many short ones.

Of the nine teams I ran through the full analysis to adjust for game states, five ended up showing signs of what I expected.

For New Orleans my model expected 235 single play runs. The team had 306. I expected 104 two play runs. Got 147. These numbers are not trivial. After those two, every kind of run, except for the odd 16 or 31 play string of passes to end games, is less prevalent than expected. If I were a DC playing New Orleans I believe I could use this information.

New Orleans is the most egregious example but Denver's totals are also more than three standard deviations removed from expectation.

I honestly don't understand what's going on in St. Louis but they had 196 one play strings against 238 expected. So that may be usable.

Last, I want to note New England. After running the passing odds multiple regression and running the simulations, their actual values matched up startlingly well with expected. I was surprised to pick them up on the first pass with the naive model but I guess they might be well coached after all.


My multiple regression wasn't the best thing ever. I believe it was adequate and I think the New England data supports that. Simplifications such as a naive pass/run dichotomy make the practical implications of this somewhat suspect. Yup, this is a blog, not a FO - if someone were paying to run this analysis you could bet your sweet boopy that this would be more robust.

You can't draw any conclusions about the twenty-three teams I left out of my final analysis. After running a similar process on them some might become significantly different from expected - others might get closer.

I think the safe conclusion is that there is room to exploit certain teams' play calling. Their sequencing is non-random and weakly predictable. Why? I don't know. Bad coaching or putting too much power in the hands of the QB maybe. I doubt this is the sort of gap that will exist in the league forever as coaching staffs become more savvy. But while this exists it should be taken advantage of. Hopefully by the Seahawks.