Pictures of a super hot model (for pass protection)

A better model for pass protection and some counterintuitive conclusions it supports.

That's Flo Rida, not Mike Robinson.
Stephen Lovekin

I swear I have no idea what SEO is. I just like misleading headlines.

Recently I wrote a bit about the pass rush. In the article I used a very simple model to predict optimal pass protection decision making. After I wrote it I decided it would be pretty trivial to design a more robust model for describing the effects of pass protection decisions.

I was wrong.

The world is ineffably complicated. So complicated that pass protection, despite being such a very small part of the world, is still beyond casual understanding. I doubt there are more than a few hundred people who intuitively understand pass protection well enough to justify having a job dealing with it in the NFL. Those people will have spent years studying the game, maybe years playing, and certainly decades watching. Finally, after all of that, when someone is willing to pay them to tell young men how to protect QBs, they won't have undergone a paradigm shift in their understanding of pass protection. They'll have only succeeded in creating a more complicated model than you or I have. Absolute understanding is absolutely unattainable and any intuitive model will suffer from the human mind's remarkable ability to grossly misinterpret and misapply learned information.

We need rigorous mathematical or logical models so that we can check them against our intuitive ones. Hopefully the inherent biases of each can be lessened by comparison.

A simple time to pressure model

This is the model I used in my article on defensive pass rush statistics. In this model each pass rusher is assigned a time to pressure (call it "T") based on rushing skill (call it "𝞻") and blocking assigned (call it "β"). If the OC has a finite amount of blocking to assign (some real value "B") such that B = β1 + β2 + ... + βn, and each β can be any real number, then the optimal solution is the one where all values of T are the same. If β can't be negative (because what the hell is negative pass protection?), the optimal solution is the one where the values of T span the smallest range (the best approximation of the true optimal solution).

This is a cripplingly simple model. It is trivial to refute its accuracy by pointing out that, given the same blocking, different defensive players can achieve pressure. But the optimization system it creates (add blocking to the best rusher until he's no longer the best then switch) makes intuitive sense so it did to be going on with for an application where pass protection was a secondary concern.
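The "add blocking to the best rusher until he's no longer the best, then switch" rule can be sketched in a few lines of Python. This is only an illustration: the additive form T = base time + β, the base times, and the names `allocate_blocking` and `step` are assumptions made for the example, not part of the original model.

```python
def allocate_blocking(base_times, total_blocking, step=0.01):
    """Greedily hand out blocking in small slices.

    Each slice goes to whichever rusher currently has the lowest time to
    pressure, which pushes the T values toward a common level.
    Assumes T = base time + beta (a hypothetical, purely additive form).
    """
    betas = [0.0] * len(base_times)
    for _ in range(round(total_blocking / step)):
        times = [t + b for t, b in zip(base_times, betas)]
        fastest = times.index(min(times))  # most dangerous rusher right now
        betas[fastest] += step
    return betas


# Hypothetical unblocked times to pressure (seconds) for a four man rush.
base_times = [2.8, 3.2, 2.6, 3.4]
betas = allocate_blocking(base_times, total_blocking=1.0)
times = [t + b for t, b in zip(base_times, betas)]
print("blocking per rusher:", [round(b, 2) for b in betas])
print("times to pressure:  ", [round(t, 2) for t in times])
```

Because β is never negative here, rushers who are already slower than the common level simply get no extra blocking, which is the "smallest range" approximation rather than perfectly equal T values.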

But I was curious if a model that stood up to more rigorous scrutiny would arrive at the same optimization conclusions.

A probabilistic model of pressure

The principal failing of the simple time to pressure model is that it describes pressure as always being achieved at one fixed moment. Clearly the right way to approach the problem is to view pressure as something that has a certain probability of being achieved as a function of time elapsed since the snap.

For various reasons I believe that the probability density function of achieving pressure can be approximately described as a normal curve. My view is corroborated by research so if you're willing to take my word for it I'll spare you more words in an already wordy article.

With that out of the way, here's a graph showing what it should look like! (The numbers are based on the NFL averages.)

[Figure: team PDF and team CPF of pressure as a function of time elapsed since the snap]

The team probability density function (PDF) shows how likely a team is to achieve pressure at any given time. The team cumulative probability function (CPF) shows the probability that a team will have achieved pressure by a given time. [Note that the CPF is usually called the CDF, but I think my wording is more intuitive.] The PDF is the derivative of the CPF.
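For anyone who wants to poke at the PDF/CPF relationship, here is a minimal Python sketch of a single normal curve. The mean of 3.0 seconds is a made-up placeholder (not the NFL average used for the graph), the 5/6 second standard deviation is the estimate used later in this article, and the function names are hypothetical.

```python
import math

MEAN = 3.0     # hypothetical mean time to pressure (s), NOT the NFL average
SIGMA = 5 / 6  # standard deviation (s), the estimate used later in the article


def pdf(x, mean=MEAN, sigma=SIGMA):
    """Normal probability density at time x."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def cpf(x, mean=MEAN, sigma=SIGMA):
    """Normal cumulative probability by time x, written with erf."""
    return 0.5 * (1 + math.erf((x - mean) / (sigma * math.sqrt(2))))


def cpf_slope(x, h=1e-5):
    """Central-difference estimate of the slope of the CPF at x."""
    return (cpf(x + h) - cpf(x - h)) / (2 * h)


# The PDF column and the numerical CPF slope column should agree.
for t in (2.0, 3.0, 4.0):
    print(t, round(pdf(t), 4), round(cpf_slope(t), 4), round(cpf(t), 4))
```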

The CPF is what we're interested in. The job of the OC is to minimize the CPF for any given passing play. He can do this by decreasing expected passing time (DOOONNNNNNT CAAAARRRE!) or by reducing the value of the CPF at the expected throw time - this is what we're trying to model.

The team CPF is simply the sum of the individual player CPFs. So the OC will try to reduce the team CPF by decreasing the values of individual CPFs.

Now I'm afraid I can't keep avoiding math that isn't included in liberal arts curricula. None of it is wildly difficult to understand conceptually and I'll try to explain all the conceptual stuff. That said, feel free to skip ahead to the post-math conclusions.

Begin math content

Each individual's CPF can be described by the following functions (here expressed in terms of a hypothetical player 1):

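$$p_1(x) = \varsigma_1 \cdot \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{x - \mu_1}{\sigma\sqrt{2}}\right)\right]$$

$$\varsigma_1 = \frac{1}{n-1}\left(1 - \frac{\mu_1}{\mu_1 + \mu_2 + \dots + \mu_n}\right)$$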

So, that's a lot to digest. Here's an explanation of terms:

  • p is the probability that a player will have achieved pressure by time x
  • x is the elapsed time
  • 𝞻 is the proportion of pressure attributable to a player - here estimated by a function of mean time to pressure. I used this symbol - final sigma - because I think it looks a little like a QB in fetal position. I'm not convinced this estimate is accurate enough to replace the actual value [player pressures/team pressures] but using the actual value would mean the model isn't usefully generalizable. More research required.
  • 𝜇 is the mean time to pressure
  • n is the number of rushing players
  • 𝜎 is the standard deviation of times to pressure - I used an estimate of 5/6 of a second because I'm lazy and I believe it should be an excellent approximation [expected range/6]
  • erf is the error function. If that doesn't mean anything to you, treat it like a mystical black box that spits out magic numbers with the help of Guinness-fed unicorns.

Here's a hypothetical graph using the following values of 𝜇 for a four man rush:

𝜇1 2.8 s
𝜇2 3.2 s
𝜇3 2.6 s
𝜇4 3.4 s

[Figure: individual CPFs for the four rushers]

That's pretty cool right? Okay, well at least I thought so. The relative probabilities make intuitive sense to me - passing the stupid test for sure. Better pass rushers are always more likely to get pressure than bad ones but the p values are closer at the extremes of the graph.
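Here is a rough Python sketch for checking those two observations yourself. One loud assumption: the 𝞻 term is taken to be (1 - 𝜇i/sum of all 𝜇)/(n - 1), a form that fits the term definitions and reproduces the team probabilities tabulated later in this article, but it is a reconstruction. The names `shares` and `player_cpf` are hypothetical.

```python
import math

SIGMA = 5 / 6               # seconds, the estimate from the list of terms
MU = [2.8, 3.2, 2.6, 3.4]   # mean times to pressure for Players 1-4 (s)


def shares(mu):
    """Assumed form of the share term: (1 - mu_i / sum(mu)) / (n - 1)."""
    total, n = sum(mu), len(mu)
    return [(1 - m / total) / (n - 1) for m in mu]


def player_cpf(x, i, mu=MU, sigma=SIGMA):
    """Probability that player i has achieved pressure by time x."""
    normal_cdf = 0.5 * (1 + math.erf((x - mu[i]) / (sigma * math.sqrt(2))))
    return shares(mu)[i] * normal_cdf


# Compare the four rushers early, in the middle, and late in the play.
for x in (0.5, 3.0, 6.0):
    values = [player_cpf(x, i) for i in range(len(MU))]
    spread = max(values) - min(values)
    print(x, [round(v, 4) for v in values], "spread:", round(spread, 4))
```

The spread column is the gap between the most and least likely rusher at each time, which is the quantity that shrinks toward the edges of the graph.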

The last addition the model needs is the ability to adjust the pass protection assigned to each rusher. To do this I'll add the term 𝜧 and reintroduce β to our vocabulary. 𝜧 represents a player's actual mean time to pressure (estimated from his career average or a projected average) and β represents the difference (in seconds) between the blocking he faces on a given play and the average blocking he faces. So:

$$\mu_1 = \mathrm{M}_1 + \beta_1$$

By adjusting β for any given player an OC will impact the pressure probability functions for every other rusher through the 𝞻 value.
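A small Python sketch shows that knock-on effect. It assumes the 𝞻 term takes the form (1 - 𝜇i/sum of all 𝜇)/(n - 1), which is a reconstruction consistent with the numbers in this article, and it borrows the four hypothetical 𝜇 values from above as the 𝜧 values. The function names are made up for the example.

```python
CAREER_MEANS = [2.8, 3.2, 2.6, 3.4]   # stand-ins for each player's career mean (s)


def shares(mu):
    """Assumed form of the share term: (1 - mu_i / sum(mu)) / (n - 1)."""
    total, n = sum(mu), len(mu)
    return [(1 - m / total) / (n - 1) for m in mu]


def adjusted_means(career_means, betas):
    """Mean on this play = career mean + beta (extra blocking delays the rusher)."""
    return [m + b for m, b in zip(career_means, betas)]


before = shares(CAREER_MEANS)
# Give Player 3 one extra second of blocking and nobody else anything.
after = shares(adjusted_means(CAREER_MEANS, [0.0, 0.0, 1.0, 0.0]))

for i, (b, a) in enumerate(zip(before, after), start=1):
    print(f"Player {i}: share {b:.4f} -> {a:.4f}")
```

Only Player 3's β changed, yet every share moves, because each share depends on the sum of all the means.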

I believe it's self evident that in this model an OC's job is to minimize the probability of pressure for a play's expected time to throw. Or, stated another way, minimize the value of the team CPF at the expected time to throw. The question is how to optimally disburse B [available blocking].

In the simple time to pressure model an OC could find the optimal use of B by assigning blocking to the best rusher until he was no longer the best, then repeating. That behavior is no longer self-evidently optimal in the new probabilistic model.
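Since the greedy rule can't be trusted here, one blunt alternative is to try every split of B and keep the best one. The Python sketch below does that on a coarse grid. Everything numeric in it is an assumption for illustration: the four mean times, a 3.5 second throw, B = 1 second handed out in 0.1 second slices, and the reconstructed share form (1 - 𝜇i/sum of all 𝜇)/(n - 1).

```python
import math
from itertools import product

SIGMA = 5 / 6                        # seconds
CAREER_MEANS = [2.8, 3.2, 2.6, 3.4]  # hypothetical mean times to pressure (s)
THROW_TIME = 3.5                     # hypothetical expected time to throw (s)
SLICE = 0.1                          # size of one slice of blocking (s)
N_SLICES = 10                        # B = 10 slices = 1.0 s of blocking


def team_cpf(x, mu, sigma=SIGMA):
    """Team probability of pressure by time x (assumed share-weighted sum)."""
    total, n = sum(mu), len(mu)
    value = 0.0
    for m in mu:
        share = (1 - m / total) / (n - 1)
        value += share * 0.5 * (1 + math.erf((x - m) / (sigma * math.sqrt(2))))
    return value


def evaluate(slices):
    """Team CPF at the throw time when rusher i receives slices[i] slices."""
    mu = [m + s * SLICE for m, s in zip(CAREER_MEANS, slices)]
    return team_cpf(THROW_TIME, mu)


# Enumerate every way of splitting the slices among the four rushers.
candidates = [s for s in product(range(N_SLICES + 1), repeat=len(CAREER_MEANS))
              if sum(s) == N_SLICES]
best = min(candidates, key=evaluate)
print("best split (slices per rusher):", best)
print("team probability of pressure:", round(evaluate(best), 4))
```

Brute force is fine for four rushers and ten slices, but the number of splits grows quickly with more rushers or finer slices.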

End scary math, also begin analysis

You're the OC of the expansion Anchorage Anoraks who are down by three in the AFC championship game against the Cleveland Browns. With 10 seconds left in the half and 60 yards to go it's time for a long developing pass play. You'll need 3.5 seconds for your aging QB Vince Wilfork to get the pass off. Cleveland will likely settle into a deep zone with a four man rush using their feared Player 1-4 defensive line. You'll send everyone but the line downfield for the play. Who do you double team?

It seems obvious that Player 3 should be double teamed. He's the best rusher so focusing on anyone else would be ridiculous. Well, I'll do the math anyways. We'll assume that the double team is worth 1 second for each rusher and stipulate that the values for 𝜇 above are now the values for 𝜧.

Player double teamed | 𝜧 (s) | Team probability of pressure by 3.5 s
None                 | --    | .7163
Player 1             | 2.8   | .6099
Player 2             | 3.2   | .6185
Player 3             | 2.6   | .6139
Player 4             | 3.4   | .6303
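The table can be recomputed with a short Python sketch. The one thing to flag: the 𝞻 term is assumed to be (1 - 𝜇i/sum of all 𝜇)/(n - 1). That form is a reconstruction, chosen because it fits the term definitions and lands on the tabulated values to four decimal places. The function and variable names are hypothetical.

```python
import math

SIGMA = 5 / 6                        # seconds
CAREER_MEANS = [2.8, 3.2, 2.6, 3.4]  # Players 1-4 (s)
THROW_TIME = 3.5                     # seconds needed to get the pass off
DOUBLE_TEAM = 1.0                    # seconds a double team adds to a rusher's mean


def team_cpf(x, mu, sigma=SIGMA):
    """Team probability of pressure by time x: sum of share-weighted normal CDFs."""
    total, n = sum(mu), len(mu)
    result = 0.0
    for m in mu:
        share = (1 - m / total) / (n - 1)   # assumed form of the share term
        normal_cdf = 0.5 * (1 + math.erf((x - m) / (sigma * math.sqrt(2))))
        result += share * normal_cdf
    return result


results = {"None": team_cpf(THROW_TIME, CAREER_MEANS)}
for i in range(len(CAREER_MEANS)):
    mu = list(CAREER_MEANS)
    mu[i] += DOUBLE_TEAM                    # double team player i only
    results[f"Player {i + 1}"] = team_cpf(THROW_TIME, mu)

for label, value in results.items():
    print(f"{label:<9} {value:.4f}")
```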

Woah.

The difference is small but the model suggests that blocking the second best pass rusher is the best use of the double team in this case. In other cases blocking even worse pass rushers is optimal. And the assumption that the double team would have the same absolute impact on each rusher makes it even crazier since better pass rushers might be better able to break through double teams. That would lead to an even more dramatic effect.

How could blocking the best pass rusher ever be a suboptimal solution? To see why we'll need more graphy goodness.

[Figure: individual PDFs and CPFs for the four rushers]

First of all look at all the super pretty colors! Thank you OSX crayon color palette for once again ruining a perfectly professional graph.

Now, note that the PDFs are the derivatives of the CPFs. If you don't follow what that means think of it as a black box that means the y value of a PDF for a specific x value is the slope of the CPF at that same x value.

When we adjust the mean time to pressure for a player by adjusting β, the CPF and PDF are translated left or right. The impact of that translation depends on the slope of the CPF over the translation. Here it is fairly easy to see that Player 1 has a higher average slope between 2.5 and 3.5 seconds than Player 3. So it makes sense that the impact of double teaming Player 1 is greater.

Essentially, as time goes by the rate of increase in the probability of achieving pressure eventually begins to drop. Therefore, double teaming a player will have less of an impact if he's already hit the point of diminishing returns. If another player has a higher rate of return over the time window in question, it makes more sense to block him even if he's worse in an absolute sense.

Actually it's a bit more complicated than that because increasing β actually decreases the slope of the CPF for the blocked player and increases the slope for other players (because of the 𝞻 term). But the impact is plain to see in the numbers.
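The two effects can be pulled apart numerically. The Python sketch below compares, for each rusher, the drop in team probability from the translation alone (shares frozen) with the full drop once the 𝞻 term is allowed to move. It assumes the reconstructed share form (1 - 𝜇i/sum of all 𝜇)/(n - 1), a 3.5 second throw, and a 1 second double team. All names are hypothetical.

```python
import math

SIGMA = 5 / 6
CAREER_MEANS = [2.8, 3.2, 2.6, 3.4]  # Players 1-4 (s)
THROW_TIME = 3.5
SHIFT = 1.0                          # seconds added by a double team


def shares(mu):
    """Assumed form of the share term: (1 - mu_i / sum(mu)) / (n - 1)."""
    total, n = sum(mu), len(mu)
    return [(1 - m / total) / (n - 1) for m in mu]


def normal_cdf(x, mean, sigma=SIGMA):
    return 0.5 * (1 + math.erf((x - mean) / (sigma * math.sqrt(2))))


def team_cpf(x, mu):
    return sum(s * normal_cdf(x, m) for s, m in zip(shares(mu), mu))


baseline = team_cpf(THROW_TIME, CAREER_MEANS)
frozen = shares(CAREER_MEANS)
shift_only, full = [], []
for i, m in enumerate(CAREER_MEANS):
    # Translation alone: the player's own curve slides right, shares unchanged.
    shift_only.append(frozen[i] * (normal_cdf(THROW_TIME, m)
                                   - normal_cdf(THROW_TIME, m + SHIFT)))
    # Full effect: shares are recomputed from the shifted means.
    shifted = list(CAREER_MEANS)
    shifted[i] += SHIFT
    full.append(baseline - team_cpf(THROW_TIME, shifted))

for i in range(len(CAREER_MEANS)):
    print(f"Player {i + 1}: shift only {shift_only[i]:.4f}, full {full[i]:.4f}")
```

With a 1 second shift, the "shift only" number is the same thing as the average slope of the player's weighted CPF between 2.5 and 3.5 seconds. The two columns do not have to rank the rushers in the same order, which is exactly the complication the 𝞻 term introduces.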

Conclusions

The primary weaknesses of the model are assuming strictly normal probability distributions and the lack of confirmation that the 𝞻 term is a good approximation of the relationship between individual players' probabilities of achieving pressure in the real world. Furthermore, optimizing a pass blocking scheme on the fly for a large number of pass rushers and possible combinations of blocking is non-trivial. Even if the model were expanded and refined to a point where its decisions were good enough to usefully inform a coach's decisions in game, a team would need to significantly upgrade their in stadium computing power. And all that isn't even considering that I haven't produced a model for the impact of blocking on individual players (in other words one that assigns a β for certain blocks on certain players).

Really this is useful as a check against intuition. A tool to inform better human decision making.

And as that tool it has been useful. According to the model assigning blocks to the best pass rusher is not always optimal. Instead, for longer throw times it makes sense to block players who are worse in the absolute sense but get better faster around the time of expected throwing.

To me that's about as intuitive as the Monty Hall problem. Which is to say that it's not intuitive at all.

Are the differences huge? No. Are they big enough to worry about? Assuming they prove real with further research, I think so. In football there are plays where a whole game, an entire season hangs on the line. Adding a 1% chance of success to that play is a no brainer. That said, the errors in the real world measurements may end up being too large to ever be sure of the optimal decision. In that case, screw you real world.

At any rate, with some tweaks and adjustments maybe this can be installed in the next RoboRussell Wilson version for even better protection calling decisions. Assuming he hasn't already written a better model himself that is.