Measuring the myth of a Super Bowl Hangover

Super Bowl winners have no chance to repeat, and Super Bowl losers are downright cursed. Or are they?

Mark J. Rebilas-USA TODAY Sports

We Seahawk fans have endured many terrible hardships as a result of our team winning the Super Bowl, including the constant repetition of the "Champions Don't Repeat" mantra that started about the time the clock hit 0:00 at MetLife Stadium.

The argument for that is sketchy. I won't go so far as to say its proponents are morons (not when I can so easily imply it), but the evidence takes the form of "No Super Bowl winner managed to return to the Championship game from 1991 to 1993, from 1995 to 1998, from 2000-2004, or from 2006-2014." Yeah.

Multiple endpoint fallacies aside, I read a passing mention of the more generic Super Bowl Loser's Curse in a 49ers-related article a few weeks ago. And while cooking up an analysis to challenge just one of these alleged Curses is too much of a bother, why not kill two mummies with one silver dagger?

So today's question is: Does winning and/or losing the Super Bowl cause a team to be worse the following year?

Regression to the Mean

Regression to the mean is a statistical phenomenon whereby "if a variable is extreme on its first measurement, it will tend to be closer to the average on its second measurement."

If you will indulge me for a moment, my conscience requires some philosophical edification for the reader; regression is subject to some horrifying fallacies.

Because it is a statistical phenomenon, it occurs at the level of the observer. The measurement regresses to the mean; the thing being measured does not. Being exceptionally good or bad (or whatever) does not cause a thing to become more average.

Confused? Hang tight. I'll fix that:

Throw the names of all 32 NFL teams into a hat, draw one out at random, and measure the team quality with your ranking of choice-- let's use win-loss percentage. You pulled out the New England Patriots, who are at .800*. Throw them back and draw again. What's the expected win-loss percentage of the next team drawn? Over many trials, it will average .500. That's a no-brainer. That's regression all the way to the mean. The Patriots did not get worse; you just took a different measurement.

* numbers computed following week 16
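The hat-draw thought experiment is easy to simulate. Here's a minimal sketch with an invented 32-team league (the win percentages are made up for illustration, not the real 2014 standings): no matter how extreme the first draw was, the expected value of an independent redraw is just the league mean.

```python
import random

# Hypothetical week-16 win percentages for a 32-team league
# (invented for illustration; not the actual standings).
random.seed(1)
league = {f"Team{i}": random.randint(3, 12) / 15 for i in range(32)}

def draw_one(league):
    """Draw a single team's win percentage from the hat."""
    return random.choice(list(league.values()))

# The first draw might be an extreme team, but the NEXT draw is
# independent, so its expectation is simply the league mean.
mean = sum(league.values()) / len(league)
trials = [draw_one(league) for _ in range(100_000)]
avg = sum(trials) / len(trials)
print(f"league mean: {mean:.3f}, average of redraws: {avg:.3f}")
# Regression all the way to the mean: the redraw average tracks
# the league mean regardless of the first team drawn.
```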

Now start over:

Throw the names of all 32 NFL teams into a hat, draw out two at random, and measure the total team quality by combined win-loss percentage. You've got the Denver Broncos (11-4) and the Indianapolis Colts (10-5), who combine for .700. Now throw back one name at random, draw another, and calculate the new average. Repeat this a number of times (but always starting with the Broncos + Colts), and the average combined win-loss % after the second draw will be .617. We can call this regression toward the mean. We changed part of what was being measured, but not all.
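The partial-replacement version can be simulated the same way. The sketch below uses the Broncos' and Colts' records from the text plus an invented pool of other teams (and, as a simplification, draws the replacement from the other 30 names): because only half of the measurement changes, the average moves partway toward the league mean rather than all the way.

```python
import random

random.seed(2)
# Hypothetical win percentages for the other teams (invented for
# illustration); the two starting teams are well above average.
pool = [random.randint(3, 12) / 15 for _ in range(30)]
broncos, colts = 11 / 15, 10 / 15   # 11-4 and 10-5, from the text

def redraw_one(pair, pool):
    """Throw back one of the pair at random, draw a replacement."""
    kept = random.choice(pair)
    new = random.choice(pool)
    return (kept + new) / 2

trials = [redraw_one((broncos, colts), pool) for _ in range(100_000)]
avg = sum(trials) / len(trials)
start = (broncos + colts) / 2
print(f"start: {start:.3f}, after partial redraw: {avg:.3f}")
# Only half the measurement changed, so the result lands between
# the starting .700 and the league mean: regression TOWARD the mean.
```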

That's all there is to it. Every NFL team changes from year-to-year, with different players, different coaches, different schedules, different fumble luck, and different injuries. They don't change entirely, but we don't need to know exactly how much. As long as we know that at least some franchises, over the course of many years, pass through good and bad stages, we can predict regression to the mean with absolute certainty: If a team is measured to be "good" in any particular year, the average measurement the following year (for the same team) will be somewhat less good (and likewise for teams going from bad to less bad).

And just to drive home the fact that this is a measurement phenomenon and not a causal one (being good does not cause a team to get worse), consider that it is completely reversible: If a team is measured to be "good" in any particular year, the average measurement for the previous year will be somewhat less good.

Let's Measure That Puppy

The goal here, if you haven't guessed, is to isolate the expected regression to the mean regardless of participation in the Super Bowl, so we can see whether or not the Championship experience actually causes a team to get better or worse.

I started with 1994, when the salary cap and free agency began, thus establishing (we hope) the modern expectation for regression. I measured team quality using Pro-Football-Reference's Simple Rating System (SRS), which is simply a team's regular-season per-game point differential, adjusted for opponents. The SRS might not be as sophisticated as, say, DVOA, but it lets me examine a lot of data points without hiring a research staff. And it's actually better than win-loss percentage at predicting champions. Typical SRS across the league ranges from +12 to -12, with about 22 teams falling within one standard deviation of average (+6.5 to -6.5).
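For the curious, the SRS idea (per-game point margin plus an opponent adjustment) can be sketched as a simple iteration: each team's rating is its average margin of victory plus the average rating of its opponents, repeated until the ratings settle. The teams and scores below are invented for illustration, and this is only a toy version of what Pro-Football-Reference actually computes.

```python
# Hypothetical game results: (home, away, home_pts, away_pts)
games = [
    ("SEA", "SF", 24, 10),
    ("SF", "STL", 20, 17),
    ("STL", "ARI", 14, 28),
    ("ARI", "SEA", 13, 30),
    ("SEA", "STL", 27, 9),
    ("SF", "ARI", 23, 20),
]

teams = {t for g in games for t in g[:2]}
ratings = {t: 0.0 for t in teams}

for _ in range(100):  # iterate to (approximate) convergence
    new = {}
    for t in teams:
        margins, opp = [], []
        for home, away, hp, ap in games:
            if t == home:
                margins.append(hp - ap); opp.append(away)
            elif t == away:
                margins.append(ap - hp); opp.append(home)
        mov = sum(margins) / len(margins)            # avg margin of victory
        sos = sum(ratings[o] for o in opp) / len(opp)  # strength of schedule
        new[t] = mov + sos
    # re-center so the league average rating stays at zero
    shift = sum(new.values()) / len(new)
    ratings = {t: r - shift for t, r in new.items()}

for t, r in sorted(ratings.items(), key=lambda x: -x[1]):
    print(f"{t}: {r:+.1f}")
```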

So for roughly 617 team-years (20 years X 32 teams less 23 empty slots pending expansion), the average regression towards the mean was 2.70 SRS. That includes good teams getting worse as well as bad teams getting better (if you went from -6.0 SRS to -3.3 SRS, I measured that as a 2.70 regression). This average includes "negative regression"-- for example, if the Jaguars went from -12.0 SRS to -3.0 SRS (a regression of +9.0) and the Titans went from -3.0 SRS to -6.0 SRS (a negative regression of -3.0), the average among those two teams would be (+9.0 - 3.0)/2 = 3.0.
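That signed-averaging scheme is compact enough to write down directly. A minimal sketch, using the Jaguars/Titans figures from the example above (movement toward zero SRS counts as positive regression, movement away as negative; how a team that crosses zero should count is my assumption, not spelled out in the text):

```python
def regression_toward_mean(prev_srs, next_srs, mean=0.0):
    """Positive when the team's measurement moved toward the league mean."""
    return abs(prev_srs - mean) - abs(next_srs - mean)

# (prev SRS, next SRS) for the Jaguars and Titans, from the text
pairs = [(-12.0, -3.0), (-3.0, -6.0)]
regressions = [regression_toward_mean(p, n) for p, n in pairs]
avg = sum(regressions) / len(regressions)
print(regressions, avg)  # [9.0, -3.0] 3.0
```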

Next step: Teams who were more extreme (good or bad) tended to experience more regression. Somewhat arbitrarily, I used the average regression to set off a range of good teams that we would expect to remain at or above average even after regression-- namely, those who started with an SRS of at least +2.70. Within the sample, 215 teams met that standard, having an average SRS of +6.56 and an average regression of 3.59 the following season. Weighting each team's percentage decline equally, such above-average teams lost an average of 52.72% of their SRS from one season to the next (including many that gained SRS by getting better, and some that lost more than 100% by dropping below average).

And we're almost done. We just need to look at each Super Bowl participant's SRS change from one year to the next and compare it to what is expected via regression to the mean. Any loss in team quality (measured by SRS) beyond that is measured as the Super Bowl "hangover" (negative numbers indicating they are worse than expected):

Season Super Bowl Winner Super Bowl Loser
Team prev SRS next SRS expected hangover Team prev SRS next SRS expected hangover
1994 Dallas 9.60 10.10 4.54 5.56 Buffalo 4.80 -0.20 2.27 -2.47
1995 San Fran 11.60 11.80 5.48 6.32 San Diego 3.60 1.50 1.70 -0.20
1996 Dallas 9.70 2.40 4.59 -2.19 Pitt 4.60 5.20 2.17 3.03
1997 Gr Bay 15.30 7.70 7.23 0.47 New Eng 5.10 5.30 2.41 2.89
1998 Denv 10.70 8.90 5.06 3.84 Gr Bay 7.70 5.00 3.64 1.36
1999 Denv 8.90 3.40 4.21 -0.81 Atlanta 10.00 -7.10 4.73 -11.83
2000 St. Lou 11.90 3.10 5.63 -2.53 Titans 1.00 8.30 0.47 7.83
2001 Ravens 8.00 3.20 3.78 -0.58 NYG 2.40 -1.80 1.13 -2.93
2002 NWE 4.30 4.00 2.03 1.97 St Lou 13.40 -3.30 6.34 -9.64
2003 Tampa 8.80 1.60 4.16 -2.56 Raiders 10.60 -5.50 5.01 -10.51
2004 NWE 6.90 12.80 3.26 9.54 Carolina -0.90 -0.70 -0.43 -0.27
2005 NWE 12.80 3.10 6.05 -2.95 Phil 5.60 -2.30 2.65 -4.95
2006 Pitt 7.80 3.40 3.69 -0.29 SEA 9.10 -3.60 4.30 -7.90
2007 Indy 5.90 12.00 2.79 9.21 Bears 7.90 1.20 3.74 -2.54
2008 NYG 3.30 8.40 1.56 6.84 NWE 20.10 3.90 9.50 -5.60
2009 Pitt 9.80 1.70 4.63 -2.93 Arizona -1.90 -0.30 -0.90 0.60
2010 Saints 10.80 2.30 5.11 -2.81 Indy 5.90 2.90 2.79 0.11
2011 Gr Bay 10.90 11.40 5.15 6.25 Pitt 10.20 5.30 4.82 0.48
2012 NYG 1.60 6.20 0.76 5.44 NWE 9.30 12.80 4.40 8.40
2013 Ravens 2.90 -3.50 1.37 -4.87 San Fran 10.20 10.10 4.82 5.28
2014 Seattle 13.00 9.50 6.15 3.35 Denver 11.40 9.60 5.39 4.21
Average 8.79 5.88 4.15 1.73 7.15 2.20 3.38 -1.17
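The "expected" and "hangover" columns appear consistent with applying the average retention figure from above: above-average teams kept 100% - 52.72% = 47.28% of their SRS, so expected next-year SRS is prev SRS times 0.4728, and the hangover is the actual next-year SRS minus that expectation. A minimal sketch, checked against the 1994 rows of the table:

```python
# Above-average teams kept about 47.28% of their SRS on average,
# per the regression analysis earlier in the article.
RETAINED = 1 - 0.5272

def hangover(prev_srs, next_srs):
    """Return (expected next SRS, hangover) to two decimal places."""
    expected = prev_srs * RETAINED
    return round(expected, 2), round(next_srs - expected, 2)

# 1994 Dallas (Super Bowl winner) from the table above
print(hangover(9.60, 10.10))   # (4.54, 5.56)
# 1994 Buffalo (Super Bowl loser)
print(hangover(4.80, -0.20))   # (2.27, -2.47)
```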

Reading the Data

Are Super Bowl participants, on average, worse the following year? Yes, they are. And they're typically worse the previous year, and in any randomly chosen team-year 20 years before or after their Championship appearance. That's because we happen to be looking at very good football teams.

Do they decline more (or less) than expected, as compared to other teams of similar quality? Nope.

There remains a small discrepancy, however, whereby the Super Bowl loser appears to fare worse than the Champion. And you might think this supports the notion that finishing second among a field of 32 teams somehow deflates a team's confidence. But you'd be wrong.

No matter how good SRS is (it's decent), we're using it to measure objective team quality, and all such measurements (including win-loss record, DVOA, AFA's Team Efficiency) are subject to inaccuracy. Even with opponent adjustments, a team playing an opponent with an injured starting quarterback (for example) who is healthy the rest of the season will likely be measured as better than they really are; and a team playing an opponent with a healthy starting quarterback who is injured the rest of the season will likely be measured as weaker than they really are.

With a sample that is unbiased by any other indicator of team quality (all teams, all division winners, etc.), inaccuracies would tend to be evenly distributed, and final averages would balance out. But this is not an unbiased sample! Super Bowl participants (winner and loser) also won through a playoff bracket, so we should expect the entire collection to have a slightly higher objective quality than measured by their regular season SRS. And, indeed, the combined "hangover" (winners + losers) shows less regression to the mean than expected (by 0.56 SRS the following season).

And, finally, there is the Super Bowl itself. The result of the game doesn't cause either team to get better or worse. Rather, it's one more significant data point which further divides our sample into teams which may have been slightly over- or under-rated by regular season numbers. And, notably, it divides the sample according to team quality at the very end of the season, which means it has a slight recency advantage over regular season performance as a predictor for next season (but note that regular season performance is still a larger sample size). We would probably get similar results comparing the following season's regression for Wild Card round winners and losers, but nobody talks about the Wild Card round loser's curse. (Yet.)

In other words, if two teams show up to the Super Bowl with identically-measured team quality from the regular season (by SRS), the team that is objectively better (at the end of the season) is far more likely to show up in the Super Bowl winner's bucket, and a team that is objectively worse is far more likely to get measured as part of the Super Bowl loser's bucket.

A final analysis bears this out:

Among the nine Super Bowl losers who had a measured hangover of -2.0 SRS or worse the following season, seven had a week 17 weighted DVOA (from Football Outsiders) lower than their total DVOA. Which means there was already a measurable decline unrelated to the post-season.

Team Total DVOA Weighted DVOA Difference
93 Bills 8.7 -2.9 -11.6
98 Falcons 18.8 30.3 11.5
00 Giants 9.3 14.1 4.8
01 Rams 25.9 21.9 -4
02 Raiders 28.5 26.5 -2
04 Eagles 23.7 20.4 -3.3
05 Seahawks 28.4 26.2 -2.2
06 Bears 23.9 14.4 -9.5
07 Patriots 52.9 42.5 -10.4
Average 24.46 21.49 -2.97
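The comparison in that table is easy to recompute from the figures above. A quick sketch: a negative difference means the team was already trending down by week 17, before any post-season "hangover" could have applied.

```python
# team: (total DVOA, week-17 weighted DVOA), from the table above
dvoa = {
    "93 Bills": (8.7, -2.9),
    "98 Falcons": (18.8, 30.3),
    "00 Giants": (9.3, 14.1),
    "01 Rams": (25.9, 21.9),
    "02 Raiders": (28.5, 26.5),
    "04 Eagles": (23.7, 20.4),
    "05 Seahawks": (28.4, 26.2),
    "06 Bears": (23.9, 14.4),
    "07 Patriots": (52.9, 42.5),
}

# Teams whose weighted DVOA fell below total DVOA were already declining.
declining = [t for t, (total, wtd) in dvoa.items() if wtd < total]
avg_diff = sum(w - t for t, w in dvoa.values()) / len(dvoa)
print(len(declining), round(avg_diff, 2))  # 7 -2.97
```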

You may well ask, then, how it's possible for a team that is allegedly overrated (by regular season measurement) and suffering a late-season decline to make it to the Super Bowl at all. First of all, they are neither declining nor overrated by very much. These are still some very good teams! Second, the regular-season performance usually sets them up with favorable playoff seeding, which is a really big deal in the NFL because of the bye week.

Over the last 20 years, eight Super Bowl Champions have had to play through three rounds of playoffs, including four Wild Card teams. Which means only 60% started the season well enough to earn a first-round bye. By contrast, 85% of Super Bowl losers had a bye week, and only the 1999 Titans (at 13-3) were not a division champ.

Super Bowl losers, average hangover by seed (includes 2014, 21 seasons total)

#1 seed (10 teams) = -2.28
#2 seed (7 teams) = +0.14
#3-4 seed (3 teams) = +2.72

EDIT: There were four transcription errors discovered after publication, most notably for the 2012-2013 Ravens. The corrected figures are now posted; Super Bowl Champions still perform better than expected regression, but by a slightly smaller margin.

It should be noted, however, that the decline/gain in total SRS (vs expected) is being averaged, so that a team like the '07 Patriots, with a previous SRS of 20.1, has a hangover weighted 10 times as much as the '08 Cardinals, who actually improved following their Super Bowl loss. Measuring each team's percentage decline, the average Champion lost 15% of their SRS and the average loser lost 34%. Both figures show the complete opposite of hangover, i.e., a regression of much less than expected, given that the league average among similar teams was a 53% decline in SRS.

Measuring a percentage decline (or improvement) per team does tend to shift the weighting the other way, however (teams with a previous SRS near 0 can go up or down several hundred percent). The most meaningful average perhaps lies somewhere between, but I will leave the original formulae in place, as they are unbiased and adequate to the thesis.
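The weighting difference between the two schemes can be seen directly with the two extreme cases named above, using their SRS figures from the table ('07 Patriots: 20.1 to 3.9; '08 Cardinals: -1.9 to -0.3). This is only an illustrative sketch; how to express "percentage of SRS lost" for a team with near-zero or negative SRS is my assumption.

```python
# (prev SRS, next SRS) for the two Super Bowl losers, from the table
teams = {"07 Patriots": (20.1, 3.9), "08 Cardinals": (-1.9, -0.3)}

# Absolute change: the Patriots' huge rating dominates any average.
abs_changes = [n - p for p, n in teams.values()]
# Fraction of |SRS| lost: negative means the team improved.
pct_changes = [(p - n) / abs(p) for p, n in teams.values()]

print([round(x, 2) for x in abs_changes])   # Patriots swamp the average
print([round(x, 2) for x in pct_changes])   # near-zero teams swing wildly
```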

Conclusion

A number of sage observations have been bandied about to explain the decline of Super Bowl participants. Factors such as injury, aging, loss of players via free agency, and adaptation by the rest of the league can be collectively described as regression to the mean, and a simple measurement shows that Super Bowl participants do not regress any more than expected for above-average teams.

Other factors specific to the teams in the Championship game, such as loss of motivation (winner), loss of confidence (loser), and physical/emotional exhaustion can be collectively described as "hangover". Barring further evidence, hangover is a complete myth.

Eight of 20 Super Bowl losers regressed less than expected. The average regression among all 20 was slightly high, but no more than could be predicted by measuring (but not overmeasuring) their progression throughout the season and performance throughout the playoffs.