This post is a companion piece to our Viz of the Week, which you can check out here.
Alright, I have been wanting to look into this for a while now.
We (in the Royal sense of the word) attempt to make sense of the world around us using statistics. However, as the old saying goes, “correlation does not imply causation.” In other words, teams that win also tend to perform well in certain key statistical categories. That is fundamentally different from saying that teams that perform well in these statistical categories are guaranteed to win.
Scoring lots of Touchdowns does not mean you will win, especially when you give up even more Touchdowns to the opponent. It seems obvious, but it’s critical to explain this before we jump into the analysis.
I’m going to start with a quick overview of our methodology. If you don’t care, or you want to jump right into the findings, just skip over the “Our Methodology” section to the “How to Win an NFL Football Game” section.
Our Methodology
Let me start off by saying that this little research project is a good start, but it is by no means comprehensive and there is much more work to be done. For starters, you cannot be good at one thing and expect to win a Super Bowl. Yet, our analysis isolates specific team stats and compares them to wins univariately.
That’s a mathematical way of saying we are looking at these team stats and their relationship to wins one at a time rather than in conjunction with one another.
Doing this analysis right requires a multiple regression analysis, and that will come later. For now, we are presenting a series of simple (single-variable) linear regression analyses.
It’s a jumping-off point.
In the future, the goal is to find a “winning formula” rather than finding “certain things that are correlated to winning,” which is what we are doing here.
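For the curious, the one-stat-at-a-time approach can be sketched in a few lines of Python. This is only a rough illustration, not the actual workbook behind the Viz: the team numbers are made up, and the names fit_line, r_squared, and rank_stats are hypothetical helpers. Each stat gets its own simple linear regression against wins, and the stats are then ranked by r2 (defined in the list below).

```python
# Made-up numbers for five imaginary teams; NOT the article's data.
wins = [112, 95, 80, 64, 51]
stats = {
    "red_zone_tds": [410, 370, 330, 270, 240],
    "penalty_yards": [8100, 7900, 8300, 8000, 8200],
}

def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x for a single stat."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def r_squared(x, y):
    """Share of the variation in y explained by the fitted line."""
    slope, intercept = fit_line(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

def rank_stats(stats, wins):
    """Regress wins on each stat separately, then sort by r2, best first."""
    scores = {name: r_squared(values, wins) for name, values in stats.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

ranking = rank_stats(stats, wins)
for name, score in ranking:
    print(f"{name}: r2 = {score:.3f}")
```

Because every stat is fit on its own, nothing here accounts for how the stats overlap with one another, which is exactly the limitation described above.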
Secondly, for the statistically uninitiated, there are three key statistical measures that we are using to determine the most important indicators that translate to wins.
1. Correlation - Pearson Correlation (r): this measures the strength and direction of the linear relationship between two variables, and it can range from -1 to 1. If r = 1, there is a perfect, positive correlation between the two variables. Under these circumstances, wins rise in lockstep with the stat and every data point falls exactly on an upward-sloping line (though not necessarily at a rate of one win per unit of the stat). If r = -1, the exact opposite would be true (as the stat goes up, wins go down in lockstep). If r = 0, there is no discernible linear correlation.
2. R-Squared (r2): measures the percent of the variability in the dependent variable (in this case wins or losses) that is explainable by the independent variable (in this case the indicators that we will examine). So, if a stat has an r2 of 0.7, then 70% of the variation in wins is explainable by the selected indicator.
3. Statistical Significance - P-Value: I won’t get into hypothesis testing here, but in short, the p-value is the probability of seeing a relationship at least this strong purely by chance if the stat and wins were actually unrelated. In other words, the smaller the p-value, the more statistically significant the result.
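If you want to see the three measures in action, here is a minimal sketch in plain Python. A few assumptions to flag: the data is made up, pearson_r and permutation_p_value are hypothetical helper names, and the p-value is estimated with a shuffle-based permutation test, which is a stand-in technique and not necessarily the test used to produce the p-values in this post.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(x, y, trials=2000, seed=0):
    """Estimate how often shuffled (unrelated) data matches the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        # Small tolerance so floating-point noise does not hide exact ties.
        if abs(pearson_r(x, shuffled)) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate from ever being exactly zero.
    return (hits + 1) / (trials + 1)

# Made-up stat values. Wins rise by only 0.25 per unit of the stat,
# yet the points sit on a perfect line, so r comes out at 1.
stat = [240, 280, 320, 360, 400, 440, 480, 520]
wins = [40 + 0.25 * s for s in stat]

r = pearson_r(stat, wins)
r2 = r ** 2
p = permutation_p_value(stat, wins)
print(f"r = {r:.3f}, r2 = {r2:.3f}, permutation p ~ {p:.4f}")
```

The made-up example is deliberately extreme: a perfect line gives r = 1 even though one extra unit of the stat is worth a quarter of a win, which is why r should be read as "how tightly do the points hug a line" and not "how many wins per unit."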
With all that out of the way, let’s look at how to win a football game.
How to Win an NFL Football Game
1. Capitalize on Red Zone Opportunities (Red Zone Touchdowns, r2: 0.707 p-value: <0.0001)
This one seems a little obvious, but it is the single most important stat that separates the winners from the losers. You must take advantage of your Red Zone Opportunities. Furthermore, your opportunities must result in Touchdowns, not Field Goals.
New England, Green Bay, and New Orleans have been masters at finishing their Red Zone trips with Touchdowns over the years, and it shows up in the win column in a big way. Those three are the leaders in the clubhouse in wins since 2009.
The laggards, meanwhile, are the usual suspects. The Browns, Jaguars, and Raiders have been downright abysmal in the same category. These three teams have averaged about 240 Red Zone TDs apiece over the last ten years. The league leaders, on the other hand, averaged 406 Red Zone TDs over the same time period and have more than double the wins.
That is a stark difference between the top and bottom of the league. The winners have scored 69.1% more Red Zone Touchdowns and have 108.5% more wins.
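As a quick sanity check on those percentages, here is the arithmetic as a small Python sketch. The helper percent_more is a hypothetical name, and the inputs are the rounded averages quoted above, so the result lands near 69% and will not match the 69.1% figure to the decimal (that one presumably comes from the unrounded averages).

```python
def percent_more(high, low):
    """How much larger `high` is than `low`, expressed as a percentage."""
    return (high / low - 1) * 100

leaders_avg_rz_tds = 406    # average for the three leaders, from the text
laggards_avg_rz_tds = 240   # "about 240" for the three laggards, from the text

gap = percent_more(leaders_avg_rz_tds, laggards_avg_rz_tds)
print(f"Leaders scored roughly {gap:.1f}% more Red Zone TDs")

# "108.5% more wins" means the leaders have 1 + 1.085 times the wins,
# which is where "more than double" comes from.
wins_ratio = 1 + 108.5 / 100
print(f"Leaders have about {wins_ratio:.2f}x the wins")
```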
2. Have a proficient passing game (ANY/A, r2: 0.578 p-value: <0.0001; Pass Yds net of Sacks, r2: 0.316 p-value: 0.0008)
In our research, a proficient passing game manifests itself in two ways: (1) Adj Net Yds per Pass Attempt (ANY/A) and (2) Pass Yds net of Sacks. Both are crucial indicators for establishing the air assault. The first metric is kind of like “super yds per attempt.” It considers yards, TDs, INTs, sacks, and sack yardage. The second metric is simply total passing yards minus yards lost from sacks.
The idea here is you cannot just pass for a whole bunch of yards. All those yards must yield results (i.e. touchdowns without turnovers).
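To make the two metrics concrete, here is a short Python sketch. The formula is not spelled out in this post, so treat it as an assumption: it uses the commonly published definition of ANY/A, with a 20-yard bonus per passing TD and a 45-yard penalty per INT. The function names and the sample season line are made up for illustration.

```python
def any_per_attempt(pass_yds, pass_tds, ints, sacks, sack_yds, attempts):
    """Adjusted Net Yards per Pass Attempt, using the commonly published weights.

    Assumed formula:
    (yards + 20 * TDs - 45 * INTs - sack yards) / (attempts + sacks)
    """
    return (pass_yds + 20 * pass_tds - 45 * ints - sack_yds) / (attempts + sacks)

def net_pass_yards(pass_yds, sack_yds):
    """Passing yards net of the yardage lost on sacks."""
    return pass_yds - sack_yds

# Hypothetical season line, invented for this example.
season = {
    "pass_yds": 4500,
    "pass_tds": 35,
    "ints": 10,
    "sacks": 30,
    "sack_yds": 200,
    "attempts": 570,
}

anya = any_per_attempt(**season)
net_yds = net_pass_yards(season["pass_yds"], season["sack_yds"])
print(f"ANY/A: {anya:.2f}, net passing yards: {net_yds}")
```

Notice how the TD bonus and INT penalty reward yards that turn into points and punish yards that end in turnovers, which is the whole point of the metric.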
These two metrics also manifest themselves in the win column with a much stronger statistical relationship than any of the rushing categories.
This is a topic for another day, but the passing game is much more important in today’s NFL than a rushing game. However, it’s not clear as to why that is the case. Is it because the passing game is truly more effective? Or, is it just because teams are not running the ball? Or, does the value of a rushing game manifest itself in more subtle ways (i.e. play action is a passing tactic that is incumbent on a team’s ability to run the ball)?
It’s a fascinating topic, and it will become more interesting to examine as the (Kyle) Shanahan coaching tree takes root.
3. Turnover Differential is (somewhat) key to winning, but Turnovers are key to losing (TO Differential, r2: 0.569 p-value: <0.0001; Turnovers, r2: 0.604 p-value: <0.0001)
Turnovers had nearly as strong a relationship to losses as Red Zone Touchdowns had to wins. It’s uncanny. If you want to win, score Touchdowns in the Red Zone. If you want to lose, just turn the ball over as much as you can.
Teams that fail to take care of the ball are almost guaranteed to lose. After all, you cannot score if you don’t have possession of the ball. Now, turnovers themselves can be mitigated by generating turnovers of your own, and this is reflected in a team’s turnover differential (turnovers minus turnovers generated). However, the turnover differential had a weaker statistical relationship to wins (r2: 0.569) than turnovers themselves had with losses (r2: 0.604).
The conclusion? It’s more difficult to force the other team to turn the ball over than it is to just hang on to the ball yourself. As a result, focus on minimizing your own mistakes and you will minimize your losses. Do not rely on the other team to be more careless with the ball.
4. Convert on Third Down (3rd Down Conversion %, r2: 0.474 p-value: <0.0001)
Sustaining drives is crucial, and failed third-down attempts are the Achilles Heel of your offensive possessions. If you can keep your offense on the field, your opportunities to score increase, while simultaneously preventing the other team from scoring. This supports the old “the best defense is a good offense” cliché.
I am also not breaking any new ground with this point, but 3rd Down Conversion % still shows up ahead of other seemingly more important areas of the game like total yards, completions, and almost every single major defensive metric.
In fact, not a whole lot of defensive metrics showed up at all. While this does fly in the face of the “Defense wins championships” theory, it’s also not all that surprising when you think of how offensively focused the league has become. I mean, just look at the pass interference rules.
Now, How About a Few Results That Would Surprise You?
1. There is a surprisingly low correlation between Touchdowns Against and Losses (TDs Against, r2: 0.112 p-value: 0.0616)
As it turns out, you seem to be better off trying to win a track meet than trying to just hold down your opponent. For example, the Saints have given up a whopping 452 Touchdowns since 2009. That’s worse than the league’s basement, made up of teams like the Raiders, Lions, and Redskins.
Yet, the Saints also had the second-fewest losses in the league since 2009. The Saints have nearly as many wins as the Raiders and Redskins have combined.
Again, as I mentioned in the opening, wins and losses cannot be reduced to one statistic, but it does lend credence to the idea that offense is more important than defense.
2. Penalty Yards don’t really matter in the grand scheme of things (Penalty Yards, r2: 0.053 p-value: 0.204)
Now, I would not be surprised if this does not hold true when looking at an individual game. What I mean is, if you look at a given game, oftentimes the more penalized team and the loser will be one and the same.
However, over the last decade, we do not see a significant correlation between Penalty Yards and Losses. The entire league just kind of clusters within the same range. As it turns out, the NFL just really likes throwing flags and does not seem to pick favorites.
3. Giving up rushing yds does not seem to be a big deal, but giving up rushing first downs does (Rushing Yds Against, r2: 0.049 p-value: 0.2245; Rushing 1st Downs Against, r2: 0.350 p-value: 0.0004)
I have a theory that I developed out of this research that goes something like “first downs are more important than yards.” Being able to move the chains, particularly on the ground (i.e. Rushing 1st Downs), showed up big time in our analysis.
And that makes sense, right? If your team is effective at moving the chains, you presumably are not putting yourself in 2nd and 3rd and longs. That means you don’t need to travel as far to keep the drive alive. Also, if you think about the teams who find themselves ahead towards the end of the game, what are they trying to do? Kill the clock by running the ball and just keeping the drive going.
Are you interested in some of the other key factors to racking up the W’s, or even preventing the big L? Check out our Viz of the Week page to perform your own analysis!