StatsDetective 5 | NFL season 2024 prediction
- Jonah Vega-Reid
- Sep 1, 2024
- 3 min read
We're back, baby! After a short summertime hiatus, the homies are here in time for the football season. Much like with baseball, it will be my goal this season to predict games before they happen. Unlike baseball, I will be pitting the algorithm against the homies for 4 games every week. We'll play the hits (Cardinals, Bears, and Cowboys), along with a special game of the week just for fun. Additionally, Greg's mom Diane will be making picks as our outside expert.
But enough about the future, let's talk about where we are now. The current algorithm (ALGO 1.0) is up and ready to go. I spent considerably less time on the football version than I did on baseball because my approach was entirely overhauled. Firstly, I am not using any stats from the current season. It is my current opinion that these fluctuate too much to be useful. Instead, I will be going with past performance metrics from both players and teams. Let's begin with players, and specifically the most important player: the quarterback.
The most important factor, and the one that gives us the initial boost past our coin-flip threshold of 50%, is home-field advantage. By gathering data in a way that takes this into account, I can take advantage of the fact that the home team wins 56% of the time in the NFL. In our very first video, we found that home field is worth on average ~6 points in point differential. Thus, we are only searching for variables that give us an additional 4-5 percentage points so we can meet our 60% quota. Simple is almost always better in this case, and an overspecified model can hurt prediction.
As I have detailed in the past, quarterback metrics are fickle and don't typically perform as advertised. That being said, passer rating is probably the best one we have, and the numbers back that up. A logistic regression model using only away QB passer rating gives an overall prediction success rate of 59%. The same model with completion percentage substituted for passer rating gives an overall success rate of only 56%, the same rate as home-field advantage alone. But then I came across something dead interesting: the passer rating of the home team's QB is not only useless on its own, but predicts at a rate lower than 50%. The takeaway is that home QB passer rating by itself is pure noise, harmful to a statistical model. The best model by far is one that includes the passer rating difference between the two quarterbacks, the idea being that the bigger the gap, the more likely a win for the home team.
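For anyone curious what a one-variable logistic regression like this looks like in code, here is a minimal sketch. It is not the actual ALGO 1.0 code: the function name `fit_logistic`, the plain gradient-descent fitting, and the tiny `gaps` / `home_won` data set are all made up for illustration, and it assumes the gap is measured as home passer rating minus away passer rating (divided by 10 to keep the numbers small).

```python
import math

def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit P(home win) = sigmoid(b0 + b1 * x) by batch gradient descent on log loss."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(b0 + b1 * x) - y  # prediction error for this game
            g0 += err
            g1 += err * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1

# Made-up toy data, NOT real NFL results.
# gaps: (home passer rating - away passer rating) / 10  [assumed definition]
# home_won: 1 if the home team won, 0 otherwise
gaps = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
home_won = [0, 0, 1, 0, 1, 0, 1, 1, 1, 1]

b0, b1 = fit_logistic(gaps, home_won)

# Pick the home team whenever the predicted probability is at least 50%.
picks = [1 if sigmoid(b0 + b1 * g) >= 0.5 else 0 for g in gaps]
accuracy = sum(p == y for p, y in zip(picks, home_won)) / len(gaps)
print(f"intercept={b0:.3f}, slope={b1:.3f}, success rate={accuracy:.0%}")
```

The intercept `b0` plays the role of home-field advantage (the prediction when the two quarterbacks are rated equally), and the slope `b1` says how much each extra point of gap moves the odds. A real test would score the model on games it was not fitted on.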
For team stats, I went even more simplistic. The win percentage from the previous season is the only thing I looked at. My reasons are simple: less effort, a decent percentage boost, and it makes the most sense. While there are a few new coaching staffs this season, rosters don't turn over all that quickly, quarterback situations are fairly stable, and every other team factor (facilities, fans, ownership, etc.) typically stays the same. I am betting that huge jumps in skill or success are few and far between and stability rules all. However, similar to passer rating, only the previous win percentage of the away team is useful (59% in a model by itself). In this case, a model using the difference between the two, or only the home team's win percentage, performs dramatically worse than that.
Thus we arrive at our model.
Win = intercept (home field) + passer rating difference + away win percentage
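To make the formula concrete, here is a rough sketch of how a fitted model of this shape turns two inputs into a home-win probability. The three coefficient values below are placeholders invented for illustration (the fitted values are not given here), and `home_win_probability` is a hypothetical helper name. It also assumes the passer rating difference is home minus away and that win percentage is a fraction between 0 and 1.

```python
import math

# Placeholder coefficients for illustration only, NOT the fitted ALGO 1.0 values.
INTERCEPT = 0.75        # stands in for home-field advantage
B_RATING_DIFF = 0.03    # weight on (home QB rating - away QB rating), assumed direction
B_AWAY_WIN_PCT = -1.0   # weight on the away team's previous-season win percentage

def home_win_probability(rating_diff, away_win_pct):
    """Logistic model: P(home win) = 1 / (1 + exp(-z)), with z the linear score."""
    z = INTERCEPT + B_RATING_DIFF * rating_diff + B_AWAY_WIN_PCT * away_win_pct
    return 1.0 / (1.0 + math.exp(-z))

def pick_home(rating_diff, away_win_pct, threshold=0.5):
    """Return True when the model picks the home team."""
    return home_win_probability(rating_diff, away_win_pct) >= threshold

# Example: home QB rated 8 points higher, away team won 40% of its games last season.
p = home_win_probability(rating_diff=8.0, away_win_pct=0.40)
print(f"P(home win) = {p:.3f}, pick home: {pick_home(8.0, 0.40)}")
```

With these placeholder signs, a bigger rating gap in the home QB's favor pushes the probability up, and a stronger away team (higher previous win percentage) pushes it down.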
The testing yielded an overall success rate of 61% and a wins-only success rate of 63%. Will this be enough to turn a profit? Enough to beat Greg and Nikki and Diane? Stay tuned!
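For reference, here is one way the two success rates could be computed from a list of picks and results. This is a sketch under assumptions: `success_rates` is a hypothetical helper, and it reads "wins-only" as "of the games the home team actually won, how many did the model pick correctly", which is only one possible interpretation.

```python
def success_rates(picks, results):
    """Compute overall and wins-only success rates.

    picks, results: equal-length lists of 1 (home win) or 0 (home loss).
    Wins-only is taken here to mean accuracy on games the home team
    actually won (an assumed definition).
    """
    if len(picks) != len(results) or not picks:
        raise ValueError("picks and results must be non-empty and the same length")

    # Overall: share of all games where the pick matched the result.
    overall = sum(p == r for p, r in zip(picks, results)) / len(picks)

    # Wins-only: restrict to games the home team won, then score the picks.
    picks_in_wins = [p for p, r in zip(picks, results) if r == 1]
    wins_only = (sum(picks_in_wins) / len(picks_in_wins)) if picks_in_wins else None

    return overall, wins_only

# Tiny made-up example: 4 games, the model misses one.
overall, wins_only = success_rates([1, 1, 0, 1], [1, 0, 0, 1])
print(f"overall={overall:.0%}, wins-only={wins_only:.0%}")
```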