the spread the (data) science of sports

The sorry state of football analytics

Wed 23 September 2015

"I got a lot of respect for analytics and numbers, but I'm not going to make judgments based on those numbers. The game is the game. It's an emotional one played by emotional and driven men. That's an element of the game you can't measure. Often times decisions such as that weight heavily into the equation."

That's Pittsburgh head coach Mike Tomlin, quoted on Steelers Depot, brought to my attention by Mike Lopez. This comes after it was announced that the same Pittsburgh team hired Karim Kassam, a former Carnegie Mellon professor, to head up their analytics effort full time. The cognitive dissonance, it is strong. So is the irony, as Tomlin appears to be describing some sort of optimization procedure by which one assigns weights to factors that contribute to some outcome. If only such a procedure existed outside of magical thinking.

NOTE: I have met Karim Kassam; he's very intelligent and a great analyst. None of this applies to him or his work and, for all I know, he is a respected and valued voice in the Pittsburgh front office, but the juxtaposition of these two stories was too much to resist as a motivation for this post.

I wish that this were an isolated example, but this is a frankly unsurprising occurrence. The sad truth of the matter is that the state of football analytics in 2015 is not good and isn't showing signs of improving. This is especially true in the NFL, though I think a lot of this applies to college football as well.

The body of football research is not advancing at the same rate, and is not of the same quality, as in basketball, baseball, or hockey. At least publicly, teams are not generally investing in analytics talent in the same way that teams in other sports are. Even when they are, as evidenced by the Steelers above, there is little evidence that teams are incorporating many of the most basic quantitative lessons from the analytics community either on or off the field.

Conference presentations: where are they?

The New England Symposium on Statistics in Sports (NESSiS) is this weekend and looks to feature, as always, work that is not only interesting but methodologically sound. As I was reading the program, though, I was struck that there were only two football-related papers. Further, one was on the topic of Deflategate and the other was actually about ticket prices. I did some digging and, in the history of NESSiS, which has happened every other year since 2007, there have only ever been two other NFL papers. One of them is from 2009 (6 years ago!), by Ben Alamar, the current head of ESPN's Analytics department, on how NFL coaches do not act rationally -- something that seems to be about as constant as death and taxes.

Sloan doesn't present a rosier picture. Football-related research papers in the past 5 years or so have addressed scheduling of games, the inefficiency of the draft, and predicting field goal success. These sit among many more high-profile panels with heart-warming titles such as "Gut vs. Data -- How Do Coaches Make Decisions?" and "In-Game Innovations: Genius or Gimmick?"

It's not as if other conferences are showcasing football analytics work the way the SABR conferences do for baseball. The work just isn't being done.

Empirical disappointments: 4th downs and the draft

If there's innovation occurring in football, it's not being presented at conferences. But maybe it's just the case that the work is being done behind closed doors? If so, we certainly aren't seeing evidence of it.

There aren't that many quantitative "truths" in football, but we know of at least two: teams should go for it on fourth down and they overvalue picks early in the draft. We've known the former since at least 2002 when David Romer wrote his famous paper and we've known the latter since at least 2005 when Massey & Thaler wrote their seminal work on the draft. And yet, based on my calculations, the rate of going for it on 4th down has remained mostly stable (it was 12.3% last year, compared to 12.3% in 2001). There was a brief period in 2007-2009 when the rate crept up to 14%, but it has since come back down to 2001 levels.
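
For readers who want to run this kind of check themselves, here is a minimal Python sketch of how a go-for-it rate could be computed from play-by-play data. It is not the calculation behind the numbers quoted above. The field names (season, down, play_type), the play-type labels, and the toy data are all assumptions made for illustration, and a real play-by-play feed will use its own coding.

```python
from collections import defaultdict

# Hypothetical play-type labels; real play-by-play data will differ.
GO_FOR_IT = {"run", "pass"}
KICK = {"punt", "field_goal"}


def fourth_down_go_rate(plays):
    """Return {season: share of 4th-down decisions that were run or pass plays}."""
    attempts = defaultdict(int)
    decisions = defaultdict(int)
    for play in plays:
        if play["down"] != 4:
            continue
        kind = play["play_type"]
        if kind not in GO_FOR_IT and kind not in KICK:
            continue  # ignore penalties, kneel-downs, and other non-decisions
        decisions[play["season"]] += 1
        if kind in GO_FOR_IT:
            attempts[play["season"]] += 1
    return {season: attempts[season] / decisions[season] for season in decisions}


# Toy data for illustration only (not real NFL plays).
sample_plays = [
    {"season": 2001, "down": 4, "play_type": "punt"},
    {"season": 2001, "down": 4, "play_type": "punt"},
    {"season": 2001, "down": 4, "play_type": "field_goal"},
    {"season": 2001, "down": 4, "play_type": "run"},
    {"season": 2014, "down": 3, "play_type": "run"},      # not a 4th down
    {"season": 2014, "down": 4, "play_type": "penalty"},  # not a decision
    {"season": 2014, "down": 4, "play_type": "punt"},
    {"season": 2014, "down": 4, "play_type": "pass"},
]

rates = fourth_down_go_rate(sample_plays)
for season in sorted(rates):
    print(f"{season}: {rates[season]:.1%}")
```

The denominator here is every 4th-down decision (punts, field goal attempts, and offensive plays). Other reasonable choices, such as excluding desperation situations late in games or fake punts, would change the rate, so any comparison across seasons needs the same definition throughout.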

We also know that teams routinely trade up in the draft. In this year's NFL Draft, San Diego traded a 4th round pick and a future 5th round pick to move up two spots in the 1st round.

The fact that we're still arguing over things that have been known for more than a decade is absurd.

New kinds of data: good for what?

We're hearing a lot about the use of Zebra data to track players' locations on the field to the millisecond level. I was initially quite excited for this, as I've seen how much interesting work has come out of the SportVU technology used in the NBA. Hard and previously unanswerable questions like quantifying defense are being tackled using this data and sophisticated methods. However, I am highly skeptical that many NFL teams a) have analysts capable of dealing with and extracting useful information from multi-terabyte files, b) are willing to invest in hiring people that can do so, and c) will actually use that information once they do. We are still arguing over very, very basic questions about expected value on play types with very acceptable sample sizes.

If you can't convince a coach or front office that they are leaving wins on the table by not going for it on 4th down, or that trading up for a player who has a significant probability of being a draft bust is a waste of resources, how are you supposed to convince that coach that your model found something scouts didn't see?

Communication red herrings

The answer to the above question, of course, is always that the burden is on the analyst to make their work approachable and digestible by a coach or GM. Of course that's true, but it's also a red herring. Dozens of people have made the case for why going for it on 4th down is a good idea in many situations (or at least more situations than currently observed) -- it's not a crazy idea. Yet, there's very little buy-in. Coaches have a million special reasons why it wasn't the right time to go for it or why the models can't account for whatever micro-climate existed on the field that day.

It's a convenient way of using a tired stereotype (the ivory tower academic who doesn't know football and can't talk to football people) to justify continuing to move the goalposts. Notice, too, that the onus is never on the front office to try to understand analytics; it's a one-way street.

Brain drain on the horizon

We continue to see people make arguments that football is more complicated than baseball, basketball, or hockey. That may be true. We also hear that the sample sizes are smaller. Also true! But nearly every other industry in the world is pursuing the use of data science and quantitative methods to gain a competitive advantage. Do you think organizing the world's information is difficult? It is, but Google seems to be doing OK. Uber seems like it's solving complex optimization problems successfully. Video games seem to manage to produce hyper-realistic versions of the same game we say is too complex to model with statistics.

Football is not a special snowflake that has somehow miraculously produced the most unique phenomenon on the planet that can't be studied quantitatively. Ironically, we see extreme faddishness in the league around other topics, just not analytics (Wildcat offense, anyone?)

I fear this is going to lead to (or continue) a "great stagnation" in football. Teams won't compete with other industries on pay or benefits -- that much is clear -- but they claim people will line up for the privilege of working in sports. Yet, when coaches and GMs routinely and publicly throw analytical work and analysts under the bus, why would a rational person stick around the sports business for long?

I know one of the most important things for me in any job is the feeling that I'm being heard, my work is important, and I'm having an impact. We're hearing the exact opposite from coaches and GMs on a regular basis, and that simply isn't tenable if you want to attract and retain talented people to work on hard problems. You can't remove both extrinsic and intrinsic rewards and expect success.

Conclusion

I am not optimistic for the future of football analytics, which is truly sad. Innovation keeps games interesting and innovation from multiple sources encourages the evolution of the game. As it stands right now, football appears to have walled itself off from meaningful quantitative innovation without any signs of lowering the gate.
