the spread: the (data) science of sports


Sat 30 November 2013

What's all this?

I'm Trey Causey. I'm a data scientist. I'm a football fan. This site is an attempt to bring the two together, in the hopes of achieving two goals. First, to kickstart the use of methods from data science in the football analytics world. Second, to teach some introductory data science using interesting, substantive, real-world examples.

The name of the site is a riff on both the spread offense and the spread used in betting. This is not a betting site; it does not offer any advice on betting or endorse sports betting in any way. That being said, the betting world is often a few steps ahead of the game when it comes to analytics and forecasting.

It's a great time to be involved in sports analytics. Baseball has already seen its sabermetric revolution. Basketball is quickly following, especially with the introduction of SportVU and related technologies.

Football has been slower to warm to advanced statistics. Of course, absolutely fantastic work is being done by Brian Burke at Advanced NFL Stats, the Football Outsiders crew, and Chase Stuart at Football Perspective (to name only a few). Yet football analytics remains largely dominated by simple cross-tabs, linear regression, and ad hoc analyses that select on the dependent variable, fail to check model assumptions, eschew out-of-sample testing, and generally don't capitalize on tremendous advances in probabilistic modeling. And if you don't know what any of this means, hopefully I can teach you.

When advanced analysis *is* conducted, it's often behind closed doors. Understandably, teams want to preserve any edge they find. However, this is bad not only for the analytics community but also for the advancement of football analytics itself. As has been pointed out on the Advanced NFL Stats podcast (I can't remember by whom, sorry!), without peer review, isolated analysts often have no objective check on the quality of their work.

Data science and football. Together at last.
