Wed 23 September 2015
I normally don't write about college football on this blog, but I do write about ranking algorithms. In that tradition, I'm extremely excited to have Matt Mills contribute a new post on using continuous-time Markov chains to rank college football teams. Matt blogs for SB Nation about college football, statistics, and Georgia Tech.
College Football has tried for years to find the best way to rank teams at the end of the year. From the AP Poll, to the BCS Rankings, and now with the College Football Playoff Committee everyone has tried their hat at ranking college football teams. Currently there are 60 college football rating systems listed in the Massey Ratings Composite and over 120 ratings tracked in the Sports Vaporia Comparison. However, that isn’t going to stop me from adding my own to the list. In this post I’ll describe how a continuous-time Markov chain (CTMC) is structured, how we can apply the mathematics of CTMCs to develop a new rating system, and how the CTMC ratings compare to the better-known Massey Ratings. I’ve provided all the code used to perform this analysis in R at my GitHub page.
Continuous-time Markov chains (CTMCs) are mathematical models that are used across many areas of business and academia. If you learned about them in school, you probably used them to help with queuing theory. If you didn’t learn about them in school, don’t fret. The basic structure is easy to visualize. CTMCs consist of a set of states that a system can be in at any given time. For example, the set of states, or state space, of the system can consist of the weather (e.g. raining or not raining), or the number of people in a line (0, 1, 2, etc.). There are also transition rates that tell you how quickly the system can move from any one state to another. These transition rates can be probabilities, in which case the Markov chain would be modeled as a discrete-time Markov chain. In addition to the basic structure of CTMCs, there are more technical requirements for a system to be a true CTMC. The system has to be irreducible, which basically means that you have to be able to, eventually, get from one state to any other in the system in some finite amount of time. What makes the CTMC a Markov Chain is that any future behavior of the model only depends on the current state of the model and is independent of how long the system has been running. This last property is called the Markov Property. There are many great descriptions of Markov Chains online so if you wish to learn any more about these systems I would advise a quick Google search. The point of all these conditions and definitions is that if we can model our system as a CTMC, we can use linear algebra to solve for a steady-state distribution. The steady-state distribution of a CTMC tells you the long-run average fraction of time the system will spend in each state. The goal of this analysis is to model a CTMC with each team in college football representing a state and then determine the quality of each team by computing the steady-state distributions for each state.
Now we need to apply the properties of continuous-time Markov chains to college football. The first step is to define our state space, which is the list of all possible states for the system. In our case this is just the list of all teams in college football that played against at least 2 FBS opponents. I chose this condition so that teams that only played one game were left out of the analysis to avoid weird results. Next we need to define the transition rates between the states. Normally the rates would be exponentially distributed random variables, and in queuing theory they would represent arrival rates or service times. In this example, we will be using the number of points scored by one team against another as the rate of going from one state to the next. I don’t think this should cause any issues, but I admit I may be wrong, so feel free to discuss this with me in the comments. If we model our CTMC like this we will get a transition diagram that looks like this:
In this example each node represents a state and each arrow represents the transition rate of going from one state to the next. For example, Georgia Tech scored 28 points against Miami, while Miami scored only 17 against Georgia Tech, so the rate at which a person in state “Georgia Tech” would go to state “Miami” is higher than going from Miami to Georgia Tech. The steady-state distribution of a CTMC can be thought of as the amount of time someone would spend in a state if they were walking along this graph structure. If they were in the state “Georgia Tech” they would be more likely to move to North Carolina than either Duke or Miami, because Georgia Tech scored more points against UNC than against Duke or Miami. This means the rate at which someone leaves Georgia Tech to go to UNC will be higher than that of going to Duke or Miami. The person won’t always go to UNC but will do so more often than going to any other state. Therefore, the system will quickly leave the state of any team that scores a lot of points, because the transition rates will be higher than the average. A team that doesn’t allow many points will have few transitions into their state because the rate at which teams scored against them was low. Because of this behavior the system will spend less time in good teams’ states and more time in bad teams’. The steady-state distributions are now a way to rate college football teams.
Now we have to go about actually generating and solving the CTMC. To do this, I used R’s XML package to scrape the scores from all games in the last four seasons, including this year’s, from www.sports-reference.com/cfb. In order to make the scores location-neutral, I adjusted them by subtracting 1.5 points from the home team and adding 1.5 points to the away team. I also removed any games where both teams didn’t play at least two games against FBS foes in the same season. This was as close as I could get to including only FBS-FBS games because Sports-Reference.com didn’t list the division or conference of each team in their schedule page. Once I had this full list of games, I used the tidy and dplyr packages in R to generate the transition rate diagram for all games in a season. To actually solve for the steady-state distributions of our system, we need to transform the transition rate diagram into the proper format. Here are the actual equations we need to solve to find the steady-state distributions:
What those equations actually mean is that the rate at which you leave a state, weighted by how often you are in that state, has to equal the rate at which you come into that state, weighted by how often you are in the state from which you are coming. Basically, rate in equals rate out. We must force all the probabilities to sum to one and replace one equation from any state with this equation:
We can now solve the equations by putting everything on one side of the equals sign and using linear algebra to solve the system of equations. I used R’s solve() function to generate the long-run average fraction of time the system spends in each state. As I said earlier, we can simply equate this result to the strength of a team to get a rating for every team in college football. However, this result isn’t very intuitive; the better teams have smaller numbers. I changed the format of the transition rate diagram so that the system will spend more time in the better teams’ states, instead of less. The correlation between the natural log of the different perspectives of ratings is -.99. The ratings seem to fit a power law distribution, so the log scale makes them linear. With this change, here are the top ten for this season:
A continuous-time Markov chain is a good rating system for college football because of its ability to take a team’s schedule into account. The system naturally spends more time in stronger team’s states. Teams that play these strong teams build connections between the two states by scoring points. So by building a bridge between a your team’s state and a strong team’s state the system will spend more time in your state, even if you got beat by the strong team. This builds up the ratings of teams with tougher schedules. Even if Georgia Tech, for example, gets blown out, there is some probability that the system will transition from another really good team’s state to Georgia Tech’s, thereby boosting Georgia Tech’s rating because it played a tougher schedule. If a team’s opponents aren’t very good, the system won’t spend much time in its state because it won’t spend much time in the states of its opponents. I think this is a good explanation for why Arkansas is so high in these ratings: they have played an incredibly tough schedule. This rating system is not perfect, but I’m okay with this top ten. Six of the teams in the CTMC top ten are in the top ten of the Massey Composite Ratings.
Now that we have a way to rank teams, we also need to check how accurate the ratings are, beyond an eye test. I’ll do this by testing the predictive capability of the CTMC ratings and measure it against the Massey Ratings performance. The Massey Ratings also use linear algebra and the number of points scored by each team in each game to develop a rating for teams that represents the number of points per game above or below an average team. Let’s compare how the CTMC ranking of each team this season compares to the Massey Rating of each team. I’ve highlighted the playoff teams (Alabama, Oregon, Ohio State, and FSU) by their respective team colors.
The two ratings are very similar, but the CTMC rating seems to have a non-linear scale. The correlation between the Massey Ratings and the natural logarithm of the CTMC ratings is .97, so they are very similar.
The real test between the ratings will be on how they accurately predict games that haven’t happened yet. To test this, I calculated the ratings of each week after week four of the last four seasons of college football. The ratings after each week were used to predict a winner of the games in the upcoming week. The predicted winner was chosen by which team’s rating was higher, not accounting for home field. I’m not trying to make the best prediction but want to see how they compare on a level playing field. The results of the weekly prediction can be seen in the graph below. The dashed horizontal lines are the yearly accuracy rate for each method.
In two of the years, the Massey method was the more accurate predictive rating system, and in the other two years there was virtually no difference. The methods fluctuated in their weekly accuracy, but neither consistently outshone the other. The prediction accuracy percentage hovers in the high 60s and low 70s, which is in line with other rating methods on the Sports Vaporia Predicted Wins Summary.
If the CTMC rating has a .97 correlation with the Massey rating then why use it? Well, besides the fact that I think using continuous-time Markov chains to rank college football teams is pretty cool, the CTMC rating has flexibility for future metrics beyond points scored or points allowed. If you think yards per play is a better metric than points scored, you can easily switch the transition rates from points scored to yards per play gained. If you just wanted to measure a team’s passing strength you could use yards per pass only. There are some limitations to the CTMC rating system. I haven’t thought of a way to separate the ratings into offense and defense components like the Massey ratings. There also isn’t a clear link between the ratings and margin of victory like with the Massey rating. And lastly if a team gets shut out then the transition rate for that game would be zero, which may not be the best way to take that into account. And in the spirit of keeping this post from becoming a novel I’ll save that for another time.