Big Chances Part 2 – Explaining and predicting

The other day I wrote a short post showing some numbers on how many big chances (by Opta's definition) teams created and conceded across Europe in the 2014/2015 season. Feel free to read that post by clicking this link, though I should admit that there isn't really anything in there that you need to know to be able to grasp what's coming up in this post. I just want you to read my stuff!


Explanatory power:

The post linked above focused solely on how teams did in the Big Chance measures, without really getting into whether those measures are any good. Any measure that is supposed to say something about a team's inherent strength needs to correlate strongly with goal difference or points won, and it must also be repeatable over time.

Using shots data from the Bundesliga, Premier League, La Liga, Serie A and Ligue 1 from the last two seasons (unfortunately that is all I have), I set out to find out how good a job the various "Big Chance" measures do when it comes to explaining what has already happened and predicting what will happen.

I decided to look at both the "Big Chance Ratio" (BCR) mentioned in the previous post and the "Big Chance Differential" (BCD). The former is calculated by dividing Big Chances created by the sum of Big Chances created and Big Chances conceded, which gives a measure with an average of 0.5. The latter is calculated by subtracting Big Chances conceded from Big Chances created.
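As a rough sketch, the two definitions above can be written out in a few lines of Python. The function names and the example numbers are made up for illustration and are not taken from the dataset, and returning 0.5 when a team has no big chances either way is an assumption made here to avoid dividing by zero.

```python
def big_chance_ratio(created: int, conceded: int) -> float:
    """Share of all big chances in a team's matches that the team created."""
    total = created + conceded
    if total == 0:
        # Assumption: with no big chances at all, treat the team as neutral.
        return 0.5
    return created / total


def big_chance_differential(created: int, conceded: int) -> int:
    """Big chances created minus big chances conceded."""
    return created - conceded


# Hypothetical team: 60 big chances created, 40 conceded.
print(big_chance_ratio(60, 40))         # 0.6
print(big_chance_differential(60, 40))  # 20
```

The ratio is bounded between 0 and 1, while the differential scales with the number of games played, which is worth keeping in mind when comparing leagues with different season lengths.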

Since Goal Ratio turned out to have a slightly stronger correlation than goal difference to points won per game in my dataset, I decided to compare every measure considered both with actual Points per Game and with the team's Goal Ratio. Starting off with the "Big Chance Ratio", there does indeed appear to be a strong correlation to both Goal Ratio and Points per Game.


What about the differential then? Stronger or weaker correlation? Ask no more!



It appears that the correlation is slightly weaker, both to Goal Ratio and Points per Game, for the Big Chance Differential than for the Big Chance Ratio. As an aside, the differential actually had a slightly stronger correlation to goal difference than the ratio did.

So we've found some strong correlations, but how do they compare to what's already out there? Here are the scatter plots for the simple but surprisingly powerful Shots on Target Ratio (SoTR).


The Shots on Target Ratio does a better job of explaining what has happened than the Big Chance Ratio. I have to say it does remarkably well, explaining over 81% of the variation in Goal Ratio and just under 78% of the variation in Points per Game. For such a simple measure, those numbers are huge. In fact, the SoTR does just as good a job as my ExpG model in explaining what has happened when looking at all five leagues together.
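For readers wondering where a figure like "81% of the variation" comes from: with a single explanatory variable and a straight-line fit, the share of variation explained (R squared) is simply the Pearson correlation squared, so 81% corresponds to a correlation of about 0.9. The sketch below shows that calculation in plain Python. The function names and the sample numbers are made up for illustration, and this is not necessarily the tool the plots were produced with.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


def r_squared(xs, ys):
    """Share of the variation in ys explained by a straight-line fit on xs."""
    return pearson_r(xs, ys) ** 2


# Hypothetical ratios for five teams (illustrative only).
sotr = [0.38, 0.45, 0.50, 0.56, 0.64]
goal_ratio = [0.35, 0.47, 0.49, 0.60, 0.66]
print(f"r = {pearson_r(sotr, goal_ratio):.3f}")
print(f"R^2 = {r_squared(sotr, goal_ratio):.3f}")
```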

[Scatter plots: ExpGD vs Goal Ratio, ExpGD vs Points per Game]

Now in the model's defense, it does have the strongest correlation to goal difference, and it does better than the SoTR in select countries, as it was built on English shots only. Compared to the measures that use Big Chances, the model still wins out. I will say though that the universality, simplicity and strength of the very basic SoTR is actually a bit surprising to me. The SoTR has a lot of things going for it.



So the "Big Chance" measures do a good job of explaining what has happened, but a slightly worse one than both the Expected Goal Ratio (ExpGR) and the Shots on Target Ratio. Now let's have a look at the repeatability of the metrics. How does performance in BCR and BCD transfer from season to season, and how does it compare with the other metrics already out there? To try and figure this out I plotted each team's performance in each of the Big Chance metrics in the first season (2013/2014) on the X axis, and its performance in 2014/2015 on the Y axis. Since some teams were relegated or promoted after 2013/2014 and so only appear in one of the two seasons, the sample size gets even smaller here, which is worth keeping in mind.
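The pairing step described above, where only teams present in both seasons end up on the plot, could be sketched like this. The team names and values are invented placeholders, and `pair_seasons` is a hypothetical helper name rather than anything from the original analysis.

```python
def pair_seasons(first: dict, second: dict) -> list:
    """Return (team, first-season value, second-season value) for every team
    that appears in both seasons; relegated and promoted teams drop out."""
    common = sorted(first.keys() & second.keys())
    return [(team, first[team], second[team]) for team in common]


# Hypothetical Big Chance Ratios (illustrative only).
bcr_2013_14 = {"Team A": 0.61, "Team B": 0.48, "Team C": 0.39}  # Team C goes down
bcr_2014_15 = {"Team A": 0.58, "Team B": 0.52, "Team D": 0.44}  # Team D comes up

points = pair_seasons(bcr_2013_14, bcr_2014_15)
for team, x, y in points:
    print(f"{team}: x={x:.2f}, y={y:.2f}")
print(f"{len(points)} of {len(bcr_2013_14)} teams usable")  # 2 of 3 teams usable
```

The resulting x and y values are what go on the two axes, and the same pairing works for any of the metrics discussed here.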


There are still some nice correlations here, meaning that the Big Chance measures seem to be repeatable to a decent extent. The differential beats the ratio this time around, explaining about 6% more of the future variation in the same metric, although I wouldn't focus too much on differences that small given the sample size.

However, as one would intuitively think, both the SoTR and the ExpGR seem even more repeatable from season to season. Given that there are almost three times as many shots on target as there are big chances, and that the ExpG ratio uses every single shot, it's not very surprising that these measures are repeatable to a greater degree, since random variation shrinks as the sample size grows. Funnily enough, the SoTR and the ExpGR predict themselves at an almost identical rate in this sample.
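To put a rough number on the sample size argument: if every event is treated as an independent coin flip between "for" and "against" (a big simplification, and an assumption made here rather than anything tested in this post), the standard error of a ratio built from n events is sqrt(p(1-p)/n). Three times as many events then cuts the noise by a factor of sqrt(3), or about 1.73. The event counts below are invented for illustration.

```python
from math import sqrt


def ratio_standard_error(p: float, n: int) -> float:
    """Standard error of a proportion p estimated from n independent events."""
    return sqrt(p * (1 - p) / n)


# Hypothetical season totals (for plus against), illustrative only.
n_big_chances = 100
n_shots_on_target = 3 * n_big_chances

se_bcr = ratio_standard_error(0.5, n_big_chances)
se_sotr = ratio_standard_error(0.5, n_shots_on_target)

print(f"BCR noise:  {se_bcr:.3f}")             # 0.050
print(f"SoTR noise: {se_sotr:.3f}")            # 0.029
print(f"ratio:      {se_bcr / se_sotr:.2f}")   # 1.73
```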



Predictive power:

Last but not least, let's examine the predictive power of our BC measures. How well does performance in BCR in season one (2013/2014) predict the actual outcome in Goal Ratio in season two (2014/2015)? The story is pretty similar to what we've observed already, with the BCR doing a decent enough job, but the SoTR and the ExpGR doing an even better one, at very similar levels.


To summarize, we've learned that the Big Chance measures correlate quite strongly with goal difference and points won, and that they are reasonably repeatable from season to season. However, as lone metrics for describing team strength and predicting future performance, they don't beat what's already out there, and shouldn't be used as such.

With that said, I won't be giving up on the BC measures just yet. Given that the usual shot ratios, TSR and SoTR, only convey information about shot volume and none about shot quality, whereas the BCR only handles high-quality chances, I feel like there should be some use for the BC measures in some sort of composite team rating, using either the SoTR or the ExpGR (or both) to supplement the BCR or BCD. That's definitely something I'll be looking at in more depth in the future, but that's one for another day. A post can only contain so many graphs, right?
