Data Modelling


Expected Goals Model

Much of my modelling surrounds the topic of ‘expected goals’ and ‘expected assists’. Essentially, the models use historical data to look beyond raw goal and assist counts and find the true, long-term value of player actions.

Expected Goal Model explanation

Model uses and application

Player forecasting with the Model


Chance Creation Model

My chance creation model builds on the original SPAM (below) by taking into account not only where the shot was taken from, but also how the ball was delivered to the attacker. Was he teed up from inside the box itself, from a cross out wide or from a long ball down the middle? In all, 30 different types of chance are accounted for, including whether the final ball came from open play or a set piece.
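To make the idea concrete, here is a minimal sketch of how a chance type could be valued, assuming each chance is keyed by shot zone plus delivery type and valued at its historical conversion rate. The records, the delivery labels and the function names are made up for illustration; they are not the model's real data or its 30 categories.

```python
from collections import defaultdict

# Made-up illustrative history: (zone, delivery, scored). Not real EPL data.
HISTORY = [
    ("Inside Box Central", "cross", True),
    ("Inside Box Central", "cross", False),
    ("Inside Box Central", "cross", False),
    ("Inside Box Central", "cross", False),
    ("Inside Box Central", "through ball", True),
    ("Inside Box Central", "through ball", False),
]


def conversion_rates(history):
    """Return {(zone, delivery): goals / shots} for each chance type seen."""
    counts = defaultdict(lambda: [0, 0])  # [goals, shots]
    for zone, delivery, scored in history:
        counts[(zone, delivery)][0] += int(scored)
        counts[(zone, delivery)][1] += 1
    return {key: goals / shots for key, (goals, shots) in counts.items()}


def expected_goals(chances, rates):
    """Sum the historical conversion rate of every chance a player or team had."""
    return sum(rates[chance] for chance in chances)


RATES = conversion_rates(HISTORY)
MATCH_CHANCES = [
    ("Inside Box Central", "cross"),
    ("Inside Box Central", "cross"),
    ("Inside Box Central", "through ball"),
]
MATCH_XG = expected_goals(MATCH_CHANCES, RATES)
```

With these toy numbers, two crossed chances and one through ball add up to one expected goal, regardless of how many were actually scored.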

The beauty of the Chance Creation Model is that it allows me to measure not just goalscorers on their shooting ability and decision making, but also the effectiveness of creative players within a team set-up. My piece on Juan Mata was featured on the OptaPro website.
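As a rough sketch of how the same chance values could credit both the shooter and the creator, assume every chance carries a value and that the player who supplied the final ball receives that value as an expected assist. The chance values, player labels and function name below are hypothetical, chosen only to show the bookkeeping.

```python
from collections import defaultdict

# Hypothetical value per chance type (probability of a goal); not the model's real figures.
CHANCE_VALUE = {
    ("Inside Box Central", "cross"): 0.25,
    ("Outside Box Open Play", "short pass"): 0.125,
}

# Made-up chances: (shooter, creator or None, zone, delivery).
CHANCES = [
    ("Striker A", "Winger B", "Inside Box Central", "cross"),
    ("Striker A", "Winger B", "Outside Box Open Play", "short pass"),
    ("Midfielder C", None, "Outside Box Open Play", "short pass"),
]


def credit_players(chances, chance_value):
    """Return (expected goals per shooter, expected assists per creator)."""
    xg = defaultdict(float)
    xa = defaultdict(float)
    for shooter, creator, zone, delivery in chances:
        value = chance_value[(zone, delivery)]
        xg[shooter] += value
        if creator is not None:  # unassisted shots credit nobody with an assist
            xa[creator] += value
    return dict(xg), dict(xa)


XG, XA = credit_players(CHANCES, CHANCE_VALUE)
```

Here the hypothetical Winger B ends up with 0.375 expected assists without taking a shot, which is the kind of creative contribution a goals-only count would miss.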



After analysing over 30,000 English Premier League (EPL) shots from the last three seasons, I was able to put a number on the average number of shots it takes to score a goal from various positions on the pitch.

I called it the Shot Position Average Model (SPAM) and it was my first foray into football data. The OptaPro website has included SPAM in its showcase of the best in sports analytics writing from around the web.

I believe in simplicity. The graphic below shows the results of my study. This is how many shots it takes to score from these areas of the pitch in the EPL:


My study shows the numbers have been almost exactly the same in each of the last three seasons in each of these five areas:

  • Penalties
  • Direct Free Kicks
  • Inside Box Central
  • Inside Box Wide
  • Outside Box Open Play
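The calculation behind the five areas above can be sketched as a simple count: shots divided by goals, per area and per season, so that seasons can be compared side by side. The shot records and season labels below are invented for illustration and are not the figures from the study.

```python
from collections import defaultdict

# Invented toy records: (season label, area, scored). Not the study's data.
SHOTS = [
    ("S1", "Inside Box Central", True),
    ("S1", "Inside Box Central", False),
    ("S1", "Inside Box Central", True),
    ("S1", "Inside Box Central", False),
    ("S1", "Penalties", True),
    ("S1", "Outside Box Open Play", False),
    ("S1", "Outside Box Open Play", False),
    ("S1", "Outside Box Open Play", False),
    ("S2", "Inside Box Central", True),
    ("S2", "Inside Box Central", False),
]


def shots_per_goal(shots):
    """Return {(season, area): shots per goal}; None where no goal was scored."""
    totals = defaultdict(lambda: [0, 0])  # [shots, goals]
    for season, area, scored in shots:
        totals[(season, area)][0] += 1
        totals[(season, area)][1] += int(scored)
    return {
        key: (n_shots / n_goals if n_goals else None)
        for key, (n_shots, n_goals) in totals.items()
    }


RESULT = shots_per_goal(SHOTS)
```

Comparing `RESULT[("S1", area)]` with `RESULT[("S2", area)]` for each area is then a direct check of season-to-season consistency.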

The sheer consistency of the model gives me the confidence to state that whatever would be gained from having the much-coveted specific x,y data would be outweighed by what is lost when trying to apply it practically within the game itself, or to inform the fans watching.

Within a year of SPAM's publication, the football analytics community had expanded rapidly, to the point where similar principles are now used in articles in The Guardian, The New Yorker and The Washington Post.

31 Responses to Data Modelling

  1. Pingback: Bale v Walcott – The Real Story | differentgame

  2. Pingback: Manchester United – Conversion Kings | differentgame

  3. Pingback: The SPAM Player of the Season | differentgame

  4. Pingback: Introduction to Soccer Analytics – The Guys I Follow | Mixed kNuts

  5. Pingback: A Study Of the First 1000 Shots in the EPL This Year | differentgame

  6. Pingback: Squinting at Ink Blots: On Spurs Superior Defense | Tactical Strikes

  7. Pingback: Where the shooting gets wide of the mark | differentgame

  8. Pingback: Team Styles and Quality of Shot Position | differentgame

  9. Pingback: Two halves of Manchester United’s Defensive season | thebubblegame

  10. Pingback: Taking Chances, Taking a Step Back | differentgame

  11. Pingback: Is scoring ability maintained from season to season? (slight return) | 2+2=11

  12. Pingback: StatsBomb | “Shooooot!” A Paradigm Shift in How We Watch Football

  13. Pingback: Liverpool’s Chance Quality | Bass Tuned To Red

  14. Pingback: StatsBomb | Stoke City’s New Look – the Long and Short of it

  15. Pingback: Second Half Slump? Controlling The Result | Bass Tuned To Red

  16. Pingback: A Big Week in Analytics? | differentgame

  17. Pingback: Manchester United and David Moyed Attacking at Home | The Away Strip

  18. Pingback: A Rough Measure of Shot Quality in Soccer | Jalnichols Blog

  19. Pingback: Manchester City Attacking Stats | The Away Strip

  20. Pingback: Manchester City vs Barcelona - A Video Analysis | The Away Strip

  21. Pingback: Wayne Rooney – A Perfect 10? Part Two | differentgame

  22. Pingback: Sabermetrica e calcio scommesse: quando la matematica aiuta a puntare - Scientificast

  23. Pingback: A New TSR Metric – DZTSR – Sideline Team Talk

  24. Pingback: Det finns inga mirakel att köpa loss |

  25. Pingback: A TSR Variation – DZTSR – CHANCE ANALYTICS

  26. When someone writes an post he/she retains the idea of a user in his/her brain that how a user can be aware of it. So that’s why this article is amazing. Thanks!|

  27. Adam says:

    Couldn't you go further than xA and do xHA, as in hockey assist? I think this is a very underrated stat in all sports besides ice hockey. Crediting the guy who assists the assister would make it easier to track the ball's entry into the xG area.

  28. Pingback: Una Variación del Coeficiente Total de Tiros (CTT) – CTTZP | CHANCE ANALÍTICA

  29. Pingback: Data Sandbox #8: Expected Passes – An Introduction | Trivela & Rabona

  30. Pingback: An Introduction to Expected Saves (xS) | CHANCE ANALYTICS

  31. Pingback: My journey into expected goals (xG) – the way we play
