I get loads of queries asking for xG values for teams or players. When I started all this nearly 5 years ago, collecting enough shot position data to build a model took months' worth of effort. Today, you can get a decent enough equivalent in 20 minutes for a ton of leagues.
How?
Have you got Excel? If not, sign up to Google Sheets for free. The only other thing you need is to go to Whoscored.
Open up Whoscored and go to the Premier League section
Click on the ‘Team Statistics’ tab
Change the drop down to season 2015/16
Under the ‘Premier League Team Statistics’ heading you’ll see the ‘Detailed’ tab. Click it.
It should default to ‘Shots’, ‘Zones’ and ‘Per Game’ in the dropdowns. Change ‘Per Game’ to ‘Totals’.
Highlight the table as shown below and copy it:
Open your spreadsheet. Paste it in, then add some headings something like this:
Back on Whoscored, change the ‘Shots’ dropdown to ‘Goals’. Copy and paste the data in the same way into another worksheet. When you come to label the columns you’ll see that the Out of box, Six yard box and Penalty Area columns are in a different order to the ‘Shots’ data. You need to rectify this in your spreadsheet so both your shots worksheet and goals worksheet are in the same order.
You need to repeat this process for each season to give yourself a good sample size. I went 5 seasons back in a few minutes. Do it all in the same worksheets. Do 5 seasons and you should have 101 rows of data in each of your shots and goals worksheets (a heading row plus 20 teams for each of the 5 seasons).
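If you'd rather do the stacking and column matching outside a spreadsheet, here is a minimal Python sketch of the same idea. The column orders, team names and every number in it are made up for illustration (they are assumptions, not Whoscored's actual layout or real data), so check the orders against your own paste.

```python
# Column order as pasted from each view. Both orders are assumptions for
# illustration only; check them against what you actually copied.
SHOTS_COLUMNS = ["team", "total", "out_of_box", "six_yard_box", "penalty_area"]
GOALS_COLUMNS = ["team", "total", "six_yard_box", "penalty_area", "out_of_box"]
ZONES = ["out_of_box", "six_yard_box", "penalty_area"]


def label_rows(rows, columns):
    """Turn pasted rows (lists of cells) into dicts keyed by column name."""
    return [dict(zip(columns, row)) for row in rows]


def stack_seasons(seasons, columns):
    """Combine several seasons into one list, tagging each row with its season."""
    stacked = []
    for season, rows in seasons.items():
        for record in label_rows(rows, columns):
            record["season"] = season
            stacked.append(record)
    return stacked


# Made-up numbers for two hypothetical teams over two seasons.
shots_by_season = {
    "2014/15": [["Team A", 500, 200, 40, 260], ["Team B", 400, 180, 30, 190]],
    "2015/16": [["Team A", 520, 210, 45, 265], ["Team B", 410, 170, 35, 205]],
}
goals_by_season = {
    "2014/15": [["Team A", 60, 12, 42, 6], ["Team B", 45, 9, 31, 5]],
    "2015/16": [["Team A", 65, 14, 44, 7], ["Team B", 48, 10, 33, 5]],
}

shots = stack_seasons(shots_by_season, SHOTS_COLUMNS)
goals = stack_seasons(goals_by_season, GOALS_COLUMNS)

# Because every row is keyed by column name, the differing column order
# between the two views no longer matters when summing by zone.
shot_totals = {zone: sum(row[zone] for row in shots) for zone in ZONES}
goal_totals = {zone: sum(row[zone] for row in goals) for zone in ZONES}
print(shot_totals)
print(goal_totals)
```

Keying each row by name rather than by position is what stops the ‘Shots’ and ‘Goals’ orders getting mixed up, which is the same job the relabelling does in the spreadsheet.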
If you add up the relevant columns in each worksheet you can then work out how often shots from each of the three locations are converted into goals. It should look something like this:
As you can see on the screenshot I’ve applied the model to the shots Pogba and Aguero have taken to get a total xG figure. Click into any player’s ‘Detailed’ on Whoscored and you can find their shot breakdown (just remember to turn the dropdown to ‘Total’ rather than ‘Per Game’).
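For anyone who prefers code to a spreadsheet, the arithmetic boils down to two steps: divide goals by shots for each of the three locations, then multiply a player's shots from each location by that rate and add them up. A rough sketch follows; the totals and the player's shot counts are invented for illustration and are not real Premier League or Whoscored figures.

```python
def conversion_rates(shot_totals, goal_totals):
    """Goals divided by shots for each location."""
    return {zone: goal_totals[zone] / shot_totals[zone] for zone in shot_totals}


def expected_goals(player_shots, rates):
    """Sum of (shots from a location) x (conversion rate for that location)."""
    return sum(count * rates[zone] for zone, count in player_shots.items())


# Invented column totals, standing in for the sums from the two worksheets.
shot_totals = {"out_of_box": 8000, "six_yard_box": 1500, "penalty_area": 10000}
goal_totals = {"out_of_box": 280, "six_yard_box": 480, "penalty_area": 1300}

rates = conversion_rates(shot_totals, goal_totals)

# Invented shot breakdown for a hypothetical player.
player_shots = {"out_of_box": 40, "six_yard_box": 5, "penalty_area": 55}

xg = expected_goals(player_shots, rates)
print(rates)
print(round(xg, 2))
```

Swap in your own column totals and a player's ‘Detailed’ shot breakdown and you get the same total xG figure as the spreadsheet would give you.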
As I say, you can repeat this for any league you want that Whoscored carries detailed data for. Ok, so it’s a really basic xG model, but it will be in the right ballpark. And remember, it took me months to do something similar when I first created SPAM, so just be grateful!
And you never EVER need to ask me for xG data again x
Follow me on Twitter @footballfactman
*UPDATE*
This is a pretty good idea:
Change the sub category to ‘Situations’ to find penalty data so you can alter accordingly…
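One way to "alter accordingly" is sketched below, and it is only my reading of the suggestion: take penalties out of the penalty area totals and give them their own conversion rate. It assumes penalties are counted inside the ‘Penalty Area’ zone figures, which you should check before relying on it. All numbers are invented.

```python
def split_out_penalties(shot_totals, goal_totals, penalty_shots, penalty_goals):
    """Move penalties out of the penalty-area totals into their own category.

    Assumes penalties are included in the 'penalty_area' figures.
    Returns new dicts and leaves the inputs untouched.
    """
    shots = dict(shot_totals)
    goals = dict(goal_totals)
    shots["penalty_area"] -= penalty_shots
    goals["penalty_area"] -= penalty_goals
    shots["penalty"] = penalty_shots
    goals["penalty"] = penalty_goals
    return shots, goals


# Invented totals, plus invented penalty counts from the 'Situations' view.
shot_totals = {"out_of_box": 8000, "six_yard_box": 1500, "penalty_area": 10000}
goal_totals = {"out_of_box": 280, "six_yard_box": 480, "penalty_area": 1300}

adj_shots, adj_goals = split_out_penalties(
    shot_totals, goal_totals, penalty_shots=400, penalty_goals=320
)
adj_rates = {zone: adj_goals[zone] / adj_shots[zone] for zone in adj_shots}
print(adj_rates)
```

With penalties separated, a player's penalties need the same treatment: subtract them from his penalty area shots and multiply them by the penalty rate instead.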
8 responses to “An xG Model for Everyone in 20 minutes (ish)”
[…] concept of expected goals (xG). For the thirty players examined, I applied the methodology described here. The limited dataset is an improvement, since in the original the misses of […]
[…] “An xG Model for Everyone in 20 minutes (ish)” by Paul Riley […]
[…] Apart from the new contestants, I also decided to include Paul Riley’s model from his blog ‘An xG model for everyone in 20 minutes’ to see how it performed. For an explanation of how the ‘perfect’ model is created, please read […]
[…] by Paul Riley, An xG Model for Everyone in 20 minutes (ish), posted on a differentgame blog April 29, 2017 that describes how it is possible to use the data available from the three pitch grids of […]
[…] An expected goals model in 20 minutes (ish) – what I used to create my xG model for the Championship […]
[…] data analytics can be really tricky and intimidating, so that is why I am so grateful for this post by Paul Riley( @footballfactman on twitter). So pretty much to start off my experience with data, I […]
[…] Riley’s xG model: https://differentgame.wordpress.com/2017/04/29/an-xg-model-for-everyone-in-20-minutes-ish/ WhoScored: […]
[…] post (and it’s title) was inspired by this post, one of the first that I came across when I first took an interest in football analytics. I […]