It’s probably best if I explain the ‘why’ of this piece with a series of old tweets. If you like the relentless clapping of a demented seal, then please follow me on said Twitter medium:
Scenario: the manager now wants a central midfielder who is going to want to get on the ball and has a good range of passing – not just someone who will play it safe. Central midfield has been a problem for too long.
So, I’m a data guy. I think data can help football scouting. A lot. I also wouldn’t buy a player without watching him. A lot.
Going to watch footballers is an expensive, time-consuming business. Premier League teams might have contacts in several countries around Europe (and maybe South America) who flag up players of interest for scouts to follow up. Bigger Championship teams may have similar. But in general, I don’t think physical scouting networks are as extensive as fans may think.
Most clubs have ‘technical’ scouts now too. These guys have access to video footage from across the globe through platforms such as Wyscout. It’s transformed the business in a lot of ways due to its sheer scope. The football world is a smaller place because of it.
But with so many players to watch, time is still at a premium. Technical scouts also have access to data. Raw football data is not cheap for a club to buy. It also needs specialist knowledge to turn it into insight. This costs more time and money. So a lot of the time the data is packaged up for clubs, but not in a particularly useful way.
With the raw data you can build your own passing model. Unlike your human scout, it can analyse ALL the passes each player made. It uses detailed information on every one of them: where it was from, where it went, who it went to. It’s more nuanced than pass completion % or final third completion.
Better still, it applies the same rules to everyone. There’s none of the bias towards certain players that you get from human eyes that like what they’re seeing.
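To give a flavour of what "analysing all the passes" means, here is a minimal Python sketch. It is not my actual model, just an illustration: the `Pass` fields, the `passing_profiles` function and the 30-metre "long pass" cut-off are all assumptions made up for this example, not any data provider's format.

```python
from collections import defaultdict
from dataclasses import dataclass
from math import hypot


@dataclass
class Pass:
    """One pass event. Field names are hypothetical, not a real provider's schema."""
    player: str
    start_x: float   # where the pass was played from, in metres
    start_y: float
    end_x: float     # where the pass ended up, in metres
    end_y: float
    completed: bool


def passing_profiles(passes, long_pass_m=30.0):
    """Summarise every pass per player, applying identical rules to everyone."""
    by_player = defaultdict(list)
    for p in passes:
        by_player[p.player].append(p)

    profiles = {}
    for player, events in by_player.items():
        # Straight-line distance of each pass from start point to end point.
        lengths = [hypot(e.end_x - e.start_x, e.end_y - e.start_y) for e in events]
        n = len(events)
        profiles[player] = {
            "passes": n,
            "completion": sum(e.completed for e in events) / n,
            "avg_length_m": sum(lengths) / n,
            # Share of passes at or above the (assumed) long-pass threshold.
            "long_share": sum(length >= long_pass_m for length in lengths) / n,
        }
    return profiles


# Made-up events for two anonymous players.
sample = [
    Pass("A", 50, 34, 80, 10, True),
    Pass("A", 40, 30, 45, 30, True),
    Pass("B", 30, 20, 33, 24, False),
]
profiles = passing_profiles(sample)
```

A real model would go a lot further than this (who received the ball, how hard the pass was, what happened next), but the point stands: every player is measured on every pass with the same yardstick.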
So I’ve applied the model to the following leagues across Europe for the last 18 months: Premier League, Championship, Bundesliga, Bundesliga II, La Liga, Serie A, French Ligue 1, Dutch Eredivisie, Portuguese Liga NOS, Turkish Super Lig, Russian Premier League and Swedish Allsvenskan.
If you filter the model so it picks out the players who pass in similar patterns to the likes of Toni Kroos, Luka Modric, Cesc Fabregas etc., it throws up well over 200 players across these leagues. Rooney, this year, is one of them. And the model agrees with you that he’s not done it nearly as well as those three. You think that a lot of this simply comes down to his physical decline.
The manager wants someone who can do what Rooney does, but who’s a lot younger with the potential to get better. But the kid has to already have ‘done it’ and played that role. He’s got to have had the minutes on the pitch and the volume of passes under his belt to prove it. He needs to be part of the first team squad from the off.
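That brief boils down to a simple filter on the model output. A rough sketch in Python, under stated assumptions: the `Candidate` fields and the `shortlist` function are hypothetical, the age limit of 25 is the one I use in this piece, and the minutes and passes thresholds are placeholder numbers rather than the ones I actually applied.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical row of model output; field names are illustrative only."""
    name: str
    age: int
    minutes: int   # league minutes over the period considered
    passes: int    # passes attempted over the same period


def shortlist(candidates, max_age=25, min_minutes=1800, min_passes=1000):
    """Keep young players with enough playing time and passing volume.

    max_age follows the article; min_minutes and min_passes are assumed values.
    """
    kept = [
        c for c in candidates
        if c.age <= max_age and c.minutes >= min_minutes and c.passes >= min_passes
    ]
    # Youngest first, then most minutes among players of the same age.
    return sorted(kept, key=lambda c: (c.age, -c.minutes))


# Made-up pool of anonymous players.
pool = [
    Candidate("Player A", 22, 2700, 1900),
    Candidate("Player B", 31, 3000, 2400),   # too old
    Candidate("Player C", 20, 600, 350),     # not enough minutes yet
    Candidate("Player D", 24, 2100, 1500),
]
names = [c.name for c in shortlist(pool)]
```

Where you set the thresholds is a judgment call: set them too high and you only find players everyone already knows about, too low and you get kids with a handful of decent cameos.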
So you filter again for players no older than 25 who have sprayed the ball about from midfield in the last two seasons. And these are the 38 players my model chucks up (Club column is who currently owns the player):
Some of these guys are household names that Everton aren’t going to get. They’re already at Champions League clubs or have already secured moves to one ahead of the summer. Several are already on loan to bigger clubs who have options to buy. Some are in leagues you’re not sure you can trust.
The rest your scouts can go and get some eyes on. The data’s saved the club some time and money by focussing the men on the ground.
This blog is dedicated to the many people who have said the following to me: “Who do you like? Who should we be after?” The honest answer is that I rarely watch football outside the Prem these days, so I don’t know anymore.
But I know a spreadsheet that does.