I used a sample of tracking data from Metrica Sports and resources from Laurie Shaw (@EightyFivePoint) to build a model that identifies off the ball runs and estimates their value added. (Tracking data records each player’s position on the pitch at frequent intervals.)
Why is this useful?
Its main use would be to identify which players offer the most passing options, and how valuable these passing options are. For attacking players, this can be used to identify which players are getting in the best positions, even if they don’t receive a pass or cross: this could be useful for identifying attacking players from smaller teams who could make the step up to a bigger team. These statistics would also be easy to combine with video scouting, as they make it easier to pick out ‘good’ bits of play. It could also be used in opposition scouting, to identify what types of runs opposition players make.
To illustrate, here are some clips of runs which I would like my method of identifying off the ball runs to pick up. These runs both look good because they offer a passing option into the penalty area. Counting passes received in the penalty area only credits a player when the pass is actually played, whereas this model credits the player making the run even if they don’t receive the ball.
Identifying off the ball runs
I was only interested in off the ball runs made while the attacking team is in possession, so I looked at runs made just before a pass is played. I counted a player as making an off the ball run if they ran at a speed greater than 4 m/s for at least 1 second in the 2 second window before a pass. I chose these thresholds because they seemed to identify runs which cover a noticeable amount of distance, and they gave enough runs to look at. However, these thresholds are quite arbitrary and could be changed. During some sequences of possession, several passes happen in a short space of time, so a pass may be played less than 2 seconds after the previous one. Where the 2 second windows before consecutive passes overlap, I merged them into a single time period to avoid double counting runs. Of these runs, I only counted forward runs where both the pass played and the end point of the run are in the opposition half.
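As a rough illustration of the rule above, here is a minimal Python sketch. It is not the code I used for the analysis: the function names, the 25 Hz frame rate and the reading of "at least 1 second" as one continuous spell are all assumptions, and the forward/opposition half filter is left out.

```python
import numpy as np


def merge_windows(pass_times, window=2.0):
    """Build the [t - window, t] interval before each pass and merge overlaps."""
    intervals = sorted((max(t - window, 0.0), t) for t in pass_times)
    merged = []
    for start, end in intervals:
        if merged and start <= merged[-1][1]:
            # Overlaps the previous interval: extend it rather than add a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def made_run(times, speeds, window, fps=25, min_speed=4.0, min_duration=1.0):
    """True if the player exceeds min_speed for min_duration seconds in a row.

    times and speeds are per-frame arrays for one player (seconds, m/s).
    Assumption: the spell must be continuous and frames arrive at a fixed fps.
    """
    start, end = window
    in_window = (times >= start) & (times <= end)
    fast = speeds[in_window] > min_speed
    longest = current = 0
    for is_fast in fast:
        current = current + 1 if is_fast else 0
        longest = max(longest, current)
    return longest / fps >= min_duration
```

Merging the windows first means a player who sprints through two quick passes is counted once per merged period rather than once per pass.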
Estimating the value added of an off the ball run
I estimated the value added of a run using two models: pitch control and expected possession value (EPV). I didn’t build these models myself, but used Laurie Shaw’s code (links at the end). Pitch control estimates the probability that a team will keep the ball if it is passed to a given area of the pitch, based on the positions and velocities of the players. Expected possession value is the probability that a possession will end in a goal, depending on where the ball is. Multiplying EPV by pitch control at a location balances the reward of passing the ball there against the risk of losing it. To estimate the value added of an off the ball run, I multiplied pitch control by EPV at the start and end locations of the run and took the difference.
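The calculation itself is simple once the two surfaces are available. The sketch below assumes, purely for illustration, that pitch control and expected possession value have already been evaluated on the same grid as 2D NumPy arrays, and that the start and end locations have been converted to (row, column) cells. The function name and inputs are hypothetical and are not part of Laurie Shaw’s code.

```python
import numpy as np


def run_value_added(pitch_control, epv, start_cell, end_cell):
    """Pitch control x EPV at the end of the run minus the same at the start.

    pitch_control and epv are 2D arrays of the same shape (assumed inputs);
    start_cell and end_cell are (row, col) grid indices.
    """
    value = pitch_control * epv  # risk-weighted reward at every grid cell
    return float(value[end_cell] - value[start_cell])


# Toy 1 x 2 grid: the run moves from a safe, low-value cell to a riskier,
# higher-value cell.
pitch_control = np.array([[0.5, 0.8]])
epv = np.array([[0.02, 0.10]])
added = run_value_added(pitch_control, epv, (0, 0), (0, 1))
```

In this toy example the start cell is worth 0.5 × 0.02 = 0.01 and the end cell 0.8 × 0.10 = 0.08, so the run adds roughly 0.07.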
The pitch control is calculated at the time the pass was played. If the time period contains more than one pass, I use the pass closest to the end time of the player’s run. I take the end location of a run to be the player’s location shortly after the pass was played, as this is a likely position for the player to have received the ball had it been passed to him or in his direction. It makes sense to compare value added only between players in the same position, as a striker’s run into the penalty area to connect with a cross will have a much greater value added than a midfielder bursting out of his own penalty area to offer an option on a counter attack. One may therefore want to count only runs in the opposition half for strikers but runs all over the pitch for midfielders, to reflect where they perform most of their actions.
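Picking the pass for a merged time period could look something like this. The helper name is hypothetical, and breaking ties in favour of the earlier pass is an assumption.

```python
def closest_pass_time(pass_times, run_end_time):
    """Return the pass time nearest to the end of the run.

    If two passes are equally close, min() keeps the one listed first.
    """
    return min(pass_times, key=lambda t: abs(t - run_end_time))
```

For example, with passes at 10.0, 11.2 and 12.5 seconds and a run ending at 11.0 seconds, the pass at 11.2 seconds is the one used for the pitch control calculation.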
To put the value added statistic into context, the clip below contains 3 of the 5 runs with the highest value added, by players 9, 13 and 12:
The runs by players 9, 13 and 12 add 7.3%, 6.3% and 3.9%, respectively. These are some of the highest figures as they’re all runs into the penalty area. However, the average value added of all off the ball runs is only 0.8%. This is because expected possession value is very low and quite similar in most areas of the pitch outside the penalty area. This is why these stats should only be compared between players playing in the same position. Another way to analyse the runs, which removes this limitation of the EPV model, is to just look at the vertical distance travelled by the player in their off the ball run.
Examples from the data
Here are some examples from the data which my model counts as off the ball runs. The run below is by the red number 7:
This run is good because it creates a lot of space. This model to evaluate off the ball runs could be paired with work done by Andrew Puopolo (@andrew_puopolo) which measures the space created by player runs: https://github.com/andrewsimplebet/FoT-Player-Pitch-Control-Impact
This run is by the red number 5:
I like this example as it captures a third-man run.
The main assumption this model makes is that the defenders would have acted the same way had the player not made that run. This probably isn’t true, but shouldn’t make too big a difference to the pitch control at the player’s start location, and it doesn’t affect the expected possession value calculation. As I’ve done this using a small sample of anonymized data, I can’t identify which players would score highly for off the ball runs and value added, so I can’t perform a ‘common sense’ check of my model. Better event data would make it easier to identify more off the ball runs. For example, the event data used doesn’t record dribbles or carries, so I can’t see if a player is offering a passing option by making a run when their teammate is dribbling with the ball. Another consequence is that a player dribbling or carrying the ball is sometimes classified as making an off the ball run.
I’ve watched clips of a lot of the runs picked up by the model, and many of them are uninteresting and don’t offer much value (another reason why the average value added is so low). Often this is because the player is simply moving up the pitch with his team and is too far away from the ball to be a viable passing option. To highlight more interesting runs, those with value added above a certain threshold could be classed as ‘progressive’ off the ball runs, and looking only at these could filter out a lot of uninteresting runs.
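A threshold filter like this is only a few lines. In the sketch below the record layout and the 2% cut-off are placeholders that I haven’t tuned. The first three values are the figures quoted earlier for players 9, 13 and 12, and the last record is a made-up low-value run.

```python
def progressive_runs(runs, threshold=0.02):
    """Keep only runs whose value added exceeds the threshold.

    runs is a list of dicts with a "value_added" key (assumed layout);
    the default threshold is an arbitrary placeholder.
    """
    return [run for run in runs if run["value_added"] > threshold]


runs = [
    {"player": 9, "value_added": 0.073},
    {"player": 13, "value_added": 0.063},
    {"player": 12, "value_added": 0.039},
    {"player": 7, "value_added": 0.005},  # made-up low-value run
]
kept = progressive_runs(runs)
```

Here the three penalty area runs are kept and the low-value run is dropped. In practice the threshold would probably need to differ by position, for the same reason value added should only be compared within a position.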
I think the method I used to identify and value off the ball runs would be useful in a player recruitment and opposition analysis context, as shown by some of the examples I’ve included. However, there are definitely areas for improvement or other ways to assess the off ball runs, e.g., finding a way to filter out the low value uninteresting runs and looking at yards gained by the run.
Metrica tracking data: https://github.com/metrica-sports/sample-data
Laurie Shaw’s code for pitch control & EPV models: https://github.com/Friends-of-Tracking-Data-FoTD/LaurieOnTracking
Note: To extrapolate the end location of a player’s run, I calculate the average time taken to make a pass by that player’s team, then see what their location is that amount of time after they stop running at the required speed.
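A minimal sketch of that extrapolation, under a few assumptions: that "time taken to make a pass" means the gap between a pass event’s start and end times, that the player’s positions are held in a per-frame array, and that the lookup falls back to the last available frame if the target time runs past the end of the data. All names here are hypothetical.

```python
import numpy as np


def mean_pass_duration(pass_start_times, pass_end_times):
    """Average time in seconds between a pass being played and it ending."""
    starts = np.asarray(pass_start_times, dtype=float)
    ends = np.asarray(pass_end_times, dtype=float)
    return float(np.mean(ends - starts))


def run_end_location(times, xy, run_stop_time, pass_duration):
    """Player position pass_duration seconds after the run stops.

    times is a sorted array of frame times and xy an (n_frames, 2) array of
    positions. Uses the first frame at or after the target time, clamped to
    the last frame.
    """
    target = run_stop_time + pass_duration
    idx = min(int(np.searchsorted(times, target)), len(times) - 1)
    return xy[idx]
```

With frames one second apart and an average pass duration of 2 seconds, a run that stops at 1 second would be given the player’s position at 3 seconds as its end location.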