December 18, 2015

A New Way to Look at Transfers, Teams and Their Environments?

Felipe Anderson, Sadio Mane, Antoine Griezmann, Jamie Vardy, Romelu Lukaku, Riyad Mahrez. These are names that fans around the world are becoming more familiar with this season. Because of their glowing performances, fans of other clubs covet these players and find rationalizations for why any of them should be in their team. However, it seems that as football fans we are leaving out many factors.

What we wish to explore today is what it really means to sign a high performer from another club, and why it may not be a good idea at all.


Usually, when the football community witnesses a high performer maintaining good form over a period of time, the discussion starts about which club the player would fit best. It becomes an ongoing competition and fundamentally the logic goes as follows:

“Player X performs well at Team A and therefore, will fill a need at Team B” (assumptive example)

On the surface, it’s an attractive way to think. You see a player you fancy and his qualities seemingly address a need in your team. Why not advocate his positives to others? The question hits precisely at the problem, however.

Most of the time, when fans look at players in greener pastures, they observe the positives and, in some ways, completely ignore the negatives. This is how you get a “Torres to Chelsea” situation, where the expectations are high and the margin for error is low. Juan Sebastian Veron’s move to Manchester United is another example of what’s been discussed so far.

The Problem:

Clearly the assumptive example above is not good enough, and we need a more expansive way of thinking which incorporates more variables into the equation. From a systems point of view, let’s first break down why the assumptive example doesn’t work.

Suppose we have two systems, each representing a team (Team A and Team B). Each system is made up of components: in our case, the players. Under the assumptive example above, if we transfer one player (player X) from Team A to Team B, we are assuming the player will have a similar impact at Team B as he did at Team A. We are also assuming that the player (a variable) carries the same weight in Team B as he did in Team A, and that the rest of the variables (players) remain unchanged and keep the same weights.

This way of thinking is incorrect for several reasons:

  • Football is a dynamic sport, so one small change can have a significant impact on the flow of the game. An example would be Sir Alex Ferguson bringing on Javier Hernandez or Ole Gunnar Solskjaer in the hope of nicking a late goal. Each of these players makes his own contribution, which affects teammates and opposition alike because they now have a new situation to react to.
  • Because football is dynamic, it is wrong to assume Team B increases its overall strength by the same amount player X contributed to Team A; that would require ceteris paribus (all other things held constant) to hold. In most cases, players move between environments that share some similarities but are rarely ever the same.
  • It fails to account for various aspects, such as: interactions with other players, level of pressure at the new club, level of expectations, tactical environment (role/position), fit with the new manager, integration period and the overall interaction between these aspects (e.g. Angel Di Maria and Louis van Gaal).

New Model:

I spoke with @TheBerimbolo about how we can better model this phenomenon. My original idea was to view it in the context of the transitive property in math. @TheBerimbolo pointed out that this was not a correct use of the transitive property and proposed something else.

Below is a description of a general linear model. More specifically, this kind of regression analysis helps one understand how the typical value of the dependent variable (or 'criterion variable') changes when any one of the independent variables is varied while the other independent variables are held fixed.

We are using this model for simplicity, as a means of basic explanation, and will seek to improve its accuracy and precision later on.

Y1 represents the output of Team A’s success (total contribution from sum of players, type of manager, tactics and overall environment) and Y2 represents the output of Team B’s success (same factors as Team A).

Xi represents the player as a variable within the system

Ai represents Team A’s regression coefficient tied to Xi

Bi represents Team B’s regression coefficient tied to Xi

Basic equation

Y1 = A1X1 + A2X2 + A3X3 +………..+ AnXn

Y2 = B1X1 + B2X2 + B3X3 +………..+ BnXn
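As a rough illustration of the two basic equations, here is a minimal Python sketch. The function name team_output and every number in it are made up for the example; they are not estimates from real data. Only the first coefficient differs between the two teams, to isolate what happens when the same input carries a different weight in a different environment.

```python
def team_output(coefficients, inputs):
    """Basic linear model: Y = c1*X1 + c2*X2 + ... + cn*Xn."""
    if len(coefficients) != len(inputs):
        raise ValueError("need exactly one coefficient per input")
    return sum(c * x for c, x in zip(coefficients, inputs))


# Hypothetical inputs X1..X3 (illustrative numbers, not real data).
X = [1.0, 0.5, 0.25]

# Hypothetical regression coefficients. Only the weight on X1 differs.
A = [2.0, 1.0, 4.0]  # Team A's coefficients A1..A3
B = [0.5, 1.0, 4.0]  # Team B's coefficients B1..B3

Y1 = team_output(A, X)  # 2.0*1.0 + 1.0*0.5 + 4.0*0.25
Y2 = team_output(B, X)  # 0.5*1.0 + 1.0*0.5 + 4.0*0.25

print(Y1, Y2)  # prints: 3.5 2.0
```

With identical inputs, the outputs differ only because X1 is weighted at 2.0 in one system and 0.5 in the other.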

Equation accounting for interaction terms

Interaction Term - measures the relative effect of two or more factors working together towards the team's overall success

Subscripts i and j identify the two factors in the interaction term (i != j)

Subscript k identifies the regression coefficient (Ak or Bk) attached to the interaction term and is a separate index from the player subscripts

Y1 = A1X1 + A2X2 + A3X3 +………..+ AnXn +………..+ Ak(Xi*Xj) where i != j

Y2 = B1X1 + B2X2 + B3X3 +………..+ BnXn +………..+ Bk(Xi*Xj) where i != j

Interaction term example

*in this example, assume Team A = Manchester United

Y = A(Ander) + A1(Ander*Mata), where A, A1 > 0

The above equation says that Ander Herrera by himself makes the team better, and so does the combination play between him and Juan Mata. It allows us to pick out key partnerships within a team when measuring on-field performance and to measure how much they contribute to the team’s overall success.

There is no term for Mata because the other two terms are assumed to be larger contributions to the team’s overall success, which makes the individual term for Mata negligible. It would look different if Manchester United were built around Mata, but that is unfortunately not the case.

An additional example would be:

Y = A(Ozil) + A1(Ozil*Sanchez) + A2(Sanchez), where A, A1, A2 > 0
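The interaction examples above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each player is coded as 1 when he is on the pitch and 0 when he is not (our own simplification), and the function name pair_output and all coefficient values are hypothetical.

```python
def pair_output(a_i, a_j, a_ij, x_i, x_j):
    """Two-player model: Y = a_i*x_i + a_j*x_j + a_ij*(x_i*x_j)."""
    return a_i * x_i + a_j * x_j + a_ij * (x_i * x_j)


# Hypothetical coefficients for the Herrera/Mata example.
A_ANDER = 1.0  # Herrera's individual term
A_MATA = 0.0   # no individual term for Mata, mirroring the equation above
A_PAIR = 2.0   # coefficient on the interaction term (Ander*Mata)

# Assumed coding: 1 = player on the pitch, 0 = player absent.
both = pair_output(A_ANDER, A_MATA, A_PAIR, 1, 1)
ander_only = pair_output(A_ANDER, A_MATA, A_PAIR, 1, 0)
mata_only = pair_output(A_ANDER, A_MATA, A_PAIR, 0, 1)

# What the pair adds beyond the two players taken separately.
partnership_bonus = both - (ander_only + mata_only)

print(both, ander_only, mata_only, partnership_bonus)
```

The interaction term only switches on when both players are involved, so partnership_bonus comes out equal to the interaction coefficient. Giving Mata a positive individual coefficient as well would produce the Ozil/Sanchez form of the equation.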

Key Points:

  • When a player transfers from Team A to Team B, fans tend to assume the player’s relative contribution to Team B’s success remains the same. This means a variable (Xi) is added to the initial equation for Team B, but that variable (the player, in this case) keeps the regression coefficient (Ai) from Team A. We addressed earlier why this isn’t the right approach.
  • By using Ai instead of Bi, we ignore the integration effects related to Team B’s environment. Fans are making value judgments based on player X’s contribution to Team A but expecting them to materialize at Team B, whose environment is different from Team A’s.
  • To further drive the point home, in most cases we expect Ai != Bi, which again highlights that the two teams’ environments are not as comparable as we would like to think. Depending on how player X adjusts to Team B, Bi could take a small or large value, positive or negative. It could also stay roughly the same, which implies a relatively smooth transition period, something Manchester United haven’t known much about recently.
  • For the Interaction Term example, a reminder: the term captures the effect of Herrera and Mata in combination, over and above the sum of their individual effects.
    In other words: Ander and Mata together > Ander + Mata taken separately
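To make the Ai versus Bi point concrete, here is a small Python sketch of the gap between what fans project and what the player delivers under a different coefficient. The function name projection_gap, the scenario labels and all the values are hypothetical, chosen only to cover the cases listed above (Bi smaller, negative, larger, or roughly equal to Ai).

```python
def projection_gap(a_i, b_i, x_i):
    """Fan projection (a_i * x_i) minus the contribution under Team B's coefficient (b_i * x_i)."""
    return a_i * x_i - b_i * x_i


A_I = 2.0  # hypothetical coefficient for player X at Team A
X_I = 1.0  # hypothetical input for player X

# Hypothetical values Bi might take after the move.
scenarios = {
    "smooth transition": 2.0,  # Bi roughly equal to Ai
    "partial fit": 0.5,        # Bi positive but smaller
    "poor fit": -1.0,          # Bi negative
    "better fit": 3.0,         # Bi larger than Ai
}

gaps = {name: projection_gap(A_I, b_i, X_I) for name, b_i in scenarios.items()}

for name, gap in gaps.items():
    print(f"{name}: fans overestimate by {gap:+.1f}")
```

A positive gap means the fan projection was too optimistic, zero means the move went as assumed, and a negative gap means the player turned out to be worth more in the new environment than he was in the old one.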

Final Remarks:

One thing we've often seen as Manchester United fans is a comparison between how players performed under Louis van Gaal and how they performed after they left. As it happens, Javier Hernandez and Angel Di Maria in particular have found clubs that suit them better than the environment they had at Manchester United.

However, in the case of Javier Hernandez, this does not mean Manchester United were wrong to sell him. Hernandez was looking for regular football, and Louis van Gaal couldn't promise that with Wayne Rooney as his captain and set to lead the line in the number 9 role.

Fortunately for Hernandez, he found a club which suits his skills and provides a stable environment for him to enjoy regular playing time. Manchester United fans are quick to let you know how well it's going for him, and some assume that if he were at Old Trafford, he would be doing the same.

Based on what we've discussed in this article and how van Gaal likes his United team to play, we argue against this type of logic. There is no guarantee he would perform better, worse or the same if he were back at Manchester United. On the balance of probabilities, and given what we now know about van Gaal's tenure, he would most likely be performing worse than he is at Bayer Leverkusen.

This is our first step toward measuring how a player’s performance translates from one environment (club) to the next. We recognize we can use more sophisticated modeling to improve accuracy and precision. However, our priority at this moment is to provide a basic introduction to a topic we think could be very interesting in the realm of data analytics. Thanks for reading and we hope you look out for our additional content!

Follow me at @EddieTrulyReds and my co-author @TheBerimbolo
