Thursday, 14 August 2014

Fantasy Premier League: An alternate approach

Once we've got the basics right, it's time to take FPL gameplay to the next level and win some leagues. Many FPL players already spend a good amount of time each week scouring the internet for news, following managers' every word, scanning fixture lists and poring over the wealth of player statistics now available, all to make good choices for their teams. There are also some good resources for tracking price changes and bonus points.

I want to try a more direct approach. From start to finish, 691 players were listed in FPL at some point last season (this includes players who left after the season started, like Gareth Bale). Inspired by Bill's post on mining and analysing FPL data, I went ahead and got the players' numbers into formats that I found comfortable, and did some basic crunching. I won't bore you with programming details, but I will share some fascinating results.

Strong and weak teams

We all have a general idea about strong and weak teams from following the game, but let's look at things from the points perspective. I use two approaches to finding out where the points are going to come from - points scored and points allowed.

First up is points allowed. Taking the average FPL points each team allows an opposing player per 90 minutes and sorting the teams in descending order gives something that looks like:

Average fantasy points allowed per player per 90 minutes

You will note that this bears an eerie resemblance to last season's Premier League table in reverse. This drives home the point that the opposition, and how generous they are, matters: Cardiff are almost twice as generous as Manchester City on average.
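For anyone who wants to reproduce a ranking like this, here is a minimal Python sketch. It assumes the data has been flattened into one (opponent, minutes, points) tuple per player per match; that layout and the function name are hypothetical, chosen only for illustration. It pools minutes across all opposing players, which is one reasonable reading of "per player per 90 minutes" and not necessarily the exact calculation behind the chart.

```python
from collections import defaultdict

def points_allowed_per_90(appearances):
    """Return {opponent: FPL points allowed per player per 90 minutes}.

    `appearances` is an iterable of (opponent, minutes, points) tuples,
    one per player per match, where `opponent` is the team the player faced.
    (This input layout is an assumption made for the sketch.)
    """
    minutes = defaultdict(int)
    points = defaultdict(int)
    for opponent, mins, pts in appearances:
        if mins <= 0:
            continue  # players who never came on contribute nothing
        minutes[opponent] += mins
        points[opponent] += pts
    # Scale total points to a 90-minute rate for each opponent
    rates = {team: 90 * points[team] / minutes[team] for team in minutes}
    # Most generous opposition first
    return dict(sorted(rates.items(), key=lambda kv: kv[1], reverse=True))
```

Swapping the opponent for the player's own team in each tuple gives the matching "points scored" rate.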

Of course, the average isn't everything. Teams perform differently home and away, although some teams' home and away performances aren't that far from their average (Aston Villa, Manchester United).

Fantasy points allowed per player per 90 mins - Home, Away and Average

Then there are the points scored by each team, per player per 90 minutes:

Average fantasy points scored per player per 90 minutes

Average fantasy points scored per player per 90 minutes - Home, Away and Average

It's debatable whether micromanaging in this way is good, as going into too much depth runs the risk of overfitting. It is probably best to ignore exceptions and consider the average along with a home/away factor, as the curves run more or less along the same paths. Fitting a linear model to the numbers yields the following relations, with R-squared values above 0.89 and randomly scattered residuals (meaning the relations are more or less reliable):

score_home = 1.11293*score_avg - 0.09057
score_away = 0.88620*score_avg + 0.09291

allow_home = 0.9961*allow_avg - 0.3220
allow_away = 1.0037*allow_avg + 0.3226
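For readers who want to repeat this kind of fit on their own numbers, a straight-line fit needs nothing beyond plain Python. The sketch below is ordinary least squares with one predictor; `fit_line` is a name made up for this example, and it is not necessarily the tool that produced the coefficients above.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two or more paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must both vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # With a single predictor, R-squared is the squared correlation
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared
```

Passing one average per team as `xs` and the matching home (or away) figure as `ys` would give a slope, an intercept and an R-squared to compare with the relations listed here.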

I feel it is best to stick to the average and adjust it for a home or away game, rather than use separate home/away numbers, when deciding which teams to pick points from. The fixtures that FPL players should focus on are those that combine both scoring and allowing: the ideal team to select players from is one that scores well and is facing a leaky opponent.
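As a rough illustration of adjusting the average, the fitted relations can be wrapped in a couple of small Python helpers. The coefficients are copied from the relations above; the function names are hypothetical. The sketch also assumes that allow_home describes what a team allows when it plays at home, so a side playing at home faces the opponent's away figure.

```python
# (slope, intercept) pairs copied from the fitted relations in this post
SCORE = {"home": (1.11293, -0.09057), "away": (0.88620, 0.09291)}
ALLOW = {"home": (0.9961, -0.3220), "away": (1.0037, 0.3226)}

def adjust(avg, venue, coefficients):
    """Turn a season average into a home or away estimate."""
    slope, intercept = coefficients[venue]  # KeyError if venue is misspelt
    return slope * avg + intercept

def fixture_outlook(team_score_avg, opp_allow_avg, team_venue):
    """Return (expected points scored, opponent's expected points allowed).

    Assumption: the opponent's venue is the opposite of the team's venue.
    """
    opp_venue = "away" if team_venue == "home" else "home"
    return (adjust(team_score_avg, team_venue, SCORE),
            adjust(opp_allow_avg, opp_venue, ALLOW))
```

A fixture looks attractive when both numbers in the returned pair are high. How best to blend the two into a single score is left open here.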

Keep in mind, however, that 'leakiness' in this context is determined much more by the opposition's lack of ability to score than how many they concede. Goal points only go to the scorer and, if applicable, the assister. A clean sheet adds 4 points to all the defenders and keepers, as well as an extra point for the midfielders. So merely failing to score dishes out a bigger total to the opposition than conceding 3.
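To put rough numbers on that, here is a small Python sketch of the arithmetic. It uses the standard FPL values for goals (4 for a forward, 5 for a midfielder, 6 for a defender or keeper) and assists (3), alongside the clean-sheet points mentioned above. The 4-4-2 lineup is an assumption for the example, and bonus points, substitutes and the minutes needed to earn a clean sheet are all ignored.

```python
# FPL scoring values; bonus points and minutes played are ignored here
CLEAN_SHEET = {"GK": 4, "DEF": 4, "MID": 1, "FWD": 0}
GOAL = {"GK": 6, "DEF": 6, "MID": 5, "FWD": 4}
ASSIST = 3

def clean_sheet_total(formation):
    """Total clean-sheet points for a lineup given as {position: count}."""
    return sum(CLEAN_SHEET[pos] * count for pos, count in formation.items())

def goals_total(scorer_positions, assisted=True):
    """Total goal (and optionally assist) points for a list of scorers."""
    bonus = ASSIST if assisted else 0
    return sum(GOAL[pos] + bonus for pos in scorer_positions)

# Assumed lineup for illustration: a standard 4-4-2
four_four_two = {"GK": 1, "DEF": 4, "MID": 4, "FWD": 2}

print(clean_sheet_total(four_four_two))    # 24
print(goals_total(["FWD", "FWD", "FWD"]))  # 21
```

So a blank from the opposition hands a 4-4-2 side 24 points, against 21 for three assisted goals from forwards. The gap depends on who scores: three assisted goals from midfielders would also come to 24, so treat this as a typical case and not a rule.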

And that makes the metric... not too helpful. To make some really good choices, we need to know the friendliness of opposition towards particular player positions, as well as the share of a team's points scored by position. More on this in the next post.

(It is essential to play the season with recent data, so the numbers need to be progressively monitored as the season goes on without completely ignoring how the players did last season. Also, the 2013-14 data can't be accessed through the FPL API anymore following their update, so if you want the numbers and want to experiment, help yourself. If you want data organized differently, drop me a line and I'll see what I can do.)