Player Clustering - Advanced Defence v Position

August 23, 2020

Defence v Position

Defence v Position (or DVP) is a common tool used throughout the fantasy sports industry. Its importance varies depending on which sport you are playing. Sports like NBA or NFL lend themselves very well to DVP, as there are specific one-on-one matchups that are predictable from one game to the next and players’ roles are highly consistent. These sports also have a wealth of publicly available statistical data, which helps the accuracy of models.

AFL is different. Players rotate through a variety of positions in a single game, especially midfielders, who may play forward to differing degrees from game to game. The AFL also have a habit of keeping the more interesting stats under lock and key, ensuring the general public only have the base stats to work with. Because of this, service providers in Australia have typically worked with the positional categories supplied by the fantasy platforms themselves when developing their DVP tools. Here at DFSAustralia we’ve historically reverted to the official AFL Player Rating Positions (manually adjusted from time to time) to try to be as accurate as possible. However, what if there was a better way?

Not All Defenders Are Built Equal

Grouping players based purely on their positional classification is easy to do but isn’t optimal. A rebounding defender like Jayden Short or Bachar Houli is going to receive a different DVP boost than a less attacking defender like Noah Balta in the same game, purely because of their playing styles and how the opposition play. Knowing this, we attempted to generate custom positions by means of machine learning, based on the publicly available data.

Custom Positional Grouping

We extracted all the player stats from the start of the 2019 season up until the completion of 2020 round 11 and removed players who had averaged less than 65 fantasy points over that time. There were two main reasons for this: (1) to decrease the number of data points, as this type of analysis can take quite some time to run, and (2) to focus on the players more likely to be selected in your fantasy sides. We then further filtered the data to consider only the last 20 games of each player and removed any players who had played fewer than 10 games over that time.
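For readers who want to reproduce the filtering, here is a minimal sketch of the three steps above. It is not our production code: the row layout and the key names `player`, `game_order` and `fantasy_points` are assumptions made for illustration, and the sample rows are made up.

```python
from collections import defaultdict
from statistics import mean

MIN_AVERAGE = 65    # fantasy-point average cut-off from the text
LAST_N_GAMES = 20   # only each player's most recent 20 games are kept
MIN_GAMES = 10      # players with fewer games than this are dropped


def select_game_logs(rows):
    """Return {player: [rows...]} after applying the filters described above.

    `rows` is an iterable of dicts with the hypothetical keys
    'player', 'game_order' (larger = more recent) and 'fantasy_points'.
    """
    by_player = defaultdict(list)
    for row in rows:
        by_player[row["player"]].append(row)

    selected = {}
    for player, games in by_player.items():
        # 1. Drop players averaging under 65 over the whole window.
        if mean(g["fantasy_points"] for g in games) < MIN_AVERAGE:
            continue
        # 2. Keep only the most recent 20 games.
        recent = sorted(games, key=lambda g: g["game_order"])[-LAST_N_GAMES:]
        # 3. Drop players with fewer than 10 games in that sample.
        if len(recent) < MIN_GAMES:
            continue
        selected[player] = recent
    return selected


# Made-up rows: A qualifies, B averages too little, C has too few games.
rows = (
    [{"player": "A", "game_order": i, "fantasy_points": 90} for i in range(25)]
    + [{"player": "B", "game_order": i, "fantasy_points": 50} for i in range(25)]
    + [{"player": "C", "game_order": i, "fantasy_points": 80} for i in range(6)]
)
logs = select_game_logs(rows)
print(sorted(logs), len(logs["A"]))  # ['A'] 20
```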

With our player game logs selected, we then chose the individual player stats to be considered in this analysis. I was particularly keen on including stats like rebound 50s, marks inside 50 and centre clearances: stats that point to a location on the field at a set time, as opposed to straight disposals, which can occur anywhere. In the end, the stats that I included in the grouping exercise were as follows (in alphabetical order):

  • bounces
  • centre bounce attendances
  • centre clearances
  • contested marks
  • contested possessions
  • goal assists
  • handballs
  • inside 50s
  • intercepts
  • kicks
  • marks
  • marks inside 50
  • metres gained
  • rebound 50s
  • ruck contests
  • shots at goal
  • stoppage clearances
  • tackles
  • tackles inside 50
  • uncontested possessions

Once the stats were organised and normalised, we ran them through what is called a k-means clustering algorithm, which, put simply, groups the players based on similarities across their statistical categories. The output of this type of machine learning is a cluster diagram, included below.

In this case the analysis identified 6 separate clusters of players, which I’ve described as follows:

  • Ruck - this one is fairly simple and is really no different to existing classifications. These are the players who attend ruck contests. Easy.
  • Forward - this group consists of your classic tall forward types in Ben Brown and Tom Hawkins, as well as players like Isaac Heeney and Brad Ebert who play predominantly forward, although these two are on the boundary between this category and the next.
  • Forward / Midfielder - players who play both forward and midfield, or players who push forward more as opposed to running defensively from stoppages. Players like Robbie Gray, Brandan Parfitt and Darcy Parish typify this group.
  • Inside Midfielder - this is your typical centre bounce attendee, contested ball beast or high-possession midfielder. Players of the ilk of Patrick Dangerfield, Taylor Adams and Tom Mitchell.
  • Outside Midfielder - contains your midfielders who tend to do their damage with uncontested possessions instead of on the inside. Isaac Smith, Ed Langdon, Karl Amon, etc. Note that this group also includes those running half backs who typically get up the ground and get plenty of the pill themselves, e.g. Brodie Smith, Bachar Houli, Jack Crisp, etc.
  • Attacking Defender - just as the title says, players who predominantly stay down back and get the majority of their ball in the defensive half.

Note that for this exercise we didn’t give any extra weighting for more recent games and all games were treated equally. Because of this, players who have recently had a role change will over time move from one category to the next once their older games move out of the data pool.
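To make the normalise-then-cluster step in this section more concrete, here is a small standard-library sketch of k-means. It is an illustration under stated assumptions rather than the code behind our results: the post does not say which normalisation, initialisation or library was used, so the z-score scaling and the deterministic farthest-point starting centroids below are our own choices for the example, and the three stat columns and player rows are made up. Each row is a per-player average with every game counted equally, in line with the note above.

```python
import math
from statistics import mean, pstdev


def zscore_columns(matrix):
    """Standardise each stat column to mean 0 and standard deviation 1."""
    columns = list(zip(*matrix))
    mus = [mean(c) for c in columns]
    sigmas = [pstdev(c) or 1.0 for c in columns]  # guard constant columns
    return [[(x - m) / s for x, m, s in zip(row, mus, sigmas)] for row in matrix]


def farthest_point_init(points, k):
    """Assumed initialisation: start at the first point, then repeatedly add
    the point whose nearest chosen centroid is farthest away."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k, iterations=100):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [mean(col) for col in zip(*members)]
    return labels


# Made-up averages: [ruck contests, marks inside 50, centre clearances]
players = {
    "Ruck A": [60.0, 0.5, 1.0],
    "Ruck B": [55.0, 1.0, 0.5],
    "Forward A": [0.0, 4.0, 0.2],
    "Forward B": [1.0, 3.5, 0.0],
    "Mid A": [0.0, 0.3, 3.5],
    "Mid B": [0.0, 0.5, 3.0],
}
names = list(players)
labels = kmeans(zscore_columns([players[n] for n in names]), k=3)
groups = dict(zip(names, labels))
print(groups)
```

With the real data you would pass all 20 stat columns and `k=6`. The cluster numbers that come back are arbitrary labels; names like "Ruck" or "Inside Midfielder" are assigned afterwards by looking at who landed in each group.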

Defence v Custom Positional Grouping

Once we had identified the classifications, we summarised the performance of these player groups against each opponent over that time frame. Below is the breakdown of the results in terms of plus/minus. We here at DFSAustralia prefer to use plus/minus when referring to DVP instead of total points scored against, as it’s a much better metric (I’ll leave the explanation as to why for another day). Simply put, plus/minus is the difference between how many points a player has scored and their season average (for example, if someone scores 90 and their season average is 80, then they have scored +10). A positive plus/minus means that team has allowed players to outperform, and vice versa.
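As a rough illustration of the plus/minus summary, the sketch below groups each game's plus/minus by opponent and custom position. The key names (`player`, `opponent`, `points`) and the sample numbers are hypothetical, and taking the mean of the differences is an assumption, since the aggregation is not spelled out above.

```python
from collections import defaultdict
from statistics import mean


def plus_minus_by_opponent(games, season_avg, group_of):
    """Average plus/minus conceded by each opponent to each player group."""
    diffs = defaultdict(list)
    for g in games:
        player = g["player"]
        # Plus/minus: points scored in this game minus the season average.
        diff = g["points"] - season_avg[player]
        diffs[(g["opponent"], group_of[player])].append(diff)
    return {key: mean(vals) for key, vals in diffs.items()}


# Made-up example: P1 scores 90 against an average of 80, so +10 for that game.
games = [
    {"player": "P1", "opponent": "Team X", "points": 90},
    {"player": "P2", "opponent": "Team X", "points": 100},
    {"player": "P1", "opponent": "Team Y", "points": 70},
]
season_avg = {"P1": 80, "P2": 96}
group_of = {"P1": "Inside Midfielder", "P2": "Inside Midfielder"}

dvp = plus_minus_by_opponent(games, season_avg, group_of)
print(dvp[("Team X", "Inside Midfielder")])  # 7   (mean of +10 and +4)
print(dvp[("Team Y", "Inside Midfielder")])  # -10
```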