Sunday, 24 December 2017

Half Season Premier League shot analysis



Hard to believe it is just a month since I wrote a 12 game review. The crazy Premier League schedule continues, unrelenting in the face of cramming a season in before the World Cup. As such, there is literally a window of today or tomorrow (might be busy) to write this, so once more you get a quick screed with some light jokes, snark and stats. Same deal, we're looking at seasons from 2009-10 onwards, because that's where you can publicly access Opta shot data. So every ranking is out of 180, which is just about large enough to make the outliers interesting.

Also huge criticism of the Premier League is deserved for scheduling the week 3 games to be the same as the week 19 games. This means that Liverpool and Arsenal have played twice already and we've not even started the second half of the season. Any analyst worth their salt can join me in shaking their head at this thoughtless scheduling. What a jerk move.

A habitual aside on expected goals. I think generally people experienced in the field are unconvinced about the reliability of any expected goals model before somewhere between 2011-2013 (depending on taste). As ever, it's easy to say "you're talking about shots, but where were they taken from" and feel smug--which is the oldest criticism against pure shot analysis there is--but you'd be missing some of the power of what shot analysis does provide. We can go back nine seasons for the benefit of this analysis, and well, if you want more detail, you're welcome to delve in yourselves! Volume is still the primary indicator, so let's see what we can find out.

Man City

Not to labour a point but everything starts with this ridiculous City side. They were 11-1-0 at the 12 game mark and now they are 18-1-0. This significantly separates them from every team on the list, apart from perhaps last year's Chelsea side who were 16-1-2 at this point and similarly riding high off a ludicrous win streak. They didn't have the bravura attack of this City side so didn't get the same superlatives. Defence was their bedrock, and sure enough they converted from here. City will convert too and they have an attack for the ages to meld to their defence. They could actually be genuinely quite bad for the rest of the season and still sail in to the title, but there's little sign that that will happen.

So how are City good?
  • Tied =1st/180 for shots on target (7.7/game)
  • 1st in shots against (6.1/game, margin is +1.5, Liverpool this season is second)
  • 1st in shots on target against (2.0 per game, +0.5)
  • 1st in total shot ratio: 74.6% (about 5% ahead of any other team, again Liverpool 2017-18 is second)
  • 1st in shot on target ratio: 79.3% (over 6% ahead of any other team)
All this feeds into this lot:
  • 1st/180 in goals scored after 19 games (60)
  • 1st in goals against (12)
  • 1st in shots on target per shot (42.8%, the kinda metric that could cool)
  • 1st in goals per shot (17.6%, 1.5% ahead of anyone else, and again usually the kind of metric you'd expect to regress/revert/cool)
So the perfect storm of rock solid structural metrics, and a smidgen of red hot conversion. It's too early to get the bunting out and talk "Invincibles" (not that it's stopped anyone...) but City have put together probably the best half season of league football this decade, century, ever? I don't know, but it's been electric stuff and for the benefit of this analysis, City is best. Rotation (as I wrote here) and easing off to pursue other goals may be the two main factors that affect any quest for perfection but they really will want to stay in the Champions League or it could be cigars and dreariness by mid-April, y'know a bit "Bayern under Pep". The perils of being too good, huh?
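For anyone wanting to delve in themselves, the ratio metrics in the first list are just a team's share of the events in its games. A minimal sketch, assuming you have for and against numbers to hand; the function name is illustrative and the inputs are the rounded City shots on target figures quoted above:

```python
def share(for_value: float, against_value: float) -> float:
    """Return a team's share of a two-sided total,
    e.g. shots for / (shots for + shots against)."""
    total = for_value + against_value
    if total == 0:
        raise ValueError("no events recorded for either side")
    return for_value / total


# City's shots on target per game, for and against, as quoted in the list above.
city_sot_ratio = share(7.7, 2.0)
print(f"Shot on target ratio: {city_sot_ratio:.1%}")
```

Fed the rounded per-game figures, this lands within a rounding error of the 79.3% quoted above. The same function gives total shot ratio if you pass in all shots rather than shots on target.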

Man Utd

The surface outlook of Man Utd's season is pretty good. They're second in the league, well qualified for the Champions League knockouts and only ever lose at Old Trafford to Man City. However, underneath the glossy exterior is a whole world of potential trouble, at least there will be if they don't fundamentally improve. For an extended take, see my report at The Ringer, but the bottom line is this is a team that is taking 52% of the shots in its games and... wait for it... that's fewer at this point than the Louis van Gaal years (55 and 54% at this stage). I and many others gave them shit for not being good enough then, so why on earth would I stop now?

What's the matter? Last season United were taking 64% of the shots in games, but had been stymied by getting stuck drawing a ton of games at home and battering on the door to no avail. Hence the wealth of shots. This season, they've taken the lead a lot and have played more pragmatically when leading, and this has moved the needle downwards.

So, under Mourinho, they have played hard and turned it on when needed and throttled right down when they didn't? That's fine, right? Well, yes and no. If we scrappily take a mid-point between what we're seeing when things go Utd's way (we will get to this, they really have) and when they don't (this time last season), we find they spec out as a fringe top four side. 

There is a tension between two things at the back:

1. David De Gea is great
2. No human alive can long term keep the ball out of the net at the rate he has this season

He's saved over 83% of shots on target he's faced this season which is around 3% ahead of any other half prior to this year, so 1st of 180 (Burnley are on around 82% themselves, it's a miracle!) Also the full season record is ~79% for a couple of Ferguson Utd teams (2009-10 and 2011-12) so on balance, you'd expect that to be the highest they could get to (Burnley too!) as the season progresses. What does that mean in real terms? It's far more likely that Utd (and Burnley!) keep the ball out of the net at more average rates as the season goes on. These are nice stats to ride but unlikely to be a portent towards a similarly miraculous future.

At the same time Utd's attack is so hot it burns; they're scoring over 44% of their shots on target, behind only er... Watford (?) from this season, so 2nd of 180.  (Our old friend, and much maligned, but still a decent ready reckoner, PDO has them at 128 for this half season and the next best out of 180 teams is 122. Utd are 14% higher than any other of 180 teams here at this point, which is nuts). It's flaming fire at both ends from Utd without the structure you want from a challenging club to back it up. 52% of the shots remember--"Name other half a season 52% shots teams"-- okay: Palace 17-18? Moyes at Utd? Pulis at Stoke 10-11?
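For reference, PDO as used here is the two conversion-type rates added together: the share of a team's shots on target that go in, plus the share of opposition shots on target that are saved, scaled so that 100 is neutral. A rough sketch, assuming that shots on target definition (other versions use all shots, or a 1000 scale) and using the rounded-down Utd figures from above; the function name is illustrative:

```python
def pdo(scoring_rate: float, save_rate: float) -> float:
    """PDO on a 100 scale: scoring % plus save %, both measured on shots on target."""
    for rate in (scoring_rate, save_rate):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rates must be fractions between 0 and 1")
    return (scoring_rate + save_rate) * 100


# "Over 44%" scored and "over 83%" saved, so this is a floor, not the exact figure.
utd_floor = pdo(0.44, 0.83)
print(f"Utd PDO is at least {utd_floor:.0f}")
```

The floor from the rounded inputs sits just under the 128 quoted, which is what you'd expect given both rates are described as "over" those values.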

"Mourinho is pragmatic" is one take but I'm not so sure. What is their identity? What kind of football do Man Utd play these days? It still feels like they are firmly in transition with individual brilliance bailing them out, but are they an effective team? Time will tell, but I'm betting on a second half of the season drop off.

Don't believe me? Check out this chart from Objective Footy, which represents the two highly fluctuating and non-predictive metrics I've just described:




Yep, that's Man Utd floating happily in the top right hand corner, and the white space between them and the bunch signifies the journey they are about to take.

Also note Burnley frowns.
And Watford shrugs.

There's a lot of talk about teams--mainly Burnley-- doing special things that move their needle, and it's important to my mind to recall that while that is almost certainly true (see Pulis' Stoke etc), when everything also goes a team's way, it can significantly distort that effect, and hoodwink people into thinking magic is at play. There will be teams that are deploying tactics every bit as shrewd as Burnley, but if they haven't also picked up the breaks, they may look utterly normal to surface analysis. As ever, keep digging, but don't get fooled along the way by variance, and recall too, Burnley's defence might be doing cool, effective things, but their attack is awful (lowest xG/shot of last 5 seasons).

What else?

"Tottenham aren't as good this year"

...is not a take I can subscribe to. They look like the same damned Tottenham side at this point they have for the last two decent Pochettino years. Very good shots team, this year they've dropped to league average on saves, which might mean they're a win or so out of last season's returns at this point. Big picture: they are the same team, doing similar things in a similar way (for all the tactical tweaks), they just haven't found the positive skews to steer them higher--yet (2015-16 had weird high accuracy, 2016-17 had ballooning conversions). Fret not, your Tottenham team is as it ever was, it's just not quite the shiny new toy/novelty it once was.

"Swansea are awful"

That they are. Still the worst attack on record for shots (8.5 per game, 180th/180) and shots on target (2.0 per game, 180th/180 by a huge margin, 0.6 per game). There was a great tweet the other day:



What else could be done? Maybe he lasted so long cos he looked nice; an urbane gent who wore a suit well. That's what managers should look like, right? Regardless, their 27.5% of shots on target in games at this stage is also 180th/180 (bad feeds bad here you see) and the rate of their already league worst shots that land on target is also 180th/180 (23.5%). Also, as if that wasn't enough, they have a weird skew where they can't land shots on target but their opponents can. West Ham last season were super weird by this side metric (they ranked 180th/180 at this point) and just had to improve, which they did. Swansea are at 179th/180 so presumably can hope that there is some reversion available here, for all that that is like finding one cup of water in a desert.

It's an abomination, and huge improvement is needed to drag them from the mire. From the outside, it doesn't look like they have the quality in personnel to do it and their managerial appointment will be fascinating. And maybe they should listen to Altman a bit more (or maybe not? You decide!)

"Palace are bad"

Aye, they never were and this was one of the more solid analytics/stat takes even from early on. And now they spec out as quite good; remember their shot ratio is higher than Man Utd. Roy Hodgson has little to worry him.

Liverpool

Same as always, electric attack: 69-70% of the shots in games this season and last (seasons rank 2nd and 3rd/180 behind this season's City team), converting at a decent clip: 12-13%. Very few shots at the other end, 7.6 a game this season, 7.7 a game last season (guess what! 2nd and 3rd/180 behind City 2017-18). And by way of contrast a shit-ton of those shots landing in the net, 16.0%, and 179th/180. This hasn't moved since the 12 game mark and Liverpool remain somewhat enigmatic. They remain like a tin of Roses, most of what they are is tasty and admirable, but who continues to sanction the coffee creams?

Arsenal

Shot rates are better than any recent season, save and scoring rates are down. It could well be that these factors are interrelating, but I'd be tempted to be positive about the 66%+ shot metrics and hope the 10% conversion gets a boost later. It's a balancing act, and perhaps Arsenal are just the same damned Arsenal team they always are (funny how often that seems the way when you look at seasons of data) and it'll come out in the wash. They are no worse though, so that's something, and they could be better! (Rejoice!)

Bournemouth

Are worse. Year to year, shots are down, goals are down, more shots against. They've gone from 50% of the shots to 45% to 40% in consecutive seasons. Jermain Defoe is out for 8-10 weeks, so will be interesting to see which direction they trend now... *whistles*

Chelsea

Look remarkably similar to last season in all shot related stuff, and pretty good (69% of shots on target both years is excellent), just with a clip of about 0.5 goals per game lost off their attack. That's likely variance talking, and otherwise they remain solid if unspectacular. It's just that blend powered a title last time round, and it really won't this.

Everton

Are weird, because all their metrics are horrible, and likely won't move very far with the Big Sam brigade in town. But the Sam Magic Wand will get results from horror metrics, while the Koeman Kalamity got none.  All their issues remain fundamentally terrifying with fortunes squandered and an odd unbalanced and bloated squad. 12th place forever is the new 7th place forever, maybe.

Leicester

Are the same as Leicester 2014-15 and Leicester 2016-17 for the most part. I appear to have mislaid other seasons for them, so we will skim over that. Forty-something percent of the shots and an occasional positive bounce. It's all very Leicester.

Southampton

Worrying drop off in metrics here. Last season they were looking like a top six contender in waiting taking 63% of the shots on target at this stage, and it wasn't the first time they had done that--57% to 63% the previous three seasons--but in 2017-18 they are taking 38% of the shots on target. That's a huge drop. It's as if their for and against metrics of the last four seasons have flipped and it's becoming harder to cite them as the blueprint change-resistant unit they normally are. Feels like they have the talent to do better but Mauricio Pellegrino's tenure is more Adkins than Pochettino right now.

Watford

Are super weird. 1st/180 converting 47.4% of their shots on target, of which they take very few (3.0 per game, bad 169th/180) and they are saving very little of those that come the other way (60.4%, 167th/180). So nothing is being stopped at either end. If any team is a coin flip for top half AND relegation, it's them. They've lost 8 of their last 11 after all.

Apologies if your team missed out (Stoke, worse than usual but I wrote about them here, West Brom hard to analyse with the Pulis to Pardew handoff doing nothing-- no wins since August! Again I wrote about them here. West Ham's attack has vanished too, but Moyesian methods have so far against expectations done something).

Merry Christmas all!



Saturday, 25 November 2017

12 Game Premier League Shot Analysis

Just a quick article here, hence the old blog, I've written something similar the last three years, but am short of time, so can't do it full justice with charts and what have you. Hopefully it will provide some interesting angles you maybe hadn't considered.

We're using traditional shot metrics for a few reasons:

1. I have them stored at the 12 game point back to 2009/10, so nine seasons, which is further than semi-publicly available reliable expected goals goes back
2. I like historical comparisons
3. I like the fact that you can examine multiple aspects. A lot of xG analysis stops at over/under, and while you're not limited to that, I've got a list of about 20 separate metrics related to shots here, and it'd be rude not to share.

An aside on expected goals, which is the most dreary topic around right now, with everyone from professional pub bores to anyone with a stats orientated twitter account wading in. I think if you asked people what the most important component of a shot is, they'd reply "location" and that's fine, but pare it back and the key aspect forever more is whether or not the shot took place in the first place. Expected goals: actually built from shots, don't forget that!

Anyway...

Man City

No surprises here, after 12 games City are crushing a ton of shot metrics. Over the 9 seasons in the sample, shot rates overall have declined, and the very good teams seem to top out around 17 or 18 shots per game; go back to 2009/10 and it was more. Indeed, of the top fifteen teams for shot volume after 12 games, only Liverpool (both last season and this) come from either of the last two seasons. That makes sense, we know Klopp's team are heavy shooters but also prone to launching a fair few from range.

City's strength is both in attack and defence.
Let's just list a load of what they are scoring well in:

  • Goals: they've scored 85% of the goals in their games, ranks tied for first at this stage with Chelsea 10/11
  • Goals for: 3.3/game, 2nd behind City 11/12--we'll come back to them
  • Goals against: 0.6/game, tied 4th
  • Shots on target: 7.3/game, tied 4th
  • Shots against: 5.8/game, 1.8 ahead of the next best team in the sample, crazy good
  • Shots on target against: 1.8/game, 1st in sample by 0.3/game
  • Shot ratio: 75%, 4% ahead of the next best
  • Shot on target ratio: 81%, ~4% clear
Then we have some less structural metrics related to accuracy and conversions. You'd expect these to perhaps cool or indeed be more prone to vary. Even a team as good as City is more likely to ultimately fall within long term parameters than continue to skew positively all over, though admittedly 180 team 12 game samples leave plenty of room for new outliers.

  • Shots on target as a percentage of total shots: 43%, 1st in whole sample, highest whole season rate is ~40%
  • The difference between that rate (43%) and their opposition (30%) ranks 1st too, at 13%, full season maximum is currently 9%
  • Shot conversion: 19% ranks 1st, full season best is about 15.5%
  • Difference between their shot conversion and the opposition is 9%, best full season is 7.3%
  • Goals per shot on target, 45%, ranks 3rd and a couple of % ahead of City's 13/14 season
So: we have on our hands a great team, that is also running hot at the extremities. This should be no surprise after watching them, but also a bit of wear and tear plus simple reversion would suggest that aspects of their game should cool a little. If they don't then we have a team for the ages. But also it's worth comparing them to City 11/12 who also started 11-1-0 and scored a ton of goals. Nobody thought that team was going to be one for the ages. They ended 28-5-5 and snuck the league on the last day. This small sample of 12 games implies City are great, but it is still a small sample--for now. 
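The "running hot" point can be laid out as a simple comparison between the 12 game figure and the best full-season figure for each metric. A quick sketch using only the numbers quoted in the second list above (approximate where the text says "~" or "about"); the names are illustrative:

```python
# (metric, City after 12 games, best full-season figure), all in percent,
# taken from the list above. Approximate values are as quoted ("~40%", "about 15.5%").
CITY_VS_FULL_SEASON_BEST = [
    ("Shots on target per shot", 43.0, 40.0),
    ("Accuracy gap over opposition", 13.0, 9.0),
    ("Shot conversion", 19.0, 15.5),
    ("Conversion gap over opposition", 9.0, 7.3),
]


def excess_over_best(rows):
    """Return (metric, excess) for every metric running above its full-season best."""
    return [(name, current - best) for name, current, best in rows if current > best]


for name, excess in excess_over_best(CITY_VS_FULL_SEASON_BEST):
    print(f"{name}: {excess:.1f} points above any full season")
```

All four come out above anything sustained over a full season in the sample, which is the case for expecting some cooling. Goals per shot on target is left out because no full-season best is quoted for it in the list.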

Burnley

The magic of Burnley has enchanted many this season apart from specifically me and the Everton board. They are defying relatively weak structural metrics (shots, expected goals) to ride high in the league. We know they commit to defence, but it's also true that they are riding their luck somewhat. Similar happened last season, then they eventually came back to earth, but as is likely to happen this year, they already had plenty of points in the bag and were able to limp home, with few concerns.

They have improved this season, but it's important to note they specced out as a terrible team last year. This season the specific measure that cannot be maintained--indeed will not be--unless we have not one but TWO "teams for the ages" in our league right now, is their opposition shot conversion. It currently is running at 4.4%, so around 1 in every 25 shots the opposition takes is scoring. This is obviously thanks to the "11 men on the goal line" strategy they have deployed and bravo for solving football.

This rate ranks 2nd of 180 behind Chelsea 2010-11 (who had a great start, that proved er... unsustainable, because, guess what! their extremely positive variance came back down to earth). It also means that Burnley are allowing goals at a lower rate than Andre Villas Boas' ill fated Spurs 13-14 team were scoring them. Yep, that's right, Burnley are facing Roberto Soldado every week and it's working out just great. The key point is that the season long lowest rate is Villa 09/10 and that ended up at 6.4%, enough to power them to 6th place--so Burnley could well end up in an extremely good place off the back of this, it's just there's plenty of juice (a whole two percentage points) in there that won't sustain. Very similar happened last year to them, but not to these extremes.
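To put those two percentage points into goals: goals conceded are just shots faced multiplied by the opposition conversion rate. A minimal sketch; the 4.4% and 6.4% rates are from the text, but the shots-faced figure is a made-up round number for illustration, not Burnley's actual total:

```python
def goals_conceded(shots_faced: float, opposition_conversion: float) -> float:
    """Goals conceded implied by a number of shots faced and a conversion rate."""
    return shots_faced * opposition_conversion


HYPOTHETICAL_SHOTS_FACED = 500  # assumption for illustration only, not a real Burnley total

current = goals_conceded(HYPOTHETICAL_SHOTS_FACED, 0.044)     # rate after 12 games
record_low = goals_conceded(HYPOTHETICAL_SHOTS_FACED, 0.064)  # Villa 09/10 full season
print(f"Extra goals per {HYPOTHETICAL_SHOTS_FACED} shots: {record_low - current:.1f}")
```

Swap in a real shots-against total to see what even a drift back to the best full-season rate on record would cost in goals.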

Man Utd

I'm quite down on Man Utd for similar reasons. They are scoring 16% (4th/180) of their shots and allowing 5% (3rd/180). So at both ends of the pitch, they are enjoying very positive skews in their conversions, both of which are outside full season rates. Logically, you'd expect one or both of these numbers to move as the season goes on. They are a good team sure, but a 14 to 10 shots for and against team doesn't really pass muster when we're looking for genuine dominance. They're doing fine, but I'm not gonna put any bunting out just yet. I think they are still quite flawed, and potentially the Mourinho strategic straitjacket hasn't helped either.

Watford

At least part of Watford's stellar start is down to the 48% of shots on target that are becoming goals (recall full season max is 43%). That ranks 1st of 180 here. They aren't bad, and Marco Silva has improved them, but as was mentioned last time I spoke well of them, they have started fast in previous years only to tail off. This time we have a number trigger too that could well explain some of their inevitable reversion.

Swansea

...are getting 1.9 shots on target per game. This is 180th/180. Terrible.
Honestly, so bad.

Liverpool

7.6 shots per game against ranks 2nd/180, 43% of them landing on target ranks 180th/180, 19% of them going in ranks 179th/180. That's the Klopp conundrum right there. His teams stop the shots but the ones that arrive could be scored with a no look finish.
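The three numbers chain together, which makes the conundrum easy to see in per-game terms. A rough sketch using the figures above, assuming every goal conceded comes from a shot on target (own goals and the like are ignored); the function name is illustrative:

```python
def defensive_chain(shots_against: float, on_target_rate: float, conversion_rate: float):
    """Break shots against per game into shots on target, goals, and implied save rate.

    on_target_rate and conversion_rate are both fractions of total shots against.
    """
    if on_target_rate <= 0:
        raise ValueError("on_target_rate must be positive")
    sot_against = shots_against * on_target_rate
    goals_against = shots_against * conversion_rate
    # Share of shots on target that did not go in.
    save_rate = 1 - conversion_rate / on_target_rate
    return sot_against, goals_against, save_rate


sot, goals, saves = defensive_chain(7.6, 0.43, 0.19)
print(f"{sot:.1f} on target, {goals:.1f} goals against per game, {saves:.0%} saved")
```

So very few shots get through, but an unusually large share of the ones that do end up in the net.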

Southampton

21% of their shots are landing on target, not good (178th/180). Full season minimum is 25% though, so should improve.

Palace

4% of their shots are going in, yes that's AVB levels. They're scoring 20% of the goals, which is a full 5% behind anyone else in the sample. It should get better!


There's probably more, but it's Saturday morning, and the 13th games of the season are about to be played (thanks West Ham v Leicester for dating this already). Hope you enjoyed this, there's an insane, injury inducing amount of football coming up between now and the new year. A lot will change and some of these metrics will be useful indicators when teams suddenly seem better or worse than before.

Salut!

All data via Opta











Wednesday, 1 March 2017

A few thoughts on moving beyond xG and stats in the media

I contributed to an article on Ultimo Uomo recently about how analytics can move beyond xG, and about stats use in the media.

http://www.ultimouomo.com/a-che-punto-sono-le-statistiche-nel-calcio/

That article was in Italian, and had plenty more contributors, so check it out, but here are my thoughts pre-translation, for anyone interested.



How can analytics evolve beyond xG?

Expected goals provided a strange line in the sand for football analytics primarily because the data required to build a reputable expected goals model fell outside the published statistics from public sites and the technical aptitude to build such a model was specific and not trivial.
So for a long time, people built xG models, or looked on from outside wondering about xG models. This had the effect of slightly focusing the evolution of football analytics around xG related topics—people built ever more complex models, and eventually graduated towards models that valued the movement of the ball wherever it was on the pitch, but the basic concept was still an expected goal value.

More recently studies of passing have become more prevalent, with a desire to identify players and teams that are most efficient or successful at moving the ball. Still though it is difficult to separate descriptions of style from actual beneficial results that tally with winning football matches--a long term core issue with any metric development.

There is a lot of hope for tracking data, that it might add in extra factors that aid precision but nothing is public there yet and it's possible that benefits from that will only be marginal, a charge that could also be laid at the addition of running and sprinting data.
Defensive analysis remains hard to work with at a player level and it must be hoped that advances in quality of data can shed light here.



Stats in the media
The level of stats use in the media has seen a sharp rise in recent years, and with fantasy football, Football Manager and data sites such as Squawka and Whoscored, the acceptance of numerical descriptions of players is firmly entrenched in younger fans' minds. More often we see talk of shot or shot creation numbers for teams and players which add a necessary second layer for analysis beyond goals and assists (xG may still be too esoteric for total mainstream usage) and this is positive.
Less positive is the use of other statistics that do not represent what they are being used for. Defensive stats like tackles and interceptions are often presented in a more = better fashion, when they are little more than descriptive, they do not necessarily reflect quality of play. Goalkeepers cannot be graded accurately by volume of saves and simple lists based on one or two stats do not do a good job of grading players outside of attacking metrics.

So we have more presentation of stats, more description of stats, but a long way to go before actual nuanced and thoughtful analysis of stats is anything like normal. And there is certainly a knowledge gap here. It takes time and understanding to read genuine meaning into football statistics yet there is a requirement for media companies to incorporate information into their presentations, and maybe only in certain cases is this backed up by genuine understanding. This same problem presents itself inside clubs, where performance analysts are bringing statistics into their work without sufficient grounding in what matters and what does not. Until both the media and football understand that they require knowledgeable people to direct their usage of stats, offerings will fall short of their potential and we are in danger of finding stat use marginalised (in clubs) or used as trivia but nothing more (in media).


We see more visuals in the media now, but again I would caution against their usage without understanding. An average position map, average pass location or heat map, is rarely capable of giving the full truth yet remains popular and often misused. However shot location maps, with or without expected goal values, or specific pass or chance creation maps can reveal significant truths about a game, a team or a player. To say “Look this player always shoots from 30 yards and never scores” and visualise that is a simple method of showing and proving a point. The key points should always be that any number used or visualisation shown adds to the presentation, reveals truth otherwise concealed and is quick and accessible to understand. There is certainly work to do on all sides here and only teamwork between visual experts/storytellers and those who understand the data can make this work best.