PDO doesn't measure luck.
When I tell people this, for some reason they don't believe me. Then, when I explain it to them in any number of different ways, they still assume I'm wrong somehow or am missing something, because it's such widely accepted conventional wisdom. But I'm not wrong. Say it with me:
PDO doesn't measure luck.
Now I fear what you're hearing me say is "you can't measure luck", but that's not true. There is luck in hockey and it's actually pretty easy to measure. You can get a pretty basic measure of luck without a lot of overly complex math. Just don't use PDO to do it.
PDO doesn't measure luck. The fact that PDO is so widely used and accepted, and that no one questioned the faulty core assumptions when it was created, is a terrible black eye on the advanced stats community. Team PDO actually does measure something, it's just not luck (it's a metric of relative skill under a certain controlled circumstance). Individual player PDO is pointless and doesn't measure anything of usable value. PDO needs to be removed from the list of hockey metrics because it's based on flawed, faulty assumptions and someone should have caught that by now.
When you look at a list of teams ranked by PDO you're not looking at a list of teams ranked from luckiest to unluckiest. When you look at a list of players ranked by PDO you're looking at a set of players listed arbitrarily that has nothing to do with luck.
Here's how Behind The Net.ca defines PDO:
PDO is the sum of "On-Ice Shooting Percentage" and "On-Ice Save Percentage" while a player was on the ice. It regresses very heavily to the mean in the long-run: a team or player well above 100 has generally played in good luck and should expect to drop going forward and vice-versa.
This is wrong.
Conventional Wisdom: PDO is a measure of luck with a baseline of 100. According to this conventional wisdom a team with a PDO greater than 100 is the beneficiary of luck and can expect their PDO to trend downward to 100 as more games are played and their data set increases and vice versa.
Actual: PDO doesn't measure luck and a PDO not equal to 100 does not in any way indicate that a team is lucky/unlucky.
Let's look at this from a number of different approaches.
Approach 1 - Simply Think About It
This is the core assumption of PDO: the combined shooting percentage and save percentage of every team should be the same at 100 and any variation is due to luck. Just think about this for a second, does this make any logical sense? Does it make any logical sense that Tuukka Rask and your grandfather should have the same save percentage except for luck? Does it make any logical sense that Steven Stamkos and your grandfather should have the same shooting percentage except for luck? Now if you combine the shooting percentage of a team and the save percentage of a team does it make any logical sense that the combined ability of every team to score a goal or stop one is exactly equal except for luck? Does that really make any logical sense to you?
PDO congregates around 100 for a very simple reason: every shot on goal has a binary outcome. It's either a save, which adds to the goalie's save percentage, or it's a goal, which adds to the shooter's shooting percentage. So if you took every shot on goal in the league into your dataset, the sum of save percentage and shooting percentage for the whole league is 100.
Just because the total of every shot taken in the entire league should produce a PDO of 100 doesn't mean that the same holds true for any subset of that data, which is why teams have different PDOs. If you were isolating this calculation to one particular team why would you expect them to be equal to all other teams, except for luck? Does this make any sense? Just think about this logically for a second, why would you believe in a metric whose core assumption is that all teams' combined ability to score or stop a goal is equal, except for luck?
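If you'd rather see this as arithmetic than take my word for it, here's a minimal sketch. The three-team league and every shot and goal total in it are made up purely for illustration (they are not real NHL numbers), and `pdo` is just a helper name I'm using here. The only constraint is the one real leagues obey: every shot for is someone else's shot against.

```python
# Hypothetical three-team league; all totals below are invented for illustration.
# sf/gf = shots and goals for, sa/ga = shots and goals against.
teams = {
    "A": {"sf": 1000, "gf": 90, "sa": 900, "ga": 65},
    "B": {"sf": 900, "gf": 70, "sa": 1000, "ga": 85},
    "C": {"sf": 950, "gf": 75, "sa": 950, "ga": 85},
}


def pdo(sf, gf, sa, ga):
    """Shooting % plus save %, scaled so a league-wide total comes out at 100."""
    shooting_pct = gf / sf
    save_pct = 1 - ga / sa
    return 100 * (shooting_pct + save_pct)


# League totals: shots for equal shots against and goals for equal goals
# against, because every shot is counted once on each side.
league = {key: sum(team[key] for team in teams.values()) for key in ("sf", "gf", "sa", "ga")}
league_pdo = pdo(**league)

team_pdos = {name: pdo(**totals) for name, totals in teams.items()}

print("league:", round(league_pdo, 2))
for name, value in team_pdos.items():
    print(name, round(value, 2))
```

The league-wide figure lands on 100 because of bookkeeping, not because of anything about luck, while the individual teams sit above or below it depending on the totals you give them.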
Why do you think that PDO measures luck? If your answer is anything other than because everyone says so then let's talk but otherwise at least accept the premise that you should probably investigate this a little deeper before just accepting something with assumptions so simple and obviously wrong.
Imagine a team with the worst goaltending in the league and skaters who are bad at converting their shots into goals. This team is going to have a low save percentage and a low shooting percentage. When you add them together those numbers are going to be well below 100. When their PDO is well below 100 it's not because they're unlucky it's because they're bad at hockey. A team with a PDO of 98 isn't unlucky, it just means that they're bad at the combined measure of stopping and scoring goals measured in the isolated situation where a puck is directed at net compared to the league average of 100.
This is really simple and basic and doesn't require a particularly thorough investigation (we'll still do that anyway). It's perfectly logical that if you're really, really bad at stopping and scoring goals you should have a low PDO and if you're very, very skilled in shooting and stopping pucks you should have a high PDO for reasons that have nothing to do with luck and everything to do with very basic arithmetic. This should already be obvious because PDO doesn't measure luck.
It's not that you can't measure luck. You can, and it's actually pretty easy. It's just that PDO doesn't measure luck.
Approach 2 - Revisit Its Core Assumptions
Where does PDO come from? PDO is named after a commenter on an Edmonton Oilers blog who coined it. Here is what he said that led to the assumption that the baseline for teams/players should be 100:
Lets pretend there was a stat called "blind luck." Said stat was simply adding SH% and SV% together. I know there's a way to check what this number should generally be, but I hate math so lets just say 100% for shits and giggles.
That's where this conventional wisdom comes from, I'm not making this up. (Source) The amount of thought that went into creating PDO as a way to measure luck really is a) that poorly thought out and b) stress tested at the "shits and giggles" level.
Now if you think I must be wrong because PDO is so widely accepted please reread the above and note that the creator a) doesn't bother to check to see that his baseline of 100 is the proper baseline to use and b) hates math. If you still think I have to be wrong somehow just remember you're backing the guy who doesn't check his work and hates math, and I'm not.
There actually is a way to check this, the math isn't particularly complicated and the fact that no one has bothered to check this and wholly and completely discredit it is really embarrassing. We routinely mock the people who don't take advanced stats seriously yet some of our measurements and assumptions are this poor.
This is how you know that hockey "advanced stats" are still really primitive: when you read PDO's core assumption as it was originally written, it's so laughably simple and obviously wrong, and yet it's still somehow widely accepted and regularly used. This does not reflect well on us.
In an effort to beat this horse completely to death so that it dies, stays dead and never comes back to life please reread the above. PDO has been around since at least 2008. It's incredibly embarrassing that in 6 years no one has thoroughly discredited it and it's so widely used by otherwise smart and credible people.
Approach 3 - A Thought Exercise
Let's try two thought exercises, one for Team PDO and one for individual PDO and you can see that they don't measure luck.
If you still don't believe me let's try our first thought example on Team PDO. The faulty conventional wisdom is that if your PDO is over 100 you're lucky and if your PDO is under 100 you're unlucky. Luck in hockey is a really simple concept, that you are going through a period of producing output that is substantially different than what you would expect based on true ability. The core assumption is that a team producing exactly at its true ability should have a PDO of 100. You can discredit PDO as a measure of luck if you can demonstrate an example of a team producing exactly at its true ability where a Team PDO is not equal to 100.
Here is the exercise: imagine a scenario where hockey is reduced to a team of one skater and one goalie facing 30 teams with the same makeup of one skater and one goalie, and imagine for a second that this hypothetical team of one skater and one goalie was able to play their entire 20 year careers together. When you take the dataset of their entire careers you are literally getting a dataset that matches their 20 year career averages. Whatever those career averages are is a measure of their actual true ability, and the value of luck is as small as you could ever possibly make it. (With a small dataset you can have luck or a "hot streak", but as you add more data to shooting/save % that noise goes away and eventually you will start to approximate long-term career averages. In this example we're not approximating career averages, we're controlling to the point of using actual 20 year career averages so that we can say there is no luck involved at all.) Now let's say that skater's career shooting percentage was 8.5% and that goalie's career save percentage was .925, for a 20 year team PDO of 101. You didn't just measure how lucky they were, you measured how much better their combined skill was relative to a league average of 100 when it came to shooting or stopping pucks. This isn't a measure of luck.
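Here's a rough simulation of that thought exercise, as a sketch rather than anything rigorous. The 8.5% and .925 come from the example above; everything else is my assumption: each shot is treated as an independent coin flip at the player's true rate, the team takes and faces the same number of shots, and `simulated_pdo` is a name I made up for this illustration.

```python
import random

TRUE_SH = 0.085  # skater's true shooting percentage (from the example above)
TRUE_SV = 0.925  # goalie's true save percentage (from the example above)


def simulated_pdo(shots, rng):
    """Observed PDO over `shots` shots for and `shots` shots against.

    Assumption: every shot is an independent trial at the true rate.
    """
    goals_for = sum(rng.random() < TRUE_SH for _ in range(shots))
    saves = sum(rng.random() < TRUE_SV for _ in range(shots))
    return 100 * (goals_for / shots + saves / shots)


rng = random.Random(2014)  # fixed seed so the run is repeatable
true_pdo = 100 * (TRUE_SH + TRUE_SV)

print("true-ability PDO:", round(true_pdo, 2))
for shots in (100, 1_000, 100_000):
    print(shots, "shots:", round(simulated_pdo(shots, rng), 2))
```

Small samples bounce around, which is the luck part. As the sample grows the number settles toward the true-ability PDO of 101, not toward 100.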
Now that's an indictment of team PDO which is a metric that has some limited value so long as you understand its limitations and are using it to draw logical conclusions understanding that you are measuring ability and not luck. Let's address individual player PDO which is a pointless metric that I often use in conjunction with other really pointless metrics such as PDM (shooting % + avg price of milk), PDQ (shooting % + NASDAQ) and PDX (shooting % + current surface temperature of the player's favorite exoplanet).
This exercise is really simple. Individual player PDO is pointless because it combines two numbers that are in no way related and combining them tells you nothing of value. PDO is the combination of two numbers, in this thought example we're going to hold one of those numbers constant and see what happens. If individual PDO tells you anything of value you should be able to draw some sort of useful conclusion about Player A and Player B if I told you Player A had a PDO of 102 and Player B had a PDO of 98. But as you'll see individual player PDO doesn't tell you anything of value because the numbers are essentially arbitrary.
Here's a really simple thought exercise to demonstrate how worthless individual player PDO is. Let's say over the course of a season a player has a shooting percentage of 8% (about league average) and his linemates throughout the year have common average shooting percentages and the player has essentially the same output that he's had for each of the last few seasons. The shooting percent part of PDO in this example is average and consistent with his career numbers so there's nothing special here, just an average number based on average output matching the expected value of an average player.
Now let's say he's on a team with terrible goaltending that has a .900 SV% and after all the calculations his individual PDO is about 98. Now imagine the exact same player with the same shooting percentages, output and teammate ability who's on a team with great goaltending and his individual PDO is about 102. What did you learn by comparing a player with a PDO of 98 to a player with a PDO of 102? You've learned nothing. We've taken two identical players and shown that they can have a very high individual player PDO or a very low individual player PDO despite having the exact same completely average play.
You've taken a number the player has some ability to influence (shooting percentage) and combined it with a number that the player has virtually no ability to influence (his goalie's save percentage) and tried to glean some sort of knowledge. But by design (the design of PDO is to combine these two numbers) you aren't separating the part of the metric that the player can control (shooting %) from the part that he can't (save %), so you have no idea simply by looking at the number if it's telling you anything about the player at all. Not only does it not measure luck, it doesn't really measure anything.
|Metric||Player A||Player B|
|Current shooting %||8%||8%|
|Current teammates shooting %||8%||8%|
|Career shooting %||8%||8%|
|Last season's shooting %||8%||8%|
|Goalie's save %||0.900||0.940|
If individual player PDO measured anything worth knowing in any way you should be able to look at the numbers of two players and draw some sort of conclusion based on the difference in those numbers. But you can't, it measures nothing.
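The arithmetic behind the Player A / Player B comparison is about as short as code gets. This is only a sketch: `player_pdo` is a name I'm introducing for illustration, and it takes both inputs already expressed in percent.

```python
def player_pdo(on_ice_sh_pct, on_ice_sv_pct):
    """On-ice shooting % plus on-ice save %, both expressed in percent."""
    return on_ice_sh_pct + on_ice_sv_pct


# Same skater, same 8% on-ice shooting; only the goalie behind him changes.
player_a = player_pdo(8.0, 90.0)  # .900 goaltending
player_b = player_pdo(8.0, 94.0)  # .940 goaltending

print(player_a, player_b, player_b - player_a)
```

The whole gap between the two players comes from the second argument, which is the one the skater has virtually no ability to influence.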
Now you might be tempted to say that the reason a player could have a PDO of 98 or 102 despite playing at the exact same level is somehow proof that it measures luck, but it's not. The only difference between Player A with a good goalie and the exact same player with a bad goalie is the ability of the goalie. The difference between a good goalie and a bad goalie isn't luck, it's ability. Sure if you were a hockey player you might feel pretty fortunate to play on a team where the goalie is Tuukka Rask and not your grandfather but the reason that Tuukka Rask can stop more pucks than your grandfather is not because Tuukka's luckier than him, it's because he's a talented goaltender in the prime of his career and your grandfather isn't. (I have no doubt that your grandfather is terrific and wonderful though, unlike this)
|Goalie save % skill||Skater shooting % skill||Resulting PDO||Luck?|
|High Skill||High Skill||High PDO||Nope|
|High Skill||Avg Skill||High PDO||Nope|
|High Skill||Low Skill||Avg PDO||Nope|
|Avg Skill||High Skill||High PDO||Nope|
|Avg Skill||Avg Skill||Avg PDO||Nope|
|Avg Skill||Low Skill||Low PDO||Nope|
|Low Skill||High Skill||Avg PDO||Nope|
|Low Skill||Avg Skill||Low PDO||Nope|
|Low Skill||Low Skill||Low PDO||Nope|
Look at the table above: a skater has very little ability to influence the first column and can influence the second column a fair amount. As you can see from the table, if you're an average skill skater you could have a high PDO, a low PDO or an average PDO due to factors completely outside your control and having nothing to do with luck.
Not only does individual player PDO not measure luck it really doesn't measure anything at all. It's pointless.
Approach 4 - A Demonstration Using Actual Numbers
Let's look at an example with actual numbers. The intent again, as was used in the thought example above, is to control one part of the metric (the shooting %) and then calculate the other part in a way that we feel a high level of confidence is minimizing luck. If you get a number that is not 100 and you've gone through a process that you're pretty sure is minimizing luck then you can say that the core assumption of PDO is wrong, that all teams should have a PDO of 100 except for luck.
Here's an exercise I went through pretty late in the season back in March using actual data from the 2013-2014 season.
Note: any data used here is based on the numbers available on the morning of 3/29/14 before any games were played that day.
Is a team with a PDO of 101.973 a lucky team?
Well let's go through an exercise and for the purpose of this exercise we're going to assume that the team in question has a team shooting percentage of the league average, nothing special. The whole premise here will be very simple, if you control by assuming a very average shooting percentage with no luck and you can show that a team has above average goaltending due to skill and not luck you should come up with a PDO number greater than 100 because of skill and not luck. This is a really simple exercise to demonstrate that PDO doesn't measure luck. (it doesn't)
I have come across a few, a small few, people trying to discredit PDO at some point in the past and each time they've tried to dig into the shooting % side and in each case they have wound up down the rabbit hole of shot quality. The shot quality debate can be a black hole so I am going to set up this experiment to avoid it. The method below is going to control the shooting % side of PDO at an average rate and just focus on the goaltending side to show that PDO fails as a measurement of luck.
Let's look at a real team and use actual data, the current Boston Bruins (as of the morning of 3/29/14, through 73 games played). Looking up the even strength save percentages of their goalies this season, the combined totals for even strength goals/shots against for this season's Boston Bruins is that they've given up 103 even strength goals against on 1,717 even strength shots against for an even strength save percentage of .940. Adding up the even strength goals against and shot totals for every goaltender in the league this season gives you a league average even strength save percentage of .921, so the Boston Bruins are earning 1.9 even strength PDO points over the league average on the goaltending side this season just for the saves their goaltenders are making.
Are their goaltenders lucky or good? Well I can't find any career totals for even strength save percentage but I can find career totals for all situations save percentage. Tuukka Rask has made 71% of the Bruins saves this year at even strength, his all situations save percentage this season is .931 which is just a shade above his career average of .928. Chad Johnson has made 27% of the Bruins saves this year at even strength, his all situations save percentage this season is .925 which is just a shade below his career average of .926. (note: Johnson only had 10 career GP prior to the 2013-2014 season so using his career average as a baseline is a weakness here) Are the Bruins benefitting from lucky, unsustainable goaltending? It doesn't appear that way, it looks like they have very talented goaltenders who are having season totals pretty consistent with the output you would expect from them given their career totals. The Bruins goaltending numbers seem entirely sustainable and not due to luck.
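To put a rough number on "pretty consistent with their career totals", here's a sketch that weights each goalie's career and season save percentage by his share of the Bruins' saves. The figures are the ones quoted above. Two things are my assumptions, not part of the original check: the two shares only add up to 98%, so I rescale them to sum to 1, and I'm mixing even strength shares with all situations save percentages because that's what is available.

```python
# (share of even strength saves, career sv%, this season's sv%) as quoted above
goalies = {
    "Rask": (0.71, 0.928, 0.931),
    "Johnson": (0.27, 0.926, 0.925),
}

# Assumption: rescale the shares so they sum to 1 (the remaining saves are ignored).
total_share = sum(share for share, _, _ in goalies.values())

expected_sv = sum(share * career for share, career, _ in goalies.values()) / total_share
actual_sv = sum(share * season for share, _, season in goalies.values()) / total_share

print("career-based baseline:", round(expected_sv, 4))
print("this season:", round(actual_sv, 4))
print("gap in PDO points:", round(100 * (actual_sv - expected_sv), 2))
```

The gap between the baseline and this season works out to a fraction of a PDO point, which is nowhere near enough to explain a team sitting almost two points above 100.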
Now let's say for argument's sake that this year's Bruins team had a league average shooting percentage. Well the league average even strength shooting percentage is 7.9%. (1 - league average even strength save percentage of .921 = .079) Now remember that the Bruins don't have to shoot against their own goalies, so if you take out the Bruins save totals from the league totals you get a league-wide non-Bruins even strength save percentage of .920 so an average shooting Bruins team would have an expected even strength shooting percentage of 8%.
Now if you combined the Bruins actual even strength save percentage with a hypothetical league average shooting ability and took out some of the rounding you'd get a PDO for this team of 101.973. We've controlled for any luck on the shooting percentage side by just using a league average and we've shown that the save percentage pretty well tracks with career totals.
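Here's the same calculation as a sketch you can run. The Bruins totals are the ones quoted above. I don't reproduce the league-wide shot totals here, so the league side uses the rounded .920 figure from the text, which is why this lands on roughly 102.0 rather than the unrounded 101.973. The variable names are mine.

```python
# Boston's even strength totals through the morning of 3/29/14 (from the text).
bruins_goals_against = 103
bruins_shots_against = 1717
bruins_sv = 1 - bruins_goals_against / bruins_shots_against

# League even strength save % with Boston's goalies removed.
# Assumption: this is the rounded figure quoted in the text, not the raw totals.
league_sv_ex_bruins = 0.920
average_sh = 1 - league_sv_ex_bruins  # what a league average shooter would score

pdo = 100 * (bruins_sv + average_sh)

print("Bruins even strength sv%:", round(bruins_sv, 4))
print("average shooting %:", round(100 * average_sh, 2))
print("PDO with average shooting:", round(pdo, 2))
```

Nothing in that calculation involves a hot streak on the shooting side, because the shooting side is pinned to the league average by construction.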
A team with a PDO of 101.973 isn't necessarily a lucky team benefitting from unsustainable play. You can't look at a PDO value and come to the conclusion that you're looking at a measurement that measures luck.
PDO doesn't measure luck.
Section 5 - Is the Misuse of PDO Really That Widespread
Here it is being very poorly defined at Behind The Net.ca.
And the CBC.
And Hockey Prospectus.
And so on.
Section 6 - What Does PDO Actually Measure
Individual player PDO is pointless and measures nothing so let's just disregard it completely.
Team PDO isn't entirely worthless and has some value once you completely and totally accept that it doesn't measure luck and actually measures skill and ability. If you wanted to find out which teams are particularly skillful during those moments in a hockey game when a puck is actually traveling at the net a team's PDO will tell you the measurement of each team's ability relative to one another with a league average of 100.
But if PDO measures skill shouldn't the best teams all have a very high PDO and all the worst teams have a very low PDO? Not necessarily. The answer could be yes or it could be no. Remember this is just a measurement of ability when the puck is actually traveling towards one of the nets and those opportunities are not created equally. If every game featured the exact same amount of shots on goal for each team then PDO would be good at predicting a team's win loss record, but hockey doesn't work that way, the SOG are not necessarily equal for each team. You don't have to be particularly skillful to consistently win games where you outshoot your opponent 50-15, and you better be especially skillful at both ends of the ice if you want to consistently win games where you get outshot 15-50.
The puck is not traveling at the net unblocked during most of a hockey game but during those moments when it is traveling at the net if you wanted to measure the relative ability of each team to score or stop that puck compared to a league average of 100 then team PDO would tell you that.
Section 7 - But, But, But ... I Still Have Questions
Why is it ok to combine shooting % and save % for an entire team for team PDO to get something of value but doing the same thing for an individual player is pointless? When you're evaluating a team as a whole it's the team's job to both stop pucks and score them, so for that team both numbers are relevant. The same does not hold true for an individual. It's not Henrik Lundqvist's job to score goals, that's someone else's job. It's pointless to combine a number he can control (his save %) with a number he can't control (his teammates' shooting %); in doing so you'll learn nothing. It's as useful as combining his save % with his car's license plate number.
We all know that Team A is lucky and they're winning at an unsustainable rate and they have a high PDO so PDO just has to measure luck, I mean it has to, right? Still no. You can be lucky and have a high PDO. You can also have a high PDO without being lucky. You could also have a lucky team with an average or low PDO. You are going to find instances where a team has a high PDO and an actual calculation of luck shows them to be lucky but just because that is the case does not mean that PDO measures luck.
Remember, you can measure luck, it's pretty easy to do. It's just that you shouldn't use PDO to do it because PDO doesn't measure luck.
But I still want to use PDO because it's helpful when I'm trying to make certain arguments about how a player is getting praised/killed when he doesn't deserve it? Here's the question you should be asking yourself, why are you using PDO to make your point? If you want to say some stats about a team or player are good/bad because the goaltending has been good/bad in an unsustainable way why not look at the goaltending numbers in depth to make your case? Why arbitrarily add the shooting percentage taking place at the other end of the ice in with these goalie stats? The combination of these two numbers is arbitrary so how does that help your case? Conversely if you wanted to say that someone is on an unsustainable run of good/bad shooting why would you care what the save percentage is at the other end? Why not just dig into the shooting numbers and make your case? My point is simple, in almost every situation there's a better way to make your case than using PDO so stop using a flawed metric when other better options are available.
Now there may be times when you want to talk about a team going through a hot/cold streak at both ends of the ice, if that's the case I'd still advise you to discuss both the save % and the shooting % individually. There may be instances when both subjects are relevant to your discussion and if that's the case then discuss both subjects but PDO automatically combines the two numbers regardless of whether or not it actually makes sense to combine them. My advice is to just focus on the subjects that are important to whatever point you're trying to make.
The 2013-2014 Colorado Avalanche won possibly the toughest division in hockey with a CF% of 46.6%, a FF% of 46.5% and a SF% of 47.5% and a PDO of 102, that has to prove that PDO measures luck right? Well no because PDO doesn't measure luck, but that doesn't mean they weren't lucky it just means that a PDO of 102 doesn't automatically tell you they got lucky. If over a full hockey season you win a lot of games while getting heavily outchanced you're going to have a high PDO, this is true because of basic arithmetic. The inverse holds true as well, if you have a bad record despite heavily outchancing your opponents you're going to have a low PDO. This is not because of luck, this is because of math. The two numbers that combine to make PDO have SOG in the denominator.
It's entirely possible that many players on the Avs had career best years with a high variance relative to their expected output and that this kind of performance cannot be sustained in future seasons but to know for sure you need to calculate their expected output and measure the variance. Simply looking at PDO doesn't tell you that. PDO doesn't take into account what a team's expected value is and just incorrectly assumes it's 100, which is why PDO fails.
Let's put this another way, let's say you were told that Team A was going to be heavily outshot throughout the course of a season and you were asked what they would need to do to come out with a strong winning record. You would say that they're going to have to save more shots than an average team would be expected to save and they're going to have to score more goals on their limited opportunities than would normally be expected. Another way of putting that last sentence is that they'll need to have a better than league average save % and a better than league average shooting %. If you added a better than league average save % and a better than league average shooting % you'd get a better than league average combined score, you'd get a PDO over 100. If you're winning at a high rate while getting heavily outshot you have to have a high PDO.
If a team is winning and has a CF%/FF%/SF% under 50% you don't need to look up that team's PDO, it has to be over 100 for reasons of simple math. Now in this case in order to determine if this team is lucky or just skilled you need to determine if it's because the team has better than average skill or if it's because the team performed far better than their expected output in these areas. PDO doesn't account for the expected value in any way and that's where it fails.
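If "simple math" sounds like hand-waving, here's a brute-force sketch of the claim. I'm assuming "winning" means outscoring your opponents in the game state being measured, which is not quite the same thing as a win-loss record. The helper name `pdo` and the search range of 1 to 30 are arbitrary choices of mine; the point is only that no combination in the range breaks the rule.

```python
from itertools import product


def pdo(sf, gf, sa, ga):
    """Shooting % plus save %, on the usual 100 scale."""
    return 100 * (gf / sf + 1 - ga / sa)


# Try every small combination of shots and goals and collect any case where a
# team is outshot, outscores its opponents, and still has a PDO of 100 or less.
counter_examples = [
    (sf, gf, sa, ga)
    for sf, gf, sa, ga in product(range(1, 31), repeat=4)
    if gf <= sf and ga <= sa      # can't score more goals than shots taken
    and sf < sa and gf > ga       # outshot, yet outscoring the opponent
    and pdo(sf, gf, sa, ga) <= 100
]

print("counter-examples found:", len(counter_examples))
print("season-sized example:", round(pdo(sf=1500, gf=150, sa=2000, ga=140), 2))
```

The reason is that more goals on fewer shots forces your shooting % above your opponents' shooting %, and your opponents' shooting % is just 1 minus your save %. That tells you the PDO is over 100. It still doesn't tell you whether skill or luck put it there.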
Section 8 - I Know Some Teams Are Lucky, I Still Want to Measure Luck
It's not that you can't measure luck, you can, it's not that hard. It's just that PDO doesn't measure luck so don't use PDO to do it.
Is a player or team performing at an unsustainable rate? Can this be measured and demonstrated with math? Sure. The first thing you do is to create a baseline for that player or team's expected performance and then you compare it to current or recent performance and measure the variance. If you come up with a very high variance it's likely unsustainable.
For a single player or goalie a pretty simple baseline of expected performance would be his career average; players' career averages are widely available. Since players in their prime perform better than rookies or declining veterans, a somewhat better way would be to use age adjusted career averages. This would be a more complicated measurement, but the advanced stats community in baseball uses age adjusted metrics all the time, so with a little bit of work you should be able to build a usable model. If you wanted to add additional complexity you might try to find other ways to improve the baseline for rookies and young players whose career averages are based on small amounts of data. But the essential point remains the same: to measure luck you need to develop a baseline of expected performance, something that PDO fails to do.
For a team measurement you would just follow the same steps as above and then use a weighted average or some other measurement that adequately captures the proper proportion of the total from the players involved.
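Here's one way the baseline idea could be sketched in code. To be clear about what's assumed: the goalie and his numbers are hypothetical, `luck_z_score` is a name I made up, and scoring the gap with a binomial standard error is my choice of yardstick for illustration, not the only way to measure the variance. It also skips the age adjustment mentioned above.

```python
import math


def luck_z_score(observed_successes, attempts, expected_rate):
    """Standard errors between the observed rate and the player's own baseline.

    Assumption: each attempt is an independent trial at `expected_rate`.
    """
    observed_rate = observed_successes / attempts
    std_error = math.sqrt(expected_rate * (1 - expected_rate) / attempts)
    return (observed_rate - expected_rate) / std_error


# Hypothetical goalie who stops 1,410 of 1,500 shots (.940) this season.
versus_920_career = luck_z_score(1410, 1500, expected_rate=0.920)
versus_938_career = luck_z_score(1410, 1500, expected_rate=0.938)

print("career .920 baseline:", round(versus_920_career, 2))
print("career .938 baseline:", round(versus_938_career, 2))
```

The same .940 season looks like a big departure for a career .920 goalie and like business as usual for a career .938 goalie. That's the piece PDO is missing: it compares everyone to 100 instead of to their own expected output.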
Remember, I'm not saying there isn't luck in hockey and I'm also not saying that luck can't be measured. There is luck in hockey and you can measure it. Just don't use PDO to do it because PDO doesn't measure luck.
Let's Put This Embarrassing Chapter Behind Us
PDO needs to go. It is a black eye on the stats community that a measurement this obviously flawed has survived for so many years with so little scrutiny. If you want to use it to describe the relative skill difference between two teams during the point in the game when a puck is traveling at the net then fine, go ahead and use it, but it doesn't measure luck, it's never measured luck, that's not what it measures and it's being used wrong. We need to stop being wrong about measuring luck, quickly correct our mistakes and immediately move on from this embarrassing episode in the growth of hockey metrics.
Send your hate mail to: @ScottTKennedy