Bringing Down The Curve: Poor Umpire Performances

Having a consistent strike zone is important to both hitters and pitchers as errant calls can tip the balance in an at bat or even in an inning or game. Most umpires are actually very good at calling balls and strikes, but there are some outliers. Author Ian York uses PITCHf/x to examine the worst of the worst.

Major-league umpires are all very, very good at calling balls and strikes, but some must be better than others. Can we identify which ones are among the best?

In a recent article, “out-of-zone” balls and strikes were defined by identifying the strike zone and then counting the pitches outside the strike zone that were called strikes and those inside the zone that were called balls. These are both “out of zone” (OOZ) calls: add them up and divide by the number of called pitches to get an OOZ percentage. The real-life strike zones were worked out separately for each year. Umpires call a smaller zone in pitchers’ counts and a bigger one in hitters’ counts, so the analysis is limited to the roughly 63% of ball/strike counts in which the strike zone is neutral (that is, 0-0, 1-0, 1-1, 2-1, or 3-1). It also excludes pitches that overlapped the edge of the strike zone, since it is not obvious that they would be “out of zone” no matter what they were called. Only umpires who called 2,500 or more pitches in a season were included.
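The OOZ bookkeeping described above can be sketched in a few lines of Python. This is only an illustration, not the code behind the article: the zone edges, the ball radius, the `Pitch` record, and the function names are all assumptions made for the example, whereas the real analysis used empirically derived zones for each year. The edge test is also simplified (it treats the ball as a square near the corners of the zone).

```python
from dataclasses import dataclass

# Counts the article treats as neutral, written as (balls, strikes).
NEUTRAL_COUNTS = {(0, 0), (1, 0), (1, 1), (2, 1), (3, 1)}

# Hypothetical zone edges in feet; NOT the article's de facto zones.
ZONE = {"left": -0.9, "right": 0.9, "bottom": 1.6, "top": 3.4}
BALL_RADIUS = 0.12  # assumed ball radius in feet (roughly 1.45 inches)


@dataclass
class Pitch:
    px: float            # horizontal location at the plate, feet
    pz: float            # height at the plate, feet
    balls: int
    strikes: int
    called_strike: bool  # True = called strike, False = called ball


def classify(pitch):
    """Return 'inside', 'outside', or 'edge' relative to ZONE."""
    dx = min(pitch.px - ZONE["left"], ZONE["right"] - pitch.px)
    dz = min(pitch.pz - ZONE["bottom"], ZONE["top"] - pitch.pz)
    margin = min(dx, dz)  # positive inside the zone, negative outside
    if abs(margin) < BALL_RADIUS:
        return "edge"     # ball overlaps the boundary (simplified test)
    return "inside" if margin > 0 else "outside"


def ooz_percent(pitches):
    """OOZ calls as a percent of called pitches in neutral counts."""
    called = 0
    ooz = 0
    for p in pitches:
        if (p.balls, p.strikes) not in NEUTRAL_COUNTS:
            continue      # zone size shifts in other counts
        where = classify(p)
        if where == "edge":
            continue      # edge pitches are excluded from the analysis
        called += 1
        # OOZ = strike called outside the zone, or ball called inside it.
        if (where == "outside") == p.called_strike:
            ooz += 1
    return 100.0 * ooz / called if called else float("nan")


sample = [
    Pitch(0.0, 2.5, 0, 0, True),    # inside, called strike: correct
    Pitch(0.0, 2.5, 1, 1, False),   # inside, called ball: OOZ
    Pitch(1.5, 2.5, 1, 0, True),    # outside, called strike: OOZ
    Pitch(1.5, 2.5, 2, 1, False),   # outside, called ball: correct
    Pitch(0.95, 2.5, 0, 0, True),   # overlaps the edge: excluded
    Pitch(1.5, 2.5, 0, 2, True),    # 0-2 is not a neutral count: excluded
]
print(ooz_percent(sample))  # 50.0
```

In a full analysis, the 2,500-pitch cutoff would be applied per umpire and per season before computing the percentage.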

One interesting finding from looking at overall OOZ percentages is that umpires have significantly improved their ball/strike accuracy since 2008, when PITCHf/x was installed in all major league parks.

Average out-of-zone percent since 2008

The improvement became even more pronounced starting in 2010, when a new umpires’ contract allowed video review of umpire calls and evaluation of umpires based on these reviews. Whatever the cause, umpire ball/strike accuracy improved each year from 2010 to 2013, with the biggest jump between 2011 and 2012. It bears repeating that umpires were already exceedingly good at calling balls and strikes. Even in 2008, umpires got 95.9% of their calls right, based on the de facto strike zones for that year. By 2014, umpires were correct on 96.7% of their calls.

In 2014, 82 umpires called enough pitches to meet the 2,500 pitch cutoff. The individual numbers ranged from a high of 4.6% OOZ (Chad Fairchild) to a low of 2.1% OOZ (Chris Segal). That 2.1% represents the best year for ball/strike calls since at least 2008; the worst, at 6.3%, belonged to Doug Eddings in 2008 (see below for some qualifiers on Eddings).

We can use PITCHf/x to show every out-of-zone pitch Fairchild, Segal and Eddings called in a neutral count in those years. The overall distribution of all called pitches for each umpire is shown as a contour plot in the background:

These umpires are the extremes, but there are several things about these maps that are typical of most umpires. Most of the missed calls were close to the edges of the strike zone, although a handful were well away. Left- and right-handed batters were subjected to about the same percentage of missed calls. Segal called more out-of-zone balls than out-of-zone strikes, which is a little unusual. Eddings and Fairchild, like most umpires, generally have a few more out-of-zone called strikes than pitches in the strike zone being called balls, but the difference in frequency between the two is usually small (in 2008, on average, it was 2.4% vs. 1.7%; by 2014, 1.65% vs. 1.62%).
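As a quick consistency check, the two league-average rates can be added together and subtracted from 100 to recover overall accuracy. The sketch below assumes both rates are expressed as a share of all called pitches in the analysis; the function name is made up for the example.

```python
def accuracy_from_ooz(ooz_strike_pct, ooz_ball_pct):
    """Percent of calls that matched the de facto zone.

    Assumes both inputs are percentages of the same pool of called pitches.
    """
    return round(100.0 - (ooz_strike_pct + ooz_ball_pct), 2)


acc_2008 = accuracy_from_ooz(2.4, 1.7)    # 2008 league-average rates
acc_2014 = accuracy_from_ooz(1.65, 1.62)  # 2014 league-average rates
print(acc_2008, acc_2014)
```

Under that assumption the results come out at about 95.9% for 2008 and 96.7% for 2014, in line with the accuracy figures quoted earlier.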

Different umpires tend to miss calls in particular regions ‒ the top or bottom of the zone (though often not both) and the outside edge are the most common trouble spots. For example, the charts for Bill Welke and Bob Davidson show common patterns. (Despite these missed calls, both Welke and Davidson were highly accurate, correctly calling 96.9% and 96.1% of pitches, respectively.) Welke’s strike zone in 2014 tended to be slightly higher than average. He called more OOZ strikes at the top of the zone and more balls at the bottom, even though those were not always the most common spots for pitches:

Davidson, in contrast, tended to call more OOZ pitches on the sides of the strike zone, especially at the outside of the plate, while missing relatively few at the top and bottom.

There were 50 umpires who met the pitch-count cutoff every year since PITCHf/x was introduced. The umpire performances were split into two sections: pre-review (2008-2010) and post-review (2012-2014).
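A minimal sketch of that split, assuming per-season OOZ percentages are already available for each umpire. The umpire labels and numbers below are invented purely to show the mechanics, and the function names are hypothetical; the sketch also assumes that "every year" means each season from 2008 through 2014.

```python
from statistics import mean

PRE_REVIEW = range(2008, 2011)   # 2008-2010
POST_REVIEW = range(2012, 2015)  # 2012-2014
ALL_YEARS = range(2008, 2015)    # assumed span of qualifying seasons


def summarize(seasons):
    """seasons maps year -> OOZ percent for a single umpire."""
    def span(years):
        values = [seasons[y] for y in years]
        return {"mean": mean(values), "low": min(values), "high": max(values)}
    return {"pre": span(PRE_REVIEW), "post": span(POST_REVIEW)}


def compare_periods(ooz_by_umpire):
    """Keep only umpires with a qualifying season in every year."""
    return {
        name: summarize(seasons)
        for name, seasons in ooz_by_umpire.items()
        if all(year in seasons for year in ALL_YEARS)
    }


# Made-up numbers, not real umpire data.
example = {
    "Umpire A": {2008: 5.0, 2009: 4.6, 2010: 4.4, 2011: 4.0,
                 2012: 3.4, 2013: 3.0, 2014: 3.2},
    "Umpire B": {2008: 4.1, 2009: 4.0, 2010: 3.9,
                 2012: 3.1, 2013: 3.0, 2014: 2.9},  # no 2011 season: dropped
}
result = compare_periods(example)
print(result)
```

The low and high values per period correspond to the range bars in the charts, and the mean gives a single point for ranking.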

A number of things jump out from these charts. Most obviously, the actual numbers are lower for the post-review period. The worst season any umpire had after 2011 would be no worse than middle of the pack before 2011. The range is narrower, too; the best and worst umpires are more similar to each other after 2011.

There can be a fair amount of variation among seasons, as seen from some of the wide range bars. For example, Chad Fairchild, who had the highest OOZ percentage in 2014, was one of the better ball/strike callers in 2013 and was mid-ranked in 2012 (13th of 74 and 45th of 75, respectively). But many of the umpires, including some at either end of the spectrum, are quite consistent. Fieldin Culbreth, Joe West, Jerry Meals, and Bill Welke have always been among the best at minimizing out-of-zone calls. Doug Eddings and Larry Vanover were consistently among the worst.

Another important point is that “called correctly” means that the strikes were in the strike zone that the average umpire called. If we were to use the rulebook strike zone for this analysis, we would end up with much larger numbers of out-of-zone calls, around 15%, for every umpire. In reality, umpires have never called the rulebook strike zone, so that is more an abstract question.

In addition, not every umpire calls exactly the same strike zone. Some umpires have a larger strike zone than others; some have a smaller zone. Is it fair to apply the average strike zone to an umpire who typically calls a larger zone?

This is not just a rhetorical question. Doug Eddings has one of the highest rates of OOZ calls, and he is one of the umpires who calls a very large strike zone. So is Tim Welke, who also has a high out-of-zone percent.

Here is the probability of Eddings calling a strike in regions around the average strike zone in 2014; red areas show regions where he is more likely to call a strike than the average umpire, blue areas are where he is less likely to call a strike:
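For readers who want to build a comparable map from raw pitch data, one simple approach is to bin called pitches into a grid and compare strike rates cell by cell. This is a rough sketch using plain binning, which is an assumption for illustration and not necessarily how the article's maps were produced; the cell size, function names, and sample pitches are all hypothetical.

```python
from collections import defaultdict

CELL = 0.25  # hypothetical grid cell size in feet


def strike_rates(pitches, cell=CELL):
    """pitches: iterable of (px, pz, called_strike) tuples.

    Returns {(col, row): fraction of called pitches ruled strikes}.
    """
    tallies = defaultdict(lambda: [0, 0])  # cell -> [strikes, total]
    for px, pz, called_strike in pitches:
        key = (int(px // cell), int(pz // cell))
        tallies[key][0] += int(called_strike)
        tallies[key][1] += 1
    return {key: s / n for key, (s, n) in tallies.items()}


def rate_difference(umpire_pitches, league_pitches, cell=CELL):
    """Positive = umpire calls more strikes there than the league does."""
    ump = strike_rates(umpire_pitches, cell)
    league = strike_rates(league_pitches, cell)
    # Only cells seen in both data sets can be compared.
    return {key: ump[key] - league[key] for key in ump if key in league}


# Invented pitches for illustration only.
umpire = [(0.95, 2.5, True), (0.95, 2.5, True), (0.0, 2.5, True)]
league = [(0.95, 2.5, True), (0.95, 2.5, False),
          (0.0, 2.5, True), (2.0, 2.5, False)]
diff = rate_difference(umpire, league)
print(diff)
```

Positive cells would be drawn in red and negative cells in blue. With real data, cells containing only a handful of pitches are noisy, so some minimum sample size or smoothing would be needed.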

If Eddings has a large strike zone but calls that zone with great consistency and predictability, then his calls would not really be “out of zone.” Rather, he would have established his own zone and then kept his calls within that zone. Are we unfairly charging Eddings and Welke with too many out-of-zone calls?

The answer is probably “yes, but not by much.” First, having a larger strike zone means that Eddings will be charged with more called strikes out of zone, but on the other hand he will be charged with fewer balls out of zone. These should almost balance each other out, although as noted above there are usually a few more OOZ strikes than balls.

Second, if we look at the probability maps, much of the reason Eddings has a larger zone than other umpires is that he is more likely to call strikes for pitches that overlap the edge of the average zone. We can see this more easily if we look at every single called pitch Eddings had around the edge of the zone in 2014, with balls in blue and called strikes in red:

Even though Eddings called about 4.2 extra strikes per game compared to the average umpire in 2014, many of those called strikes come from pitches overlapping the edge of the average strike zone. In the analysis here, we have explicitly excluded those pitches from consideration.

Third, Eddings and Welke were not the only umpires with large strike zones in 2014. Ron Kulpa, Bill Miller, and Angel Hernandez also had larger than average strike zones. While Kulpa, Miller, and Hernandez rated in the bottom half for out-of-zone calls (50th, 63rd, and 64th out of 82, respectively), they were much better than Eddings. For that matter, while Tim Welke’s average OOZ ranking from 2012 to 2014 was mediocre, in 2014 (when his strike zone was larger than he had ever called before) he ranked 40th out of 82 umpires for OOZ calls.

Eddings is probably being short-changed a little on out-of-zone calls because of his larger than average strike zone, but it is unlikely that this effect is very large. In any case, the great majority of umpires ‒ including most of those at the ends of the OOZ spectrum ‒ call a zone that is very close to average, so the macro effect of varying zone sizes should be slight.

The lessons we can see from these plots are:

  1. Umpires really are very good at calling balls and strikes.
  2. Some really are better than others.
  3. This skill is moderately reproducible from one year to the next, but there seems to be a significant contribution of chance as well.
  4. This is probably a skill that can be improved, given appropriate review and feedback.
  5. However, the lack of continued improvement from 2013 to 2014 suggests that umpires on average may have reached the limit of their abilities.

Calling balls and strikes is an important part of an umpire’s job, but it is something they only do once every few days. Fans’ and players’ judgement of an umpire’s ball/strike calling ability is probably influenced as much by other aspects of umpires, such as their personality, their perceived show-boating, or their ability to run a game, as by their accuracy. Umpires undoubtedly miss calls, probably several times a game, but it seems unlikely that automated strike calling could do much better than the 98% accuracy that the best umpires already achieve.


Follow Ian on Twitter @iayork.

1 COMMENT

  1. 2017 ALCS Game 1 showed how much Umpire #4 is “OOZ” calling balls & strikes. Replays of Aaron Judge alone give testimony to how poorly this umpire performs his craft.
