
Friday, August 17, 2007

Umpires and QuesTec

Several readers have been asking about the recent study that was reported to show umpire bias by race known as the Hamermesh study. Phil Birnbaum and Mitchel Lichtman have been doing great work in that regard already so I have little to add other than providing a few links for those interested:

  • The original study


  • The Time Magazine piece


  • Phil's first take - he questions the author's findings of statistical significance by examining the core table (table 2) from the original study


  • Phil's follow-up - where he concludes that perhaps, at most, 1 in 700 pitches is biased


  • And even more by Phil - here he uses several different tests of significance and it appears there is no racial bias


  • MGL's own study - here he uses a much simpler approach and comes to the tentative conclusion that there are not racial differences that are statistically significant. Update on 8/19: MGL posted some updates to his study here and here and comes to the opposite conclusion. He also notes there is a good discussion of the study at The Sports Economist.


    One of the side topics that has arisen here is the effect of QuesTec on called strikes. The authors of the Hamermesh study found that, for both white and minority pitchers, pitchers in non-QuesTec parks received a higher percentage of strikes when the race of the pitcher and umpire matched than they did in QuesTec parks. White pitchers did not experience this difference when the umpire was non-white, although minority pitchers still did.

    This provides an opportunity to look at the PITCHf/x data from this season in QuesTec and non-QuesTec parks to get a more granular feel for what the overall difference might be. While we have data for only 9 of the 11 parks where QuesTec is installed, we still end up with almost 35,000 pitches in QuesTec parks and 63,000 in non-QuesTec parks to analyze. When we do so by comparing the location of the pitch to the strike zone (defined by the PITCHf/x operator for each plate appearance) and give the umpires a 1 inch buffer zone to correspond with the limits of the system, we find the following:


    Park          Pitches   CS%     CB%     Agree%
    QuesTec         34427   .8252   .9433   .8790
    Non-QuesTec     62862   .8052   .9488   .8772


    By way of explanation, CS% is the called strike percentage, defined as the percentage of pitches located in the strike zone that were called strikes. CB% is the called ball percentage, defined as the percentage of pitches located outside the strike zone that were called balls. Agree% is the overall percentage of pitches on which PITCHf/x (given the buffer zone) and the umpire agreed.
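The three rates defined above can be sketched in code. This is only an illustration under stated assumptions, not the queries actually used for the table: the field names (`px`, `pz`, `sz_bot`, `sz_top`), the 17-inch plate width with ball radius ignored, and especially the handling of the buffer (a pitch within 1 inch of the zone edge is counted as whatever the umpire called) are guesses at one reasonable reading of the description.

```python
from dataclasses import dataclass

BUFFER_FT = 1.0 / 12.0       # the 1 inch buffer from the post, in feet
HALF_PLATE_FT = 8.5 / 12.0   # assumption: 17 inch plate, ball radius ignored


@dataclass
class Pitch:
    px: float            # horizontal location at the plate, feet from center
    pz: float            # height at the plate, feet
    sz_bot: float        # operator-defined bottom of the zone, feet
    sz_top: float        # operator-defined top of the zone, feet
    called_strike: bool  # umpire's call (True = strike, False = ball)


def zone_verdict(p):
    """Return 'strike', 'ball', or 'either' (within the buffer of the edge)."""
    dx = abs(p.px) - HALF_PLATE_FT               # > 0 means off the plate
    dz = max(p.sz_bot - p.pz, p.pz - p.sz_top)   # > 0 means above or below
    miss = max(dx, dz)                           # approximate distance past the edge
    if miss <= -BUFFER_FT:
        return "strike"
    if miss >= BUFFER_FT:
        return "ball"
    return "either"


def agreement_rates(pitches):
    """Compute CS%, CB% and Agree% as fractions for a list of called pitches."""
    in_zone = strikes_in_zone = out_zone = balls_out_zone = 0
    for p in pitches:
        verdict = zone_verdict(p)
        if verdict == "either":
            # Assumption: inside the buffer, the umpire gets the benefit of the doubt.
            verdict = "strike" if p.called_strike else "ball"
        if verdict == "strike":
            in_zone += 1
            strikes_in_zone += int(p.called_strike)
        else:
            out_zone += 1
            balls_out_zone += int(not p.called_strike)
    total = in_zone + out_zone
    nan = float("nan")
    return {
        "CS%": strikes_in_zone / in_zone if in_zone else nan,
        "CB%": balls_out_zone / out_zone if out_zone else nan,
        "Agree%": (strikes_in_zone + balls_out_zone) / total if total else nan,
    }
```

Running `agreement_rates` separately on the QuesTec and non-QuesTec pitches would give one row of the table each. A different buffer rule would shift the rates slightly, so the exact figures above should not be expected to reproduce from this sketch.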

    By simply examining the confidence intervals, it appears that umpires do indeed call more pitches in the zone strikes at QuesTec parks than at non-QuesTec parks. The difference is statistically significant at .05 and amounts to 1 pitch in 50. However, at QuesTec parks umpires don't do as well at identifying balls and end up calling more of them strikes, to the tune of 1 in 180 pitches. This result too is statistically significant at .05, indicating that perhaps the biggest effect of QuesTec is simply to call more strikes.
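The "1 pitch in N" figures are just the reciprocal of the gap between the two rates in the table. A small sketch of that arithmetic follows; the helper name `one_in_n` is made up for illustration.

```python
def one_in_n(rate_a, rate_b):
    """Express the gap between two rates as 'one pitch in N'."""
    return 1.0 / abs(rate_a - rate_b)


# Rates taken from the table: QuesTec first, non-QuesTec second.
cs = one_in_n(0.8252, 0.8052)   # pitches in the zone that were called strikes
cb = one_in_n(0.9433, 0.9488)   # pitches out of the zone that were called balls

print(f"CS% gap: about 1 pitch in {cs:.0f}")
print(f"CB% gap: about 1 pitch in {cb:.0f}")
```

The CB% gap works out to a little over 180, which the text rounds to 1 in 180. Note that each figure is relative to its own denominator: in-zone pitches for CS% and out-of-zone pitches for CB%.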

    Because the two factors work in opposite directions, when we add them up the difference in Agree% fails to meet the .05 test. Overall then, if we attribute the entire difference to whether the umpire is in a QuesTec park or not, we're talking about a difference of 1 pitch in 550. Of course, there may be other factors at work here that I haven't examined, including the calibration of the system at particular parks.
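For readers who want to check the Agree% result, here is a rough sketch using a pooled two-proportion z-test. This is a swapped-in technique, related to but not the same as the confidence-interval comparison described above, and it assumes every pitch is an independent trial. Only Agree% can be checked this way from the table, because the in-zone and out-of-zone pitch counts behind CS% and CB% are not listed.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Agree% and pitch counts from the table: QuesTec vs. non-QuesTec.
z, p = two_proportion_z(0.8790, 34427, 0.8772, 62862)
print(f"z = {z:.2f}, two-sided p = {p:.2f}")
```

With these inputs the z statistic comes out well below the 1.96 needed for significance at .05, which is consistent with the conclusion in the text.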

    9 comments:

    Phil Birnbaum said...

    Good stuff, Dan ...

    Question: could the additional called strikes in the QuesTec parks be because the home teams in those parks have better pitchers?

    And one more question: is the PITCHf/X data automated, or does someone have to enter it? If it's manual, should we be worried that the operator will fudge the data to make it a little closer to the umpire's call?

    Dan Agonistes said...

    Thanks Phil. In answer to the first question I don't think so. Better pitchers wouldn't necessarily equate to a higher CS% since we're comparing the actual location of the pitch. If umpires were unbiased worse pitchers may get fewer strikes overall but that would be because they throw more pitches out of the strike zone.

    Of course if there were umpire bias in favor of certain pitchers (what I called "The Catfish Effect" in this article) based on their "veteranness" or something then perhaps there would be a skew. I didn't find that was the case when I looked at it previously albeit with less data than I have today.

    PITCHf/x is automated and so an operator does not enter the location. That said, they don't throw out "bad tracks" today which may be something approaching 5% of the pitches and so that is problematic.

    Phil Birnbaum said...

    Right, those aren't raw strike percentages, so the players involved shouldn't matter.

    I guess we'll never know if the "bad tracks" are correlated with close pitches ... maybe it's safe to consider them random?

    Thanks, Dan.

    Anonymous said...

    I am making several revisions to my study, Dan. Ignore the first one that showed no bias at all. I have to redo the "pitcher and batter adjustments" for the second set of data (posts 24 and 25) which DID show a bias.

    There is also a good discussion at The Sports Economist, http://www.haloscan.com/comments/skipsauer/6714267678478680720/?a=31792#432796.

    I only know of 10 Questec Parks, ana, ari, bos, cle, hou, mil, nyn, nya, oak, and tba. What is the 11th?

    Anonymous said...

    OK, I guess the 11th one is U.S. Cellular.

    Dan Agonistes said...

    Thanks, I've updated the original post with your new links.

    BTW, the QuesTec parks we don't have data for are Shea and Yankee Stadium. We only have a smattering from Tampa Bay as well.

    Tangotiger said...

    You do have a bias if your QuesTec parks have pitchers who are really good (or really bad). The "obvious" strikes and the "obvious" balls are noise in the data and need to be removed.

    What you should do is show the breakdowns by the called strike % for "obvious strikes" and "edge of zone" strikes. Same for balls.

    John Walsh posted a great set of charts a few weeks ago, which should form the basis for such a study.

    Dan Agonistes said...

    Excellent point. So what I did below was remove all the obvious balls and obvious strikes from the analysis as well as the small percentage of pitches where the strike zone definition was obviously incorrect.

    When I do this I get:

    QuesTec:
    Pitches 20739
    CS% .8639
    CB% .8909
    Agree% .8748

    Non-QuesTec:
    Pitches 37252
    CS% .8448
    CB% .8921
    Agree% .8665

    Looks pretty much the same with a significant difference in the CS% at .05. Perhaps I need to reduce the box I'm using (3" by 3" at the center for excluding obvious strikes and below 1 foot and above 4 feet and beyond 14 inches on either side of the plate for excluding obvious balls).
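The exclusion boxes described above can be sketched as a simple filter. The function names are hypothetical, and two details are assumptions rather than anything stated in the comment: the 3" by 3" box is taken to be centered on the middle of the batter's strike zone, and "beyond 14 inches" is measured from the center of the plate.

```python
INCH = 1.0 / 12.0  # locations are assumed to be in feet


def is_obvious_strike(px, pz, sz_bot, sz_top):
    """Inside a 3 by 3 inch box at the center of the zone (assumed placement)."""
    mid = (sz_bot + sz_top) / 2.0
    return abs(px) <= 1.5 * INCH and abs(pz - mid) <= 1.5 * INCH


def is_obvious_ball(px, pz):
    """Below 1 foot, above 4 feet, or more than 14 inches off center."""
    return pz < 1.0 or pz > 4.0 or abs(px) > 14.0 * INCH


def is_borderline(px, pz, sz_bot, sz_top):
    """Keep only pitches that are neither an obvious strike nor an obvious ball."""
    return not (is_obvious_strike(px, pz, sz_bot, sz_top) or is_obvious_ball(px, pz))


# Example: a pitch near the outside edge at belt height is kept for analysis.
print(is_borderline(0.7, 2.5, 1.5, 3.5))
```

Shrinking or growing the boxes only means changing the constants in the two `is_obvious_*` functions, which makes it easy to see how sensitive the CS% gap is to the definition of "obvious".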

    Tangotiger said...

    Good job. Your numbers show that 40% of pitches are "obvious" calls. Sounds a bit low to me. I would have guessed anywhere from 60% to 75%. Looks like we've got to watch a game today, and see how many pitches are obvious calls.