Saturday, January 26, 2008

The Moral Hazards of the Hit Batsmen

This is the final column in a series of three I wrote for BP on the topic of hit batsmen. You can find the other two on this blog as well. It appeared on May 18, 2006.



Schrodinger's Bat: The Moral Hazards of the Hit Batsmen
by Dan Fox

"The designated hitter rule is like letting someone else take Wilt Chamberlain's free throws."

--Rick Wise (1974)

In the previous two weeks, we’ve been looking at historical hit by pitch rates and their trends, and investigating theories that have tried to explain the fluctuation of those rates. Those theories account for factors such as aluminum bats at the amateur level, changes in the strike zone, the increase in body armor, intimidation, retaliation, and even the win expectancy of the hit batsmen. While individual theories may lack explanatory power for a specific period of time, taken together they do provide insight into the sometimes opposing forces that underlie trends in baseball's complex competitive environment.

There is one trend, however, that we failed to discuss. So this week we’ll take a look at the difference in league rates of hit batsmen since the introduction of the designated hitter in 1973. This topic has been taken up before, so we’ll start by covering some of the old ground, and then hopefully add something new to the discussion.

Setting a Baseline
Before we discuss what the impact of the DH on HBP rates might be, let’s lay out the raw facts that have inspired so much conjecture. The following graph shows, for each season since the DH was adopted in the American League in 1973, the percentage by which the AL rate of hit batsmen per 1,000 plate appearances exceeded the NL rate. The shaded line is a three-year moving average.



What this shows is that from 1973 until the mid 1990s the rate of hit batsmen in the AL was anywhere between 3% and 30% higher than in the NL. While that’s a wide range, more typical values are between 10% and 20%, with the average during the period being 17%:


Year   AL vs. NL      Year   AL vs. NL
---------------------------------------
1973      9.3%        1990     19.0%
1974     14.0%        1991     18.6%
1975      7.8%        1992     20.0%
1976     17.3%        1993      9.4%
1977     17.2%        1994     -7.6%
1978     12.5%        1995     -6.7%
1979      8.4%        1996      2.5%
1980     24.2%        1997    -15.9%
1981     22.8%        1998      4.4%
1982      3.2%        1999     10.2%
1983     19.1%        2000    -17.6%
1984     29.7%        2001      7.0%
1985     21.5%        2002      7.4%
1986     26.5%        2003      3.4%
1987     16.7%        2004      7.7%
1988     20.9%        2005     -5.6%
1989     22.9%


Around 1994, things began to change, and in the following dozen years HBP rates in the NL actually surpassed those in the AL five times, including 2005, when 9.52 batters were hit per 1,000 PA in the AL, against 10.05 in the NL.
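As a rough check on the figures quoted above, the yearly differences can be typed into a short Python sketch. The dictionary simply transcribes the table; the trailing three-year window is an assumption, since the column does not say how the moving average in the graph was centered.

```python
# AL-over-NL difference in HBP rate (percent), transcribed from the table above.
al_vs_nl = {
    1973: 9.3, 1974: 14.0, 1975: 7.8, 1976: 17.3, 1977: 17.2, 1978: 12.5,
    1979: 8.4, 1980: 24.2, 1981: 22.8, 1982: 3.2, 1983: 19.1, 1984: 29.7,
    1985: 21.5, 1986: 26.5, 1987: 16.7, 1988: 20.9, 1989: 22.9, 1990: 19.0,
    1991: 18.6, 1992: 20.0, 1993: 9.4, 1994: -7.6, 1995: -6.7, 1996: 2.5,
    1997: -15.9, 1998: 4.4, 1999: 10.2, 2000: -17.6, 2001: 7.0, 2002: 7.4,
    2003: 3.4, 2004: 7.7, 2005: -5.6,
}

# Simple mean of the yearly differences for 1973-1993.
dh_era = [al_vs_nl[year] for year in range(1973, 1994)]
mean_1973_1993 = sum(dh_era) / len(dh_era)

# Seasons from 1994 on in which the NL rate was higher (negative difference).
nl_led = [year for year in range(1994, 2006) if al_vs_nl[year] < 0]


def trailing_average(values, window=3):
    """Moving average over the current value and the (window - 1) before it.

    The trailing window is an assumption made for this sketch.
    """
    return [
        sum(values[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(values))
    ]


smoothed = trailing_average(list(al_vs_nl.values()))

print(f"Mean difference, 1973-1993: {mean_1973_1993:.1f}%")
print(f"Seasons the NL led, 1994-2005: {nl_led}")
```

The mean lands at about 17%, and the list of NL-led seasons has five entries, in line with the text.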


So in fact, there are actually two questions that we can ask about this trend. First, what accounts for the difference in rates of hit batsmen during the twenty-year period following the introduction of the DH (1973-1993), and secondly, what caused those differences to shrink in the period after 1993?

A Moral Hazard or More Opportunity?
As mentioned in the introduction, the topic of league differences in HBP rate has been researched in the past. Most recently, Lee A. Freeman wrote an excellent article titled "The Effect of the Designated Hitter Rule on Hit Batsmen" in Volume 33 of The Baseball Research Journal. In it, Freeman provided a short synopsis of the previous work, citing articles in the journal Economic Inquiry in 1997 and 1998, as well as a follow-up in a 2004 issue of the Journal of Sports Economics.

Prior to Freeman’s paper the two theories that had been bandied about to explain the difference (at least from 1973 until the mid 1990s) were the "moral hazard theory" and the "lineup composition theory." The former theory argues that because American League pitchers needn’t fear retaliation with the presence of the DH, they are more apt to hit opposing batters since they don’t bear the costs of their actions directly. The latter theory also argues from a cost-benefit basis, although differently--AL pitchers hit more batters because the cost in terms of run scoring when hitting a DH is so much less than hitting a pitcher. This follows from the fact that the designated hitter is much more likely to be an offensive producer than your typical weak-hitting full-time hurler.

As a variation of the lineup composition theory, Freeman contended that more hit batsmen in the AL can be explained largely (but not totally, as he rightly cautions against single-theory explanations) simply by more "true" hitters coming to bat in the AL. In his words:


American League pitchers are not given the opportunity during a game to 'ease up' their delivery to the opposing pitcher. As a result, AL pitchers are likely to 'want' or 'need' to pitch inside to more batters during the course of a game, thereby increasing the chances of these batters being hit by a pitch.

Through an analysis of average HBP per season and per team, both before and after the introduction of the DH, Freeman concludes that there is no statistical significance (at the .001 or .005 levels) to the differences in hit batsmen across the two leagues once you adjust the averages for the fact that in the AL approximately 12.5% more true hitters come to the plate in the DH era.

What this analysis lacks, as admitted by Freeman himself, is a more granular accounting for the differences in the number of "true" hitters, and instead relies on a quick and dirty approximation. Using Retrosheet data, we can address that weakness in the study.

The following table shows the percentage of plate appearances consumed by each fielding position, along with the HBP per 1,000 plate appearances for both the AL and NL in the period 1973-1993.


         <-----AL----->    <-----NL----->
                  HBP /             HBP /
POS      PAPct    1000     PAPct    1000
------------------------------------------
P         0.0%     0.0      6.8%     2.2
C        10.1%     6.3     10.4%     4.9
1B       11.1%     4.9     11.3%     4.8
2B       10.9%     5.0     11.2%     4.7
3B       10.8%     5.3     11.1%     5.7
SS       10.3%     4.8     10.8%     3.7
LF       11.2%     6.2     11.4%     5.0
CF       11.4%     5.2     11.5%     4.7
RF       11.0%     5.6     11.3%     4.3
DH       11.2%     6.1        -       -
PH        2.0%     4.9      4.2%     3.7

TOTAL              5.5               4.5


In total, AL hitters were hit at a rate 20.8% higher than NL hitters.

As you can see, in the AL designated hitters consumed 11.2% of the plate appearances, and were hit at a rate of 6.1 times per 1,000 PA. Both totals are among the highest for AL hitters. So, while the DH might be the equivalent of someone else taking Wilt's free throws, the price the DH pays is some additional pain.

On the other side of the fence, NL pitchers consumed just 6.8% of the plate appearances, and were hit just 2.2 times per 1,000 PA. Interestingly, although the percentage of plate appearances for AL pitchers is rounded to 0%, they actually came to the plate 79 times, mostly as the result of games where the AL team lost their DH as a result of the DH assuming a defensive position per rule 6.10.

So, rather than seeing Freeman's 12.5% more "true" hitters in the AL, in actuality AL pitchers see around 7% more true hitters when you subtract the pitchers from the NL totals. However, Freeman also noted that pinch-hitters are often used for pitchers in the NL, and this is borne out by the fact that pinch-hitters came to the plate more than twice as often in the NL (4.2%) as in the AL (2.0%). Freeman also speculated that pinch-hitters are not as likely to get hit since they are often weaker hitters than players in the regular lineup (it should be noted that, as reported in The Book, there is also a "pinch-hitting penalty" that drags down performance). The lesser rate of hit batsmen for pinch-hitters is verified by the data. So, assuming that the NL rate of pinch-hitting was the same as the AL rate, and throwing the remainder of the NL pinch-hitters into the bucket of poor hitters with the pitchers, we can estimate that AL pitchers see approximately 9% more true hitters than NL pitchers do.
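The two estimates in this paragraph can be approximated from the plate appearance shares in the table. The sketch below is one reading of the arithmetic, not a reconstruction of the author's exact method. In particular, treating the 9% figure as the share of NL plate appearances going to poor hitters is an assumption.

```python
# Shares of plate appearances for 1973-1993, taken from the table above.
nl_pitcher_share = 0.068   # NL pitchers
nl_ph_share = 0.042        # NL pinch-hitters
al_ph_share = 0.020        # AL pinch-hitters

# Reading 1: remove pitchers from the NL totals. An AL pitcher then faces
# true hitters in 100% of PA, an NL pitcher in (1 - 0.068) of PA.
extra_true_hitters = 1 / (1 - nl_pitcher_share) - 1

# Reading 2 (assumption): also count the NL pinch-hitting above the AL
# level as poor hitters, alongside the pitchers.
poor_hitter_share = nl_pitcher_share + (nl_ph_share - al_ph_share)

print(f"Extra true hitters seen in the AL: {extra_true_hitters:.1%}")
print(f"NL PA going to poor hitters:       {poor_hitter_share:.1%}")
```

The first figure comes out near 7% and the second at 9.0%.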

The difference between Freeman's estimate and the actual numbers lies in the fact that the vast majority of pitchers hit ninth, Dontrelle Willis being the most recent occasional exception. Hitting from the last slot in the order, pitchers therefore come to the plate less frequently than position players.

To adjust for that, given the data in the above table, we can now make an estimate for the true differences in hit batsmen by controlling for pitcher plate appearances. One simple way to do this is to estimate what would happen if all pitcher and pinch-hitter plate appearances in the NL were consumed by a true hitter whose rate of getting hit was relatively as high as a designated hitter's in the AL. This means that 11% of the NL plate appearances (6.8% + 4.2%) will be assigned a new HBP rate based on the difference between a DH and the rest of the positions in the AL. To do so we'll first calculate the ratio of the DH rate (6.1) to the non-DH rate (5.4) as 1.13. If we assume that true hitters in the NL consuming those plate appearances would have been hit 13% more often than the NL hitters who were neither pitchers nor pinch-hitters (which works out to 5.4 HBP per 1,000 PA), then the average for the NL would rise by about 0.3, from 4.5 hit batsmen per 1,000 PA to 4.8. As a result, instead of a 20.8% advantage for the AL during the period, the true advantage is around 13.6%.
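The adjustment described above can be sketched in Python using the rounded values from the table. The variable names are illustrative only. Because the inputs are rounded, the sketch reproduces the 1.13 ratio and the 4.8 adjusted NL rate but lands at roughly 14% for the adjusted advantage rather than exactly 13.6%, which was presumably computed from unrounded Retrosheet totals.

```python
# (share of PA in percent, HBP per 1,000 PA) for 1973-1993, from the table.
AL = {"P": (0.0, 0.0), "C": (10.1, 6.3), "1B": (11.1, 4.9), "2B": (10.9, 5.0),
      "3B": (10.8, 5.3), "SS": (10.3, 4.8), "LF": (11.2, 6.2), "CF": (11.4, 5.2),
      "RF": (11.0, 5.6), "DH": (11.2, 6.1), "PH": (2.0, 4.9)}
NL = {"P": (6.8, 2.2), "C": (10.4, 4.9), "1B": (11.3, 4.8), "2B": (11.2, 4.7),
      "3B": (11.1, 5.7), "SS": (10.8, 3.7), "LF": (11.4, 5.0), "CF": (11.5, 4.7),
      "RF": (11.3, 4.3), "PH": (4.2, 3.7)}


def weighted_rate(table, positions):
    """PA-weighted HBP rate per 1,000 PA over the given positions."""
    pa = sum(table[p][0] for p in positions)
    hbp = sum(table[p][0] * table[p][1] for p in positions)
    return hbp / pa


al_total = weighted_rate(AL, AL)
nl_total = weighted_rate(NL, NL)

# Step 1: how much more often is a DH hit than the rest of the AL lineup?
al_non_dh = weighted_rate(AL, [p for p in AL if p != "DH"])
dh_ratio = AL["DH"][1] / al_non_dh

# Step 2: give NL pitcher and pinch-hitter PA the rate of a "true hitter",
# scaled up from the NL regulars by the DH ratio.
nl_regulars = [p for p in NL if p not in ("P", "PH")]
regular_share = sum(NL[p][0] for p in nl_regulars)
replaced_share = NL["P"][0] + NL["PH"][0]
regular_rate = weighted_rate(NL, nl_regulars)
true_hitter_rate = regular_rate * dh_ratio

nl_adjusted = (regular_share * regular_rate
               + replaced_share * true_hitter_rate) / (regular_share + replaced_share)
adjusted_advantage = al_total / nl_adjusted - 1

print(f"DH ratio:           {dh_ratio:.2f}")
print(f"NL rate, adjusted:  {nl_adjusted:.1f}")
print(f"AL advantage after adjustment: {adjusted_advantage:.1%}")
```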

So while accounting for a different lineup composition in the AL helps level the playing field, it obviously doesn't account for the entire difference, as Freeman concluded. We're still left with around two-thirds of our original difference between the leagues. Does that mean we're left with the moral hazard theory to explain the remaining difference?

Readers familiar with this subject will note that this cursory analysis lines up nicely with the fine work done by J.C. Bradbury and Douglas Drinen in a paper titled "Identifying Moral Hazard: A Natural Experiment in Major League Baseball" (warning: .pdf). In that paper, using data from 1989-1992 compared against 1969 plus 1972-1974, the authors conclude that:

"Controlling for variables that proxy batter quality, pitcher quality, retaliation, and game situation we find that the DH rule increases the likelihood that any batter will be hit during a plate appearance between 11 and 17 percent. This explains approximately 60 to 80 percent of the differential in the hit batsmen rate between leagues."

But there are also two additional theories to consider.

If you look back at the previous articles in this series you'll notice that the rate of hit batsmen in the AL actually surpasses that of the NL prior to the introduction of the DH. In fact, beginning in 1967, the rate of AL hit batsmen to NL went as follows:


1967 11.5%
1968 18.1%
1969 -1.5%
1970 10.1%
1971 8.3%
1972 11.0%


During this six-year period the differences in the AL rate with the pitcher hitting were not much different than those immediately after the introduction of the DH. What this indicates is that hit batsmen were already more frequent in the junior circuit. Perhaps some of this remaining difference lies elsewhere.
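To put a number on "not much different," one can compare the mean difference for 1967-1972 with the mean for the first seasons under the DH. Using the first six DH seasons (1973-1978) as the comparison window is an assumption made for this sketch, and both lists are copied from the tables in this column.

```python
from statistics import mean

# AL-over-NL difference in HBP rate (percent), from the tables in this column.
pre_dh = [11.5, 18.1, -1.5, 10.1, 8.3, 11.0]    # 1967-1972
early_dh = [9.3, 14.0, 7.8, 17.3, 17.2, 12.5]   # 1973-1978 (assumed window)

pre_dh_mean = mean(pre_dh)
early_dh_mean = mean(early_dh)

print(f"1967-1972 mean: {pre_dh_mean:.1f}%")
print(f"1973-1978 mean: {early_dh_mean:.1f}%")
```

On these rounded figures the AL already led by a bit under 10% on average before the DH, against about 13% in the six seasons after it.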

As mentioned last week, one of the factors that may influence hit batsmen is the definition (both written and as interpreted) of the strike zone. There is of course anecdotal evidence that the strike zone varied in the two leagues primarily as the result of AL umpires using the old-style "balloon" chest protector that forced them to stand more upright and therefore call more high strikes. And although by around 1983 AL umpires were also using the inside chest protector popularized by Bill Klem, they may have retained their traditional strike zone for some years. But still, outside of concocting what Stephen Jay Gould would call a "just-so story," there is no clear connection between high strikes and hit batsmen. A related hypothesis might be that the AL, being known as more of a curveball league, induced more hit batsmen since curveballs are inherently more difficult to control than fastballs. But both of these theories are difficult to quantify.

A more straightforward idea is that one or two individuals skewed the numbers for this time period, accounting for the remaining difference between the leagues. This follows the dictum that when what you're measuring has inherently low frequencies, you should always be aware of a small number of samples having a large influence on the data.

As most readers have already guessed, when you're talking about hitters and HBPs during this period, Don Baylor and Chet Lemon are two players who immediately spring to mind. Both played their entire careers in the AL, with Baylor suiting up for the Orioles, A's, Angels, Yankees, Red Sox, and Twins from 1970-88, and Lemon for the White Sox and Tigers from 1975-90. Baylor was hit 257 times in 8,888 plate appearances (defined simply as at-bats plus walks plus HBP for this analysis) from 1973 through 1988, for an astounding rate of 28.9 per 1,000 PA--tops during the period and ranking him 15th for players since 1901. Lemon was hit 151 times in 7,768 PA for a rate of 19.4. If these two players' rates are adjusted down to the average for the period, the overall rate for the AL drops from 5.5 to 5.3 and therefore accounts for about 4 percentage points of the remaining difference.
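The per-1,000 rates quoted for Baylor and Lemon follow directly from the counts given above. A minimal sketch, with a hypothetical helper name:

```python
def hbp_per_1000(hbp, pa):
    """Hit-by-pitch rate per 1,000 plate appearances."""
    return 1000 * hbp / pa


# Counts for 1973-1988 (Baylor) and 1975-1990 (Lemon), as given in the text.
baylor_rate = hbp_per_1000(257, 8888)
lemon_rate = hbp_per_1000(151, 7768)

print(f"Baylor: {baylor_rate:.1f} per 1,000 PA")
print(f"Lemon:  {lemon_rate:.1f} per 1,000 PA")
```

Against a league rate of 5.5, Baylor was hit at more than five times the typical rate. That is why two players can move a league-wide number.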

In summary then, from an initial difference of nearly 21% in the rate of hit batsmen between the two leagues in the 1973-1993 period, just over 7 percentage points can be accounted for by the presence of more true hitters in the lineup and another 4 by two hitters who were exceptionally "gifted" at getting plunked. This still leaves ample room for the moral hazard theory, a theory that incorporates differences in the two leagues relating to strike zone or styles of play, or a combination of all of the above to operate.

Evening the Score
The second question introduced above is related to the disappearance of the difference in rate of hit batsmen between the two leagues, beginning in 1994. Since that time, the National League has actually topped the American League in five of the twelve years, as shown in the previous table.

What can account for this dramatic shrinking of differences between the two leagues?

First, let's take a look at the same table for the years 1994-2005 as we did for the preceding years.


         <-----AL----->    <-----NL----->
                  HBP /             HBP /
POS      PAPct    1000     PAPct    1000
------------------------------------------
P         0.4%     1.4      5.9%     3.3
C        10.1%    11.1     10.5%    12.8
1B       11.1%    10.7     11.2%    10.5
2B       11.0%    10.6     11.5%    13.1
3B       10.8%     9.3     11.1%     9.8
SS       10.9%    10.3     11.1%     8.5
LF       11.2%     8.9     11.3%    10.5
CF       11.3%     8.5     11.5%    10.0
RF       11.0%     9.8     11.3%    10.7
DH       10.4%    10.2      0.5%    12.2
PH        1.6%     8.4      4.2%    10.0

TOTAL              9.9              10.3


What you'll notice is that the NL has outpaced the AL overall since 1994, despite leading in a minority of those seasons. This data set now includes interleague games, so designated hitters appear in the NL column and pitchers in the AL column. The rate of hit batsmen for NL DHs (12.2) is even higher than the overall NL rate, while the rate for AL pitchers (1.4) is lower than that for NL pitchers (3.3). Of course, both leagues also saw massive increases in their overall rates.

In a follow-up paper (another .pdf) also published in 2004, Bradbury and Drinen conclude that during the entire history of the DH, batters were about 8% more likely to be hit in games where the DH was played, accounting for around half of the difference between the leagues. However, when looking only at 1994-2005 data and breaking down the data into games played with the DH and those without, we find the following:


         <-----DH----->    <---NO DH---->
                  HBP /             HBP /
POS      PAPct    1000     PAPct    1000
------------------------------------------
P         0.0%     0.0      6.2%     2.8
C        10.1%    10.0     10.5%    11.0
1B       11.1%     9.6     11.2%     9.0
2B       11.0%     9.6     11.5%    11.2
3B       10.8%     8.4     11.1%     8.4
SS       10.8%     9.1     11.1%     7.5
LF       11.2%     8.1     11.3%     9.0
CF       11.3%     7.8     11.6%     8.5
RF       11.0%     8.8     11.3%     9.2
DH       11.2%     9.0        -       -
PH        1.4%     7.9      4.3%     8.6

TOTAL              8.9               8.8


Here there is only a 1% overall difference. If one were to "correct" the data to account for lineup composition, as we did with the 1973-1993 data, one would find that games in which the DH was not in force produced 8.1% more hit batsmen per 1,000 plate appearances than games with the DH. Truly, this is a large shift, for which we can offer three possible explanations.

First, as with the 1973-1993 data, we may be seeing the influence of one or several extreme players. It just so happens that during this period the NL has been blessed with a trio of the most-frequently hit batters in the history of baseball in Jason Kendall (except 2005), Craig Biggio, and Fernando Vina (except 1995-1997 and 2004). A clue to their contribution can be seen in the previous table, where the rates for second basemen and catchers are conspicuously high in the NL. Overall, their rates during that time…


             PA    HBP   HBP/1000
----------------------------------
Kendall    5908    197     33.3
Vina       4633    154     33.2
Biggio     7930    245     30.9


Don Baylor has nothing on these guys.
If we adjust these three players' rates down to the league average for the period it drops the overall NL rate 4.3%, down to 9.8, just under the AL rate. Even so, this doesn't fully account for the fact that, given the lineup composition theory, we should see even fewer hit batsmen in the NL.
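To see what adjusting these players down to the league average involves, the sketch below counts how many of their HBP exceed what an average NL hitter would have accumulated in the same plate appearances. It uses the rounded NL rate of 10.3 from the 1994-2005 table. Turning the result into the 4.3% league-wide drop would also require total NL plate appearances, which are not given here.

```python
NL_RATE = 10.3  # NL HBP per 1,000 PA, 1994-2005, from the table above

# (plate appearances, hit by pitch) during their NL seasons in the period.
players = {
    "Kendall": (5908, 197),
    "Vina": (4633, 154),
    "Biggio": (7930, 245),
}

# HBP above what a league-average hitter would collect in the same PA.
excess = {
    name: hbp - pa * NL_RATE / 1000
    for name, (pa, hbp) in players.items()
}
total_excess = sum(excess.values())

for name, extra in excess.items():
    print(f"{name}: {extra:.0f} HBP above league average")
print(f"Total: {total_excess:.0f}")
```

By this rough count the three account for around 400 hit batsmen beyond the league norm.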

A second theory, and one proposed by Bradbury and Drinen in their follow-up paper, targeted the expansions of 1993 and 1998 as possible factors. Although discussed in the first article in this series, this theory does accurately predict a larger increase in HBP in the NL than in the AL in 1993-1994 because of the asymmetrical nature of the expansion draft. In 1993, the HBP rate rose 7.3% in the AL and 21.6% in the NL, and for 1994 the changes were -6.0% and 11.6%. In the years following 1994 the rate increases evened out. But even so, one wouldn't think that NL pitchers would go on hitting more batters even after the effects of expansion were absorbed, as they did in 1997 and 2000.

The final theory, and one also proposed by Bradbury and Drinen, is that the implementation of the "double-warning rule" (8.02(d)) in the winter of 1993 had an immediate impact. Essentially, this rule raised the costs for teams hitting opposing batters, and placed that cost squarely on the pitcher and manager, both of whom can be immediately ejected from the game. One result is that AL pitchers now have a greater fear of hitting batters in retaliation lest they be ejected, thereby lowering their rate of hit batsmen. At the same time, it could be argued (as Bradbury and Drinen do) that NL pitchers have less fear of retaliation under the double-warning rule, since they know that the opposing team dare not hit them or their teammates or suffer the cost. The combination of more fear by AL pitchers and less fear by NL pitchers could together be responsible for essentially erasing the gap between the leagues.

Take Your Base
One of the reasons so many of us love baseball is that while it is seemingly simple, it is also a very human activity with naturally endless complexity. In this series of articles, I hope that we've highlighted some of that complexity in a statistically small but interesting part of the game. But while I for one love big-picture analysis, there's nothing more exciting than getting caught up in the one-on-one confrontations between pitcher and batter that are really the source of our ruminations.