
Saturday, January 26, 2008

The Moral Hazards of the Hit Batsmen

This is the final in a series of three columns I wrote for BP on the topic of hit batsmen. You can find the other two on this blog as well. It appeared on May 18, 2006.



Schrodinger's Bat: The Moral Hazards of the Hit Batsmen
by Dan Fox

"The designated hitter rule is like letting someone else take Wilt Chamberlain's free throws."

--Rick Wise (1974)

Over the previous two weeks, we've been looking at historical hit-by-pitch rates and their trends, and investigating a variety of theories that attempt to explain the fluctuation of those rates. Those theories invoke factors such as aluminum bats at the amateur level, changes in the strike zone, the increase in body armor, intimidation, retaliation, and even the win expectancy of the hit batsman. While individual theories may lack explanatory power for a specific period of time, taken together they do provide insight into the sometimes opposing forces that underlie trends in baseball's complex competitive environment.

There is one trend, however, that we failed to discuss. So this week we’ll take a look at the difference in league rates of hit batsmen since the introduction of the designated hitter in 1973. This topic has been taken up before, so we’ll start by covering some of the old ground, and then hopefully add something new to the discussion.

Setting a Baseline
Before we discuss what the impact of the DH on HBP rates might be, let’s lay out the raw facts that have inspired so much conjecture. The following graph shows the percentage of AL hit batsmen per 1,000 plate appearances as opposed to the NL since the DH was adopted in the American League in 1973. The shaded line is a three-year moving average.



What this shows is that from 1973 until the mid 1990s the rate of hit batsmen in the AL was anywhere between 3% and 30% higher than in the NL. While that’s a wide range, more typical values are between 10% and 20%, with the average during the period being 17%:


Year   AL vs NL    Year   AL vs NL
1973      9.3%     1990     19.0%
1974     14.0%     1991     18.6%
1975      7.8%     1992     20.0%
1976     17.3%     1993      9.4%
1977     17.2%     1994     -7.6%
1978     12.5%     1995     -6.7%
1979      8.4%     1996      2.5%
1980     24.2%     1997    -15.9%
1981     22.8%     1998      4.4%
1982      3.2%     1999     10.2%
1983     19.1%     2000    -17.6%
1984     29.7%     2001      7.0%
1985     21.5%     2002      7.4%
1986     26.5%     2003      3.4%
1987     16.7%     2004      7.7%
1988     20.9%     2005     -5.6%
1989     22.9%


Around 1994, things began to change, and in the following dozen years HBP rates in the NL actually surpassed those in the AL five times, including in 2005, when batters were hit at a rate of 9.52 per 1,000 PA in the AL against 10.05 in the NL.


So there are really two questions we can ask about this trend. First, what accounts for the difference in rates of hit batsmen during the twenty-year period following the introduction of the DH (1973-1993)? And second, what caused those differences to shrink in the period after 1993?

A Moral Hazard or More Opportunity?
As mentioned in the introduction, the topic of league differences in HBP rates has been researched in the past. Most recently, Lee A. Freeman wrote an excellent article titled "The Effect of the Designated Hitter Rule on Hit Batsmen" in Volume 33 of The Baseball Research Journal. In it, Freeman provided a short synopsis of the previous work, citing articles in the journal Economic Inquiry in 1997 and 1998, as well as a follow-up in a 2004 issue of the Journal of Sports Economics.

Prior to Freeman’s paper the two theories that had been bandied about to explain the difference (at least from 1973 until the mid 1990s) were the "moral hazard theory" and the "lineup composition theory." The former theory argues that because American League pitchers needn’t fear retaliation with the presence of the DH, they are more apt to hit opposing batters since they don’t bear the costs of their actions directly. The latter theory also argues from a cost-benefit basis, although differently--AL pitchers hit more batters because the cost in terms of run scoring when hitting a DH is so much less than hitting a pitcher. This follows from the fact that the designated hitter is much more likely to be an offensive producer than your typical weak-hitting full-time hurler.

As a variation of the lineup composition theory, Freeman contended that more hit batsmen in the AL can be explained largely (but not totally, as he rightly cautions against single-theory explanations) simply by more "true" hitters coming to bat in the AL. In his words:


American League pitchers are not given the opportunity during a game to 'ease up' their delivery to the opposing pitcher. As a result, AL pitchers are likely to 'want' or 'need' to pitch inside to more batters during the course of a game, thereby increasing the chances of these batters being hit by a pitch.

Through an analysis of average HBP per season and per team, both before and after the introduction of the DH, Freeman concludes that there is no statistical significance (at the .001 or .005 levels) to the differences in hit batsmen across the two leagues once you adjust the averages for the fact that in the AL approximately 12.5% more true hitters come to the plate in the DH era.

What this analysis lacks, as admitted by Freeman himself, is a more granular accounting for the differences in the number of "true" hitters, and instead relies on a quick and dirty approximation. Using Retrosheet data, we can address that weakness in the study.

The following table shows the percentage of plate appearances consumed by each fielding position, along with the HBP per 1,000 plate appearances for both the AL and NL in the period 1973-1993.


<----AL----> <----NL----->
HBP / HBP /
POS PAPct 1000 PAPct 1000
------------------------------------
P 0.0% 0.0 6.8% 2.2
C 10.1% 6.3 10.4% 4.9
1B 11.1% 4.9 11.3% 4.8
2B 10.9% 5.0 11.2% 4.7
3B 10.8% 5.3 11.1% 5.7
SS 10.3% 4.8 10.8% 3.7
LF 11.2% 6.2 11.4% 5.0
CF 11.4% 5.2 11.5% 4.7
RF 11.0% 5.6 11.3% 4.3
DH 11.2% 6.1 - -
PH 2.0% 4.9 4.2% 3.7

TOTAL 5.5 4.5


In total, AL hitters were hit at a rate 20.8% higher than NL hitters.

As you can see, in the AL designated hitters consumed 11.2% of the plate appearances, and were hit at a rate of 6.1 times per 1,000 PA. Both totals are among the highest for AL hitters. So, while the DH might be the equivalent of someone else taking Wilt's free throws, the price the DH pays is some additional pain.

On the other side of the fence, NL pitchers consumed just 6.8% of the plate appearances, and were hit just 2.2 times per 1,000 PA. Interestingly, although the percentage of plate appearances for AL pitchers rounds to 0%, they actually came to the plate 79 times, mostly in games where the AL team lost its DH because the DH had assumed a defensive position, per rule 6.10.

So, rather than Freeman's 12.5% more "true" hitters in the AL, AL pitchers actually face around 7% more true hitters once you subtract the pitchers from the NL totals. However, Freeman also noted that pinch-hitters are often used for pitchers in the NL, and this is borne out by the fact that pinch-hitters came to the plate more than twice as often in the NL (4.2%) as in the AL (2.0%). Freeman also speculated that pinch-hitters are less likely to get hit, since they are often weaker hitters than players in the regular lineup (it should be noted that, as reported in The Book, there is also a "pinch-hitting penalty" that drags down performance), and the lower rate of hit batsmen for pinch-hitters is verified by the data. So, assuming that the NL rate of pinch-hitting was the same as the AL rate, and throwing the remainder of the NL pinch-hitters into the bucket of poor hitters with the pitchers, we can estimate that AL pitchers face approximately 9% more true hitters than NL pitchers do.

The difference between Freeman's estimate and the actual numbers lies in the fact that the vast majority of pitchers hit ninth, Dontrelle Willis being the most recent occasional exception. Hitting from the last slot in the order, pitchers therefore come to the plate less frequently than position players.

To adjust for that, given the data in the above table, we can now estimate the true difference in hit batsmen by controlling for pitcher plate appearances. One simple way to do this is to estimate what would happen if all pitcher and pinch-hitter plate appearances in the NL were instead consumed by a true hitter who got hit at a rate relatively as high as a designated hitter's in the AL. This means that 11% of the NL plate appearances (6.8% + 4.2%) will be assigned a new HBP rate based on the difference between a DH and the rest of the positions in the AL. To do so, we first calculate the ratio of the DH rate (6.1) to the non-DH rate (5.4) as 1.13. If we assume that the true hitters in the NL consuming those plate appearances would have been hit 13% more often than the NL's non-pitchers and non-pinch-hitters (which works out to 5.4 HBP per 1,000 PA), then the NL average jumps by about 0.3, from 4.5 hit batsmen per 1,000 PA to roughly 4.8. As a result, instead of a 20.8% advantage for the AL during the period, the true advantage is around 13.6%.
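For those who want to experiment with the arithmetic, here is a minimal sketch in Python of the adjustment just described. The position-level inputs are simply read off the 1973-1993 NL table above (so they carry its rounding), and the calculation swaps the pitcher and pinch-hitter plate appearances for a hypothetical DH-like hitter:

# Lineup-composition adjustment sketch; inputs are the rounded
# 1973-1993 NL figures from the table above.
nl = {
    "P":  (0.068, 2.2), "C":  (0.104, 4.9), "1B": (0.113, 4.8),
    "2B": (0.112, 4.7), "3B": (0.111, 5.7), "SS": (0.108, 3.7),
    "LF": (0.114, 5.0), "CF": (0.115, 4.7), "RF": (0.113, 4.3),
    "PH": (0.042, 3.7),
}  # position: (share of PA, HBP per 1,000 PA)
dh_ratio = 6.1 / 5.4  # AL DH rate relative to AL non-DH rate, ~1.13
# Average NL rate for everyone except pitchers and pinch-hitters
reg = {pos: v for pos, v in nl.items() if pos not in ("P", "PH")}
reg_pa = sum(pa for pa, _ in reg.values())
reg_rate = sum(pa * r for pa, r in reg.values()) / reg_pa
# Hand the P and PH plate appearances to a hypothetical DH-like hitter
true_rate = reg_rate * dh_ratio  # ~5.4 per 1,000 PA
adjusted = reg_rate * reg_pa + true_rate * (nl["P"][0] + nl["PH"][0])
print(round(sum(pa * r for pa, r in nl.values()), 2))  # unadjusted: ~4.5
print(round(adjusted, 2))                              # adjusted:   ~4.8
print(round((5.5 / adjusted - 1) * 100, 1))            # AL edge:    ~14%

Because the table's inputs are rounded, the sketch lands at about 14% rather than the 13.6% derived from the unrounded data.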

So while accounting for a different lineup composition in the AL helps level the playing field, it obviously doesn't account for the entire difference, as Freeman concluded. We're still left with around two-thirds of our original difference between the leagues. Does that mean we're left with the moral hazard theory to explain the remaining difference?

Readers familiar with this subject will note that this cursory analysis lines up nicely with the fine work done by J.C. Bradbury and Douglas Drinen in a paper titled "Identifying Moral Hazard: A Natural Experiment in Major League Baseball" (warning: .pdf). In that paper, using data from 1989-1992 compared against 1969 plus 1972-1974, the authors conclude that:

"Controlling for variables that proxy batter quality, pitcher quality, retaliation, and game situation we find that the DH rule increases the likelihood that any batter will be hit during a plate appearance between 11 and 17 percent. This explains approximately 60 to 80 percent of the differential in the hit batsmen rate between leagues."

But there are also two additional theories to consider.

If you look back at the previous articles in this series, you'll notice that the rate of hit batsmen in the AL surpassed that of the NL even before the introduction of the DH. In fact, beginning in 1967, the AL rate relative to the NL ran as follows:


1967 11.5%
1968 18.1%
1969 -1.5%
1970 10.1%
1971 8.3%
1972 11.0%


During this six-year period, the differences in the AL rate with the pitcher hitting were not much different from those immediately after the introduction of the DH. What this indicates is that hit batsmen were already more frequent in the junior circuit, so perhaps some of the remaining difference lies elsewhere.

As mentioned last week, one of the factors that may influence hit batsmen is the definition (both written and as interpreted) of the strike zone. There is of course anecdotal evidence that the strike zone varied in the two leagues primarily as the result of AL umpires using the old-style "balloon" chest protector that forced them to stand more upright and therefore call more high strikes. And although by around 1983 AL umpires were also using the inside chest protector popularized by Bill Klem, they may have retained their traditional strike zone for some years. But still, outside of concocting what Stephen Jay Gould would call a "just-so story," there is no clear connection between high strikes and hit batsmen. A related hypothesis might be that the AL, being known as more of a curveball league, induced more hit batsmen since curveballs are inherently more difficult to control than fastballs. But both of these theories are difficult to quantify.

A more straightforward idea is that one or two individuals skewed the numbers for this time period, accounting for the remaining difference between the leagues. This follows the dictum that when what you're measuring has inherently low frequencies, you should always be aware of a small number of samples having a large influence on the data.

As most readers have already guessed, when you're talking about hitters and HBPs during this period, Don Baylor and Chet Lemon are two players who immediately spring to mind. Both played their entire careers in the AL, with Baylor suiting up for the Orioles, A's, Angels, Yankees, Red Sox, and Twins from 1970-88, and Lemon for the White Sox and Tigers from 1975-90. Baylor was hit 257 times in 8,888 plate appearances (defined simply as at-bats plus walks plus HBP for this analysis) from 1973 through 1988, for an astounding rate of 28.9 per 1,000 PA--tops during the period, and ranking him 15th among players since 1901. Lemon was hit 151 times in 7,768 PA for a rate of 19.4. If these two players' rates are adjusted down to the average for the period, the overall rate for the AL drops from 5.5 to 5.3, which accounts for about 4% of the remaining difference.
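As a rough illustration of that adjustment, here is a short sketch. The league plate-appearance total below is an assumption on my part (roughly twenty-one AL seasons' worth), not an exact Retrosheet figure, so treat the output as illustrative only:

# Outlier adjustment sketch: regress Baylor and Lemon to the league rate.
league_rate = 5.5      # AL HBP per 1,000 PA, 1973-1993 (from the table)
league_pa = 1_700_000  # assumed league PA total for the period, not exact
outliers = {"Baylor": (8888, 257), "Lemon": (7768, 151)}  # (PA, HBP)
# HBP above what each player would have collected at the league rate
excess = sum(hbp - pa * league_rate / 1000 for pa, hbp in outliers.values())
adjusted = (league_pa * league_rate / 1000 - excess) / league_pa * 1000
print(round(adjusted, 1))  # ~5.3 per 1,000 PA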

In summary, then: from an initial difference of nearly 21% in the rate of hit batsmen between the two leagues in the 1973-1993 period, just over 7% can be accounted for by the presence of more true hitters in the lineup, and another 4% by two hitters who were exceptionally "gifted" at getting plunked. This still leaves ample room for the moral hazard theory, for theories built on differences between the two leagues in strike zone or style of play, or for some combination of all of the above.

Evening the Score
The second question introduced above is related to the disappearance of the difference in rate of hit batsmen between the two leagues, beginning in 1994. Since that time, the National League has actually topped the American League in five of the twelve years, as shown in the previous table.

What can account for this dramatic shrinking of differences between the two leagues?

First, let's take a look at the same table for the years 1994-2005 as we did for the preceding years.


<----AL----> <----NL----->
HBP / HBP /
POS PAPct 1000 PAPct 1000
------------------------------------
P 0.4% 1.4 5.9% 3.3
C 10.1% 11.1 10.5% 12.8
1B 11.1% 10.7 11.2% 10.5
2B 11.0% 10.6 11.5% 13.1
3B 10.8% 9.3 11.1% 9.8
SS 10.9% 10.3 11.1% 8.5
LF 11.2% 8.9 11.3% 10.5
CF 11.3% 8.5 11.5% 10.0
RF 11.0% 9.8 11.3% 10.7
DH 10.4% 10.2 0.5% 12.2
PH 1.6% 8.4 4.2% 10.0

TOTAL 9.9 10.3


What you'll notice is that the NL has outpaced the AL overall since 1994, despite leading in only a minority of those seasons. This data set now includes interleague games, so a DH is listed in the NL column and pitchers in the AL column, with the rate of hit batsmen for NL DHs even higher than that for their AL counterparts, and the rate for AL pitchers lower than that for NL pitchers. Both leagues' massive increases in overall rates are reflected here as well.

In a follow-up paper (another .pdf), also published in 2004, Bradbury and Drinen conclude that over the entire history of the DH, batters were about 8% more likely to be hit in games where the DH was in use, accounting for around half of the difference between the leagues. However, when we look only at the 1994-2005 data and break it down into games played with the DH and those without, we find the following:


<----DH----> <---NO DH---->
HBP / HBP /
POS PAPct 1000 PAPct 1000
-------------------------------------
P 0.0% 0.0 6.2% 2.8
C 10.1% 10.0 10.5% 11.0
1B 11.1% 9.6 11.2% 9.0
2B 11.0% 9.6 11.5% 11.2
3B 10.8% 8.4 11.1% 8.4
SS 10.8% 9.1 11.1% 7.5
LF 11.2% 8.1 11.3% 9.0
CF 11.3% 7.8 11.6% 8.5
RF 11.0% 8.8 11.3% 9.2
DH 11.2% 9.0 - -
PH 1.4% 7.9 4.3% 8.6

TOTAL 8.9 8.8


Here there is only a 1% overall difference. And if one were to "correct" the data to account for lineup composition, as we did with the 1973-1993 data, you would find that games in which the DH was not in force actually produced 8.1% more hit batsmen per 1,000 plate appearances than games with the DH. Truly, this is a large shift, for which we can offer three possible explanations.

First, as with the 1973-1993 data, we may be seeing the influence of one or several extreme players. It just so happens that during this period the NL has been blessed with a trio of the most frequently hit batters in the history of baseball in Jason Kendall (except 2005), Craig Biggio, and Fernando Vina (except 1995-1997 and 2004). A clue to their contribution can be seen in the previous table, where the rates for second basemen and catchers are conspicuously high in the NL. Overall, their rates during that time were as follows:


PA HBP HBP/1000
---------------------------
Kendall 5908 197 33.3
Vina 4633 154 33.2
Biggio 7930 245 30.9


Don Baylor has nothing on these guys.
If we adjust these three players' rates down to the league average for the period, the overall NL rate drops 4.3%, to 9.8, just under the AL rate. Even so, this doesn't fully account for the fact that, given the lineup composition theory, we should see even fewer hit batsmen in the NL.

A second theory, proposed by Bradbury and Drinen in their follow-up paper, targets the expansions of 1993 and 1998 as possible factors. Although discussed in the first article in this series, this theory does accurately predict a larger increase in HBP in the NL than in the AL in 1993-1994, because of the asymmetrical nature of the expansion draft. In 1993, the HBP rate rose 7.3% in the AL and 21.6% in the NL; in 1994 the changes were -6.0% and 11.6%. In the years following 1994 the rate increases evened out. But even so, one wouldn't expect NL pitchers to go on hitting more batters after the effects of expansion had been absorbed, as they did in 1997 and 2000.

The final theory, also proposed by Bradbury and Drinen, is that the implementation of the "double-warning rule" (8.02(d)) in the winter of 1993 had an immediate impact. Essentially, this rule raised the costs for teams hitting opposing batters, and placed that cost squarely on the pitcher and manager, both of whom can be immediately ejected from the game. One result is that AL pitchers now have a greater fear of hitting batters in retaliation lest they be ejected, thereby lowering their rate of hit batsmen. At the same time, it could be argued (as Bradbury and Drinen do) that NL pitchers have less fear of retaliation under the double-warning rule, since they know that the opposing team dare not hit them or their teammates without suffering the cost. The combination of more fear among AL pitchers and less fear among NL pitchers could together be responsible for essentially erasing the gap between the leagues.

Take Your Base
One of the reasons so many of us love baseball is that while it is seemingly simple, it is also a very human activity with naturally endless complexity. In this series of articles, I hope that we've highlighted some of that complexity in a statistically small but interesting part of the game. But while I for one love big-picture analysis, there's nothing more exciting than getting caught up in the one-on-one confrontations between pitcher and batter that are really the source of our ruminations.

Saturday, January 12, 2008

Strike Zones, Trilobites, and a Vicious Cycle

Last week I ran the first in a series of three columns I wrote on hit batsmen. Today it's time for the second in the series, originally published in May 2006. Enjoy.




May 11, 2006
Schrodinger's Bat: Strike Zones, Trilobites, and a Vicious Cycle
by Dan Fox

"If they knocked two of our guys down, I'd get four. You have to protect your hitters."
--Don Drysdale

"I hated to bat against Drysdale. After he hit you he'd come around, look at the bruise on your arm and say, 'Do you want me to sign it?'"
-- Mickey Mantle

In our last installment of Schrödinger’s Bat we began an investigation of hit batsmen by looking at the big-picture trends in the rate of hit batsmen since 1901. That exploration led to summarizing various theories that have been proposed over the years to explain the fluctuation of rates, including the physical hazard theory, the offensive context theory, the intimidation theory, the expansion theory, the new strike zone theory, and finally the aluminum theory. From among that group, we can say that the last one seemed to make sense for the recent upward trend that began circa 1985.

Although I promised that this week we’d scrutinize the differences in hit batsmen rates since the introduction of the designated hitter in 1973, and discuss the theories proposed to explain it, last week’s column generated such a large volume of email that I thought it would be worth spending one more column on the big picture before moving on to the DH era.

Big Picture Trends Redux
Let's start off by addressing a few of the more prevalent reader questions regarding the bevy of big-picture trends discussed last week. Indicative of the questions received was this one from reader Marc Stone, which touches on two aspects of HBP trends that the article overlooked.

Nice job, Dan, but you left out one very useful comparison: how do changes in HBP compare to changes in BB rates and, to a lesser extent, K rates and pitches per PA.

Reader Ryan Tippetts echoed the second part of that question by noting:

My immediate thought, specifically regarding recent upward trends, was the modern trend of increased pitches per AB. Might it be as simple as because a batter sees more pitches he has more opportunities to be hit by a pitch?

Thanks to Ryan and Marc, and to all the other readers who had similar comments. I have to admit that neither looking at walk and strikeout rates nor at pitches per plate appearance in comparison with the rate of hit batsmen had occurred to me. But of course all three suggestions make a lot of sense:


  • If pitchers are walking more batters at the same time they’re hitting more of them, that may be indicative of worse control (the “wildness theory”).

  • If strikeouts are strongly correlated with hit batsmen, then perhaps a more aggressive hitting style (the “free swinger theory”), or the intimidation of the HBP, or even changes in the strike zone are playing a role.

  • If pitchers are throwing more pitches overall, it does indeed provide more opportunity for hitters to get plunked (the “opportunity theory”) which in the end may be all that is required.


To see whether the wildness or free swinger theories shed any light on the question of changes in HBP rates over time, we can add unintentional walks and strikeouts per 1,000 plate appearances for each league to the graph we showed last week:



What you’ll notice is that up until around 1970, there appears to be some correlation between walk rate and HBP rate. Unfortunately, the correlation is the inverse of that which the wildness theory would predict. As walk rates increased from around 1920 through the late 1940s the rate of hit batsmen fell. As walk rates declined, the frequency with which batters were hit increased.

In other words, one might be inclined to conclude that there is a more or less constant rate at which pitchers put batters on for free via the HBP or unintentional walk, at least based on the graph from 1901 through 1970. While that’s an attractive idea, and akin to the offensive context theory discussed last week, you can’t simply add the two rates, since hit batsmen are so much less frequent than walks--as evidenced by the fact that in order to get both on the graph, the scale of HBP is per 1,000 PA while that for walks is per 100 PA. As a result, the number of runners that pitchers put on for free is driven almost entirely by the number of walks.

In any case, there appears to be no correlation over the past 35 years, as walk rates have been fairly steady, while the number of hit batsmen has increased dramatically.

On the other hand, the free-swinger theory appears more promising. Strikeout rate does correlate pretty strongly with the HBP rate since around 1950, and in the 1910-1925 period as well. In fact, from 1950 through 2005 the correlation coefficients are a very healthy .72 and .69 for the American and National Leagues respectively, which can be interpreted to mean that strikeout rates explain around 50% (.70² ≈ .49) of the variation in HBP rates (or vice versa).
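For readers who want to reproduce this kind of check, the calculation is straightforward; the series below are placeholder numbers, not the actual league rates:

# Correlating strikeout rate with HBP rate; placeholder series only.
from statistics import correlation  # available in Python 3.10+
k_rate = [128.0, 131.5, 135.2, 140.8, 150.3]  # K per 1,000 PA (hypothetical)
hbp_rate = [4.1, 4.3, 4.2, 4.8, 5.2]          # HBP per 1,000 PA (hypothetical)
r = correlation(k_rate, hbp_rate)
print(round(r, 2), round(r * r, 2))  # r, and r^2 as the variance explained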

But as every statistics professor drums into the heads of his students, correlation is not necessarily causation, and before 1950 the correlation is much weaker--in fact, for the preceding 25 years the two rates were moving in opposite directions. As a result, one might argue that the free-swinger theory holds since 1950 because the normative hitting style became more aggressive, resulting in hitters diving over the plate more frequently, which in turn results in more hit batsmen. Under this interpretation, during the 1970-1984 period free swinging was less in vogue, and pitchers reacted with fewer brushback pitches, resulting in fewer HBP.

An alternative theory, noted by reader JMHawkins, that would fit the same set of facts holds that an expanding strike zone, especially on the outside corner, forces hitters to stand closer to the plate and dive over it more frequently, resulting in more batters being hit. The expanded zone also happens to induce more strikeouts, so strikeout rate and HBP rate aren't causally related; both are related to this third factor. There is undisputed evidence that the strike zone expanded in 1963, and anecdotal evidence that the low outside corner became an increasingly rewarding target for pitchers in the last 20 years or so. As umpires reined in the zone after the redefinition in 1969 and the increased scrutiny around 2001, both strikeouts and hit batsmen fell. This "fluctuating strike zone theory" then explains why strikeout and HBP rates seem to mirror each other.

In either case, we’d still need a theory to account for the preceding 25 years, when strikeouts rose and hit batsmen fell, although under the above theory it appears that those 25 years from 1925 to around 1950 are the exception and not the rule.

To be honest, I was initially most hopeful about the opportunity theory. It's pretty well known that the number of pitches per plate appearance has been on the rise, so it makes intuitive sense, but when we try to look at this theory, we run into the problem that we don’t have complete play-by-play data--and hence pitch counts--for most of baseball's history. Despite the recent and very welcome additions to the work being done at Retrosheet we are still missing the vast majority of the data required to complete the picture from 1901 through 2005; the 49 seasons that Retrosheet provides are often missing pitch sequence data.

Some alert readers (aka, the real stat geeks) may also be thinking that perhaps we could use pitch count estimators in order to estimate the number of pitches, and hence the rate at which batters are hit per pitch. Unfortunately, the basic estimators that are in use rely on constant multipliers for strikeouts and walks to estimate the number of pitches, and we’ve already taken those into account in the graph above. More complex estimators rely on estimates of balls-in-play rate (the percentage of pitches on which balls are put into play, which varies by league and year), which we don’t have historically. There are other factors that could also influence the result which models have difficulty capturing.
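To make the idea concrete, here is a sketch of a basic pitch count estimator of the sort described: a fixed cost per plate appearance plus premiums for strikeouts and walks. The coefficients are one commonly cited set; treat them as illustrative assumptions rather than the exact values used in any particular published model.

# Basic pitch count estimator; coefficients are illustrative assumptions.
def estimate_pitches(pa: int, so: int, bb: int) -> float:
    return 3.3 * pa + 1.5 * so + 2.2 * bb

# A hypothetical team-season line
print(estimate_pitches(pa=6200, so=1050, bb=540))  # 23223.0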

However, we can look at data we do have, and that's as far back as 1988. You’ll recall that during the 1988-2005 period HBP rates have more than doubled. What we find, however, is that during that time the number of pitches per plate appearance has risen only around 5%. So it doesn’t look like the opportunity theory explains at least the most recent upward trend.


Year P/PA
1988 3.60
1989 3.63
1990 3.64
1991 3.71
1992 3.68
1993 3.68
1994 3.75
1995 3.75
1996 3.75
1997 3.76
1998 3.70
2000 3.75
2001 3.72
2002 3.73
2003 3.74
2004 3.76
2005 3.73


What do Trilobites and Jason Kendall Have in Common?
Although the free-swinger and fluctuating strike zone theories (or some combination thereof) provide some insight, and the opportunity and wildness theories perhaps less so, the theory most often cited by readers that was not discussed in last week's column is the "body armor theory." A succinct explanation was provided by reader Jeff Bullington:

This would only affect the recent rise, but what about the increased use of body armor? Would this be the 'contra-intimidation theory'?

As Jeff noted, this is the polar opposite of the intimidation theory and holds that as hitters began to wear more and more protective gear, they’ve been less afraid of getting hit, allowing them to stand closer to the plate and be more aggressive about hanging in. It follows logically that pitchers would respond by upping the ante in an effort to move batters off the plate, and reclaim their rightful territory.

This idea is akin to the evolutionary arms race between predator and prey, whereby one species evolves stronger protection in response to selection pressure from predators as has been speculated for trilobites, which in turn leads to selection pressure on predators to evolve accordingly.

As arguments go, this is a particularly difficult one to measure quantitatively. What we can certainly see is that the use of protective gear--such as hard elbow and shin pads--has increased in the past 20 years. One only has to look at the protection worn by Craig Biggio or Jason Kendall, and consider Kendall's recent run-in with John Lackey, to understand how that protection might affect the game. It's probably not a coincidence that coming into 2006, Biggio's 273 HBPs rank second all-time, and Kendall ranks 8th with 197.

That said, in 2002 Major League Baseball began enforcing rules that limited the use of protective gear to players with medical exemptions, such as the one employed by Barry Bonds, which allows him to wear his elbow armor. The rules also limited the size of the various pads and devices worn.

Whether coincidentally or not, the recent Kendall incident notwithstanding, the rate of hit batsmen has stabilized since that time. This was also immediately after the rate had reached its apogee in 2001, when the AL set its all-time record in hit batsmen per 1,000 plate appearances and the NL its highest total since 1901.


AL NL
2001 10.67 9.92
2002 9.90 9.17
2003 10.21 9.86
2004 10.40 9.60
2005 9.52 10.05


We can also note that although helmets have been mandatory for MLB players since 1956, ear flaps have only been enforced for players who reached the majors after 1983. Ear flaps do coincide with the recent upward trend, and although one can imagine there would be an attendant psychological boost for the hitter, it’s more difficult to believe that this relatively minor change would have had that large of an immediate impact. After all, players already in the league were allowed to use the old-style helmets, so the change was gradually phased in, and the head is the part of the body hit with the least frequency.

But this does provide the opportunity to sneak in a quick trivia question: Who was the last player to wear a helmet without an earflap in a game and in what year? (Wait for it, we'll get to the answer at the bottom of the column.)

So, whether or not body armor and the introduction of the ear flap are responsible for the twenty-year upward trend in HBP rates, an argument can be made that the crackdown on body armor has played a role in retarding the arms race.

A Vicious Circle?
Finally, reader Jake Slemp wrote to say that whatever the cause of an increasing or decreasing trend in hit batsmen, it would likely be self-sustaining and reinforcing. His reasoning:

After all, hit batsmen beget more hit batsmen within the same game, which often beget still more in subsequent games between the two teams…which beget more in those games, etc.


In other words, even a small increase in hit batsmen might form a feedback loop based on retaliation. This situation is often described in economic terms as a virtuous (if the results are favorable) or a vicious (if they are negative) circle, where each cycle continues the trend in the current direction until stopped by some outside force.

To look at this “vicious circle theory,” we can use play-by-play data for 2001 through 2005 to examine the distribution of games by the number of hit batsmen. We can then compare the actual distribution with what would be expected if the hit batsmen were distributed randomly (in a binomial distribution) given the overall rate of HBP and the average number of plate appearances per game. What we find when we do so is as follows:


HBP    Games   Expected
  7        1          0
  6        1          1
  5       10         10
  4      118         71
  3      455        394
  2     1626       1610
  1     3980       4325
  0     5953       5732
 1+     6191       6412


As you can see, the number of games where zero, one, or two batters are hit is pretty much in line with what would be expected. However, the frequency of three and especially four batters hit in a game surpasses the numbers you'd expect, and there are fewer games with a single batter hit than expected. And of course this list provides the opportunity for a second trivia question: what teams were involved in the lone seven-hit-batsmen game of the past five years? (Again, the answer appears at the bottom.)

What this confirms is that retaliation is a likely factor in hit batsmen. Games where we would otherwise expect two batters to be hit can quickly turn into games where three or four are hit. We already knew that intuitively, but what we need to know is whether or not increased retaliation is responsible for the increasing number of hit batsmen.
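For the curious, here is a minimal sketch of how the "Expected" column above can be generated. The plate appearances per game and the HBP probability are round-number assumptions for 2001-2005, so the counts will differ slightly from the table:

# Expected games with exactly k hit batsmen under a binomial model.
from math import comb
games = 12144  # total games, 2001-2005 (the sum of the table's rows)
n = 76         # assumed combined plate appearances per game
p = 0.0101     # assumed probability that a PA ends in an HBP
def expected_games(k: int) -> float:
    return games * comb(n, k) * p**k * (1 - p) ** (n - k)
for k in range(8):
    print(k, round(expected_games(k)))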

To look at this, we can calculate the expected number of games with various numbers of hit batsmen over four successive periods, starting in 1985.


Actual vs Expected   1985-1989   1990-1994   1995-2000*   2001-2005
5+                        850%        246%        322%         104%
3-4                       162%        125%        119%         123%
0-2                       100%        100%         99%          99%

* Does not include 1997-1999.

As we saw with the 2001-2005 period, in all periods there are just about the expected number of games with zero, one, or two HBP. However, there are always more games than expected with three or four batters hit, and lots more with five or more hit.

While this confirms that retaliation within games is probably a persistent feature of hit batsmen, it doesn’t appear as if blatant retaliation has increased over the past twenty years. Keep in mind, the HBP rate has doubled during that time frame. If anything, it would appear there are slightly fewer beanball wars now than in the past, perhaps as a result of the double-warning rule put into effect in 1994. Note that this conclusion holds even if you assume that the increase in games with three or more hit batsmen is completely due to wildness (after all, it’s certainly true that when a pitcher hits one batter he’s more likely to hit another simply due to control problems).

What this doesn’t rule out is the idea that teams now employ a more subtle form of retaliation, whereby they will wait to take revenge in a subsequent series, and where the retaliation doesn’t escalate out of control. As a result, it would be possible that retaliation and escalation are to blame for the recent increase in hit batsmen, but it seems unlikely.

However, even if retaliation is not the cause of the increasing rate of hit batsmen, the body armor theory may provide the starting point for the vicious circle that was interrupted by the new rules, starting in 2002.

Error on the Side of Caution
If nothing else, I hope that we’ve highlighted that in an activity as complex as baseball, there are usually many factors that contribute to the big-picture trends that we see. That’s true for hit batsmen as well as the more visible trends, like the offensive upsurge of the last dozen years or so. If there is a lesson to be learned here, it’s probably that we should all be more cautious of simple explanations and easy answers.

Let’s wrap up with a couple of corrections from last week.

First, when discussing the expansion theory I noted that expansion would have a tendency to dilute talent in both leagues. While that’s true to some extent, I was reminded by our own Christina Kahrl that actually the 1992 expansion draft was the first time players from both leagues were available in an expansion draft. Prior to that, for example in 1977, the expansion teams could only choose unprotected players from their own league. And in that 1992 draft, AL teams were able to protect more players than NL teams; it was not until the 1997 draft that all teams were able to protect the same number of players.

Second, I noted last week that Ray Chapman was the only professional player ever fatally injured in a game. Reader Bill Johnson pointed out that Chapman was the only major-leaguer to be fatally injured by a beanball. Several minor leaguers were killed in the 1950s and 1960s including Otis Johnson in 1951.
---

Okay, so you waited; here are a couple of answers. For the first trivia question, Tim Raines never wore an earflap in a 23-year career that spanned from 1979 through 2002. As quoted in an MLB article documenting it, he did not wear one because, being a switch-hitter, he didn't want to carry two helmets.

The answer to the second question: on June 7, 2001, the A's visited Anaheim to take on the Angels. In that game, Jason Giambi was hit by Scott Schoeneweis following a first-inning home run by Frank Menechino. In the third inning, Schoeneweis then hit Menechino (one wonders if accidentally) and later in the inning also hit Olmedo Saenz. Barry Zito subsequently hit Tim Salmon in the 6th. Almost certainly not coincidentally, Schoeneweis again hit Menechino leading off the 8th. Later in that same inning, Mike Holtz entered the game and promptly plunked Eric Chavez for good measure. And just to round things out, Scott Spiezio was hit by Mark Guthrie in the bottom of the 8th. Ouch.

Wednesday, January 09, 2008

Beautiful Theories and Ugly Facts

Another golden oldie from the Baseball Prospectus archives originally published on May 4, 2006



Schrodinger's Bat: Beautiful Theories and Ugly Facts
by Dan Fox
May 4, 2006

“The great tragedy of Science--the slaying of a beautiful hypothesis by an ugly fact.”

--British biologist Thomas H. Huxley (1825-1895)

On April 22nd, Rockies setup man Jose Mesa drilled Giants shortstop Omar Vizquel in the back with his first pitch. The next day, Giants starter Matt Morris hit both Matt Holliday and Eli Marrero within his first eight pitches and was tossed from the game, along with manager Felipe Alou and pitching coach Dave Righetti. That was followed by the customary warnings to both teams, in observance of the practice that Major League Baseball adopted in 1994.

Later in the game, Jeff Francis hit Steve Finley and was not ejected, much to the consternation of what was left of the Giants coaching staff. Of course, under the double-warning rule the umpires still have discretion over whether to eject a pitcher after the warnings have been issued--a discretion that yours truly thinks is not exercised nearly as often as it should be. Finally, Ray King plunked Vizquel again in the 8th, and was ejected along with Rockies skipper Clint Hurdle.

The Mesa/Vizquel feud dates back to 1998, when the two were still teammates with the Indians and Vizquel celebrated a spring training home run off of Mesa by doing a cartwheel afterwards. Things went downhill after the 2002 publication of Vizquel’s book Omar! My Life On and Off the Field, wherein Vizquel said of Mesa’s performance in Game Seven of the 1997 World Series:

"The eyes of the world were focused on every move we made. Unfortunately, Jose's own eyes were vacant. Completely empty. Nobody home. You could almost see right through him. Not long after I looked into his vacant eyes, he blew the save and the Marlins tied the game.”

Well, at least no one can accuse Vizquel of being the model teammate.

Mesa then vowed to hit Vizquel every time he faced him, and he did exactly that on June 12, 2002, in the 9th inning of a 7-3 game when Mesa was pitching for the Phillies. And he hit him the next time the two faced each other, which was two Saturdays ago in Denver.

Mesa is now appealing a four-game suspension handed down by Bob Watson. I kid you not, Rockies GM Dan O’Dowd said on the Rockies radio pre-game show on April 29th that he was surprised Mesa was suspended, and that he didn’t think Mesa was throwing at Vizquel. I know GMs like to stand by their players, but really…

Putting the emotions and politics aside, of the more than 14,600 games that have been played since the beginning of the 2000 season, the April 23rd game marks the 138th time that four or more batters have been hit in the same game. Pondering that fact led me to take up the topic of hit batsmen in this week’s column.

A Pair of Trends
To lead off, it’s always good to have a historical perspective. In that vein, I offer the following graph that shows the number of hit batsmen per 1,000 plate appearances in both the American and National Leagues since 1901.



There are several interesting aspects to this graph that lead us to ask two primary questions.

First, you'll notice that the number of hit batsmen has fluctuated fairly widely over time, ranging from a high of 10.67 per 1,000 plate appearances in the American League in 2001 down to a low of 2.82, also in the American League, in 1947. The rate at which batters were hit decreased steadily from the turn of the last century through the late 1940s, and then increased for the next twenty years to a peak in 1968. It then decreased again until the early 1980s, but from 1985 it rose quickly through 2001, after which it leveled off.

We humans love causal explanations for apparent trends like this, so the first question that comes to mind is: just what is it that can explain these changes over time?

Secondly, as you can see, batters have historically been hit at slightly different rates in the two leagues, with the American League seeing more hit batsmen from 1909 through 1928, and the National League then doing so until 1950. The leagues then traded the title back and forth until 1970 when the AL would lead for more than 20 years until the strike-shortened 1994 season. Since that time the back and forth has returned, with the AL leading seven times and the NL five. The second question then is: what are we to make of these differences between the leagues?

In the remainder of this week’s column we’ll tackle the first question related to the overall historical trends, and leave the second--which deals with league differences--for next week.

The Big Picture Trend
There have been a number of theories proposed attempting to explain the historical trends we see in the rate of hit batsmen. Let’s look at them.

On August 16, 1920 Carl Mays of the Yankees hit Ray Chapman of the Indians in the head with a pitch. The next day, Chapman died and became the only professional player ever fatally injured in a game. Although Mays was vilified in some quarters, dirty balls were also held responsible; as a result, umpires began to replace balls that had been dirtied much more often in-game.

At first reflection, any baseball fan might assume that this tragic event would have had an immediate impact on the way the game was played, with the result being that more pitchers were afraid to throw inside, which would reduce the number of hit batsmen. Additionally, fewer soiled balls in play would theoretically allow for their being spotted more easily by hitters, which might allow them to duck, dive, or dodge the inside pitch. In either case, we’ll call this the “physical hazard” theory to explain the reduction in hit batsmen.

While it's a nice theory, you can see from the graph that the longer trend in the reduction of batters hit had been operative in the American League since 1911, and in the National League stretching all the way back to 1901. In fact, contrary to the theory that the Chapman beaning may have had a dampening effect, a closer examination of the period between 1919 and 1925 reveals that hit batsmen per 1,000 plate appearances actually went up from the year following the beaning (1921) through 1923, before the downward trend resumed.


AL NL
1919 6.80 6.28
1920 6.49 5.76
1921 6.76 5.12
1922 7.22 5.62
1923 7.35 5.62
1924 6.94 4.99
1925 5.67 4.90


So the physical hazard theory seems to have little validity. From this, one might then reason that if that monumental event didn’t signal a change then it’s unlikely that any other isolated incident or play would have, either.

So what about a broader theory that takes into account a cost/benefit valuation of hitting batters? For example, it could be the case that pitchers adjusted their frequency of hitting opposing batters based on their recognizing the costs of doing so. In times where runs are scarce, hitting a batter would cost relatively more than when runs are plentiful, since there is a greater probability that the batter would have been put out had they not been hit. The result is that there would be fewer hit batsmen in depressed offensive environments, and more in inflated environments. Sounds like a reasonable idea and we’ll dub it the “offensive context theory.”

We can test this theory by taking a look at the cost of hitting a batter in terms of the Win Expectancy Framework (WX) for both the American and National Leagues since 1901. The framework allows us to estimate how much a hit by pitch is worth in terms of wins and we can then graph the results for both leagues.
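As a sketch of the mechanics, the value of a hit by pitch in this framework is just the change in the batting team's win probability between the states before and after the plunking. The lookup table below is a stub with made-up numbers for a single situation; the real framework uses a full table of win probabilities by inning, score, base, and out state:

# Win expectancy value of an HBP: WE(after) minus WE(before).
# The table is a stub with invented values, for illustration only.
WE = {
    ("bot 7", 0, "empty", 1): 0.50,  # tie game, bases empty, one out
    ("bot 7", 0, "1B", 1):    0.55,  # same spot after the batter reaches first
}
def hbp_win_value(before, after):
    return WE[after] - WE[before]
print(hbp_win_value(("bot 7", 0, "empty", 1), ("bot 7", 0, "1B", 1)))  # 0.05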



As you might have guessed, the increase in Win Expectancy for each hit batsman was high in the Deadball Era at over 3%, and then decreased from the early 1920s until the late 1930s as offensive levels rose, reaching a low point just over 2.6%. The values then began to climb again, reaching over 3% in the 1960s, and after a brief spike in 1989 fell as offensive levels rose again.

So, does the offensive context theory hold water? If you were to overlay these two graphs you would find little in common. For example, the rate of hit batsmen in the Deadball Era declined steadily, even though the cost remained fairly constant until the offensive explosion of 1920. Offensive levels then began to decline in the late 1930s, making the cost of hitting a batter rise, although we find that hit batsmen rates continued to decline into the late 1940s. And again, as the cost of hitting batters rose in the 1950s and from 1993 on, more batters were being hit. In fact, the WX value of a hit by pitch turns out to have almost zero correlation with the rate at which batters are hit. Another beautiful theory spoiled by some ugly facts.

Okay, offensive levels don’t seem to drive HBP rates, but what if an increased rate of hitting batters has the effect of depressing offense, and vice versa? We’ll label this the “intimidation theory.” After all, offensive levels rose as batters were being hit less often throughout the 1920s, and run-scoring dropped as batters were being hit more often in the 1960s. Many former players, especially those who had the “pleasure” of facing Don Drysdale and Bob Gibson, tend to favor this theory.

Unfortunately, the intimidation theory has the same underlying problem as the one that preceded it. While the examples cited in the previous paragraph seem to make sense, the theory fails to explain why hit batsmen declined throughout the Deadball Era, and why in the offensive eras of the 1950s and post-1993 the rate of hitting batters has actually increased.

Another theory that is popular, and one that we'll tackle in next week's column, is that hit batsmen have been on the rise since the introduction of the designated hitter in 1973, because the pitcher no longer faces the consequences of hitting opposing batters himself. This is the so-called "moral hazard theory." A quick glance at the first graph militates against this idea, however, since the HBP rate actually began to decline in 1969, and continued to do so through the first eleven years of the DH. In addition, the rate rose and fell in both leagues, rather than affecting only the AL as you would expect.

A couple of years ago, J.C. Bradbury of the excellent blog Sabernomics, along with Doug Drinen, studied the issue of HBP differences using play-by-play data. One of the conclusions they came to was that talent dilution as a result of the 1993 expansion draft contributed to the rise in hit batsmen after 1993. The theory is that a greater percentage of pitchers with less experience produces more accidental hit batsmen. At first glance this "expansion theory" makes a lot of sense. Take a look at the following table, which lists each expansion event along with the rates in the year prior to, as well as the first year of, the expansion.


Pre Post Diff
AL 1960 5.76 AL 1961 5.22 -0.54
NL 1961 5.48 NL 1962 6.11 +0.63
AL 1976 5.18 AL 1977 5.42 +0.24
NL 1992 5.48 NL 1993 6.66 +1.18
NL 1997 9.02 NL 1998 8.38 -0.64
AL 1997 7.78 AL 1998 8.77 +0.99


In all but two instances, the rate of hitting batters went up in the league to which baseball added teams. It should be noted that in the first four expansions the league that did not expand also saw their rate increase, which you might expect since expansion in one league also dilutes talent in the other.

What this table doesn’t show--though it's captured in the graph--is that the overall trends in each case were not really affected. When expansion came to the AL in 1961 and the NL in 1962 hit batsmen were already on the rise. When the AL expanded in 1977 the rates were declining and continued to do so after 1977. In both 1993 and 1998 the rates had already been increasing since 1985, and so while expansion may have egged on the increase, it clearly wasn’t the only factor. In other words, expansion did not signal a change in direction of trends that were already underway. As a result, it doesn’t appear that the expansion theory can be invoked as a general explanation and in any case can’t be invoked to shed any light on the trends prior to 1961 when both leagues had eight teams.

Finally, there have been articles in the popular press over the past few years that argue that a confluence of factors is responsible for the increasing rate at which batters are being brushed back. For example, a 2003 article from USA Today argued that a 2000 directive from Major League Baseball to change how umpires called strikes (in order to conform more closely to the rule-book definition) was the primary culprit. The “new strike zone theory” contends that adhering to the traditional definition has resulted in calling more strikes on the inside corner, and that pitchers are taking advantage of the fact, with hitters being plunked more often as they dive out over the plate in an attempt to hit what used to be strikes off the outside corner. Unfortunately for the new strike zone theory (at least as a single explanation), the increase in batters being plunked can be traced to almost 15 years before the “new” strike zone was implemented.

In addition, if you're looking for single causes, one might imagine that the double-warning rule instituted in 1994 would have had a dampening effect on hit batsmen. After a warning, pitchers might be wary of throwing at or near hitters when they would almost certainly be ejected. However, although the rate went down slightly in 1994 in the AL, it did not in the NL, and after that it continued its upward trend.

Another factor mentioned in the article, however, appears to be more promising. First, the article speculates that a generation of pitchers accustomed to pitching to hitters with aluminum bats don’t go inside as often, since doing so is less effective when hitters can still fist a ball on their hands for a hit using a bat that doesn’t shatter. As a result of this “aluminum theory,” hitters have adjusted to looking for pitches over the outside corner, and therefore dive at the ball and stand closer to the plate. When this style of hitting is coupled with pitchers who, at the professional level, finally do try and pitch inside but do poorly at it, you end up with lots more batters being hit.

What is satisfying about this theory is that it accounts for the recent rise in HBP rates in both leagues and seems to have timing on its side. Although the first patent for a metal bat was granted in 1924, Worth didn’t introduce the first aluminum bat until 1970, and it wasn’t until the late 1970s that bats by Worth (and, especially, Easton) significantly increased the popularity of aluminum bats. Seeing the rates begin to climb five to ten years later would seem to therefore be in line.

Systemic Theories
In the end, theories like the aluminum bat theory are the kinds of systemic explanations that seem to be needed to explain shifts in the game such as those related to hit batsmen. Instead of looking for single incidents such as the physical hazard or strike zone theories, or very subtle causes like the offensive context or intimidation theories, what we should probably be looking for are systematic changes in how the game is played, changes that may even originate well before players reach the professional level. While I don’t have any immediate answers for the forty-year decline in the first part of last century, or the increase during the following twenty years, I think those lines of inquiry will prove to be more promising, and the theories they produce less likely to be the victim of a few inconvenient facts.

Tuesday, January 30, 2007

Plunking Explosion

Steve Treder over at The Hardball Times has a good article today on hit batsmen and how their frequency has changed over time. He reviews many of the arguments that I discussed in my series last summer...

  • Schrodinger's Bat: Beautiful Theories and Ugly Facts

  • Schrodinger's Bat: Strike Zones, Trilobites, and a Vicious Cycle

  • Schrodinger's Bat: The Moral Hazards of the Hit Batsmen


As a conservative, I appreciate Steve's use of the Law of Unintended Consequences in positing that the introduction of batting helmets in the early 1960s and the adoption of the "zero-brushback-tolerance protocol" of the 1990s may, ironically, have contributed to the increasing rate at which hitters are plunked. This argument would also hold for the increased ability of players to wear body armor, thereby leading to a kind of arms race in which hitters stand closer and pitchers try to back them off.

It struck me, however, in light of my columns of the past two weeks, that the increasing size of major league hitters may also play a role all on its own, especially in the past 30 years. Larger hitters would likely be less afraid of being hit, and observational evidence tells me that hitters today do less to avoid being hit than hitters in the past did. This struck home to me as I watched footage from the 1954 World Series the other night.

I also noted that J.C. Bradbury mentions that he discusses how the distribution of talent in baseball has affected the HBP rate in his new book, which should be available in mid-March. I'm looking forward to giving it a read.

Update: Baseball Musings has a little post on this subject with two interesting graphs. To me, the first illustrates that the rate of HBP has affected both low- and high-ERA pitchers roughly equally over the course of baseball history, although in the last few years the two seem to have diverged. The second graph illustrates how inferior pitchers now pick up more innings than they did in the past. It should be cautioned, however, that the increasing ERA of the leagues as a whole will cause some of this, as the sub-5.00 ERA group shrinks and the 5.00+ ERA group grows. Historically, the 5.00+ group would be very small simply because pitchers with ERAs that high would be so far from the mean.