Showing posts with label Baseball History. Show all posts

Sunday, November 15, 2009

One Worth Remembering

Still, for all the tumult, it was a hell of a life, being a baseball player and a hell of a time to be a baseball player, and for the rest of their lives, they understood what a privilege it had been.
- Mike Vaccaro from The First Fall Classic


Baseball’s 2009 postseason was certainly one to remember. Over the course of 28 days and through 30 games fans were treated to some excellent baseball (the Angels dominating the Red Sox), thrilling finishes (the two extra inning games of the Yankees/Angels series), heroic efforts (Alex Rodriguez, Chase Utley), and more than a little controversy (a handful of miscues by the men in blue). And in the increasingly crowded and competitive entertainment market, baseball did as well as it has in any recent season with television ratings higher than any since 2004, in no small part due to the presence of two teams from Los Angeles and one from New York among the final four left standing.

But as good as this most recent version was, Mike Vaccaro in his new book, The First Fall Classic: The Red Sox, The Giants, and the Cast of Players, Pugs, and Politicos Who Reinvented the World Series in 1912, reminds us of two central facts of baseball history while providing an antidote to a still too common, but preciously incorrect, sentiment regarding baseball’s past.

Regarding the former, Vaccaro’s tome aptly reminds us that thrilling finishes and controversy in October (and now November) are more than a century old, and through much of the 20th century baseball held the preeminent place in the nation’s professional sporting life.

As for the remedy, for those who still hang on to the picture of baseball’s early history as simple and even innocent as compared to the dim view of modern “baseball as a business” in a world of free agency, arbitration, PEDs, and the intense media spotlight, Vaccaro provides an education in how thoroughly professionalized the game was in every sense of the word. From the intense controversy over the size and source of the players’ share of the gate, to the newspaper columns that players (some of whom were involved in the series, such as Christy Mathewson) penned to monetize their position, to the large sums of money that changed hands in bets both outside and inside the ballpark by fans, managers (including John McGraw, who wagered $500 on his team, to win of course), and players alike, and even extending to the interference of an owner in an attempt to extend the series and thereby increase his profits, there is very little innocence and simplicity to be found. As Fred Snodgrass, Giants center fielder in the series and the man who would be bludgeoned with the business end of the sport for the rest of his life, would say fifty years later, “We were professionals. And professionals get paid.”

For those who aren’t familiar with the details of the series I’ll leave you to discover just how the elements of civic pride and rivalry, religion and the rift it caused in one of the clubhouses, and yes, the monetary interests of everyone involved all coalesce to make those nine days (that’s right, nine days in which eight games were played with a mostly alternating schedule – and you think modern travel is tough?) in October 1912 so interesting in Vaccaro’s retelling. One mark of the greatness of the series, as Vaccaro points out in the introduction is simply this: prior to 1912 baseball's postseason was known as the "world series" while after it was the "World Series".

While the book focuses on the series itself, similar to Cait Murphy’s excellent Crazy ’08: How a Cast of Cranks, Rogues, Boneheads, and Magnets Created the Greatest Year in Baseball History, the author does a good job of setting the scene of the times by primarily following two concurrent events: the attempted assassination of “Bull Moose” presidential candidate Theodore Roosevelt in Milwaukee on the day of game six, and the murder trial of New York City police officer Charles Becker, in which Becker stood accused of ordering the murder of the gambler Herman Rosenthal. Suffice it to say the story ends well for one and not the other. Although Murphy’s book is a little broader in its social and historical context (not to mention its main subject matter, as it deals with the entire season and not just the World Series), Vaccaro adds enough of the back stories of the players and coaches (particularly Mathewson, McGraw, Tris Speaker, Snodgrass, and “Smokey” Joe Wood), as well as the owners and “politicos” (Boston Mayor John “Honey Fitz” Fitzgerald, grandfather of JFK), to paint a picture of what it was like to live in that time. In addition, he focuses on the crowds gathered at the many sites, particularly throughout New York, where the rabid throngs could assemble and “watch” the progress of the game with as little as a six second delay through a mechanical version of MLBAM’s GameDay, a recreation sometimes complete with high tech lights and buzzers and moving figures. During the series baseball crazy fans in the tens of thousands without tickets to the Polo Grounds or the gleaming new Fenway Park would make their way to sites hosted by ten daily newspapers in New York and four in Boston to take in the action. Short vignettes of a few of those regular folks serve to complete the picture.

The book is structured chronologically, leading off with a recap of the 1912 regular season in which the Red Sox, led by Wood’s 34-5 record and 344 IP, dominated the American League with a record of 105-47, outpacing the Senators by 14 games. The “Speed Boys,” as they were known despite stealing “only” 185 bases (the Senators led with 274) and leading the league in home runs with 29, led the league in runs scored and fewest runs allowed. In the National League the Giants, while not as dominant, won 103 games and finally shook off both the Pirates and Cubs in the final months of the season. For the Giants it was their second consecutive appearance in the series, having lost to Connie Mack’s Athletics in six games in 1911, while the Red Sox were returning for the first time since the inaugural series played in 1903. From there the book offers a chapter for each of the eight games and an extra for the final day of the series, along with a well done epilogue that wraps up the few loose ends that remain and provides a platform for a poignant commentary on how we remember our heroes. Perhaps the only nitpick with the presentation is the lack of an index, a feature most readers with baseball libraries would, I’m sure, find useful.

If The First Fall Classic isn’t quite as comprehensive in its historical context as Murphy’s or certainly Josh Prager’s The Echoing Green, in terms of style Vaccaro hits a home run in his descriptions of the action on the field. He notes in the introduction that the bulk of his research was in reading contemporary news accounts, and given the colorful language and detailed play by play that newspapers of that time produced, perhaps some of that rubbed off, as I found the sense of drama and reality he portrayed to be simply first rate. While prior to reading I had a summary knowledge of the series, although not all the particulars to be sure, several chapters had that “page-turner” quality that is inherent in well written books. If for no other reason than its readability and drama I'd recommend this book. Of course when coupled with the other reasons noted at the beginning of this review, picking up this book is simply a no-brainer for anyone with an interest in a truer picture of the game's past.


As a quick aside, books like this always interest me because of their ability to contrast, knowingly or unknowingly, how the game was played in the past with its modern version. And of course those that describe play in the dead ball era, when the art of playing for one run was king and defenses weren't particularly efficient in turning batted balls into outs, are even more appealing. To that end, a crucial baserunning gaffe, or should I say base coaching gaffe, by one of the teams in game three grabbed my attention since it came at an interesting time in the evolution of the practice of employing full-time coaches. While the first full-time coach, Arlie Latham, was hired by the Reds in 1900, it would be 20 years before the practice was widely accepted. Not surprisingly it was McGraw who hired both Latham and Duke Farrell as full-time coaches in 1909 while other teams used either their manager or a rotating set of players. Although the gaffe from game three might have occurred whether or not a full-time coach was there, it illustrates how far specialization in the game has come and with it a higher overall level of play.

Saturday, September 20, 2008

As Time Goes By

Today we'll run another tidbit from the errata of It Ain't Over 'Til It's Over: The Baseball Prospectus Pennant Race Book...




During which decade did baseball fans enjoy the best pennant races? Bill James, in The New Bill James Historical Abstract says unequivocally that the 1940s was "The Best Decade Ever for Pennant Races". Our compilation of Race Score by decade agrees.


Decade  Aggregate  Races  Avg
1900s   375.5      12     31.3
1910s   234.9       9     26.1
1920s   423.6      12     35.3
1930s   226.1      13     17.4
1940s   390.9      11     35.5
1950s   371.5      12     31.0
1960s   385.1      12     32.1
1970s   310.0      17     18.2
1980s   354.1      20     17.7
1990s   247.0      16     15.4
2000s   431.5      27     16.0


The 1940s pulled out the highest average Race Score although the 1920s came in a close second and actually included one more race. Interestingly, the 1930s included 13 races, the highest percentage at 65% of any decade, and nine of those were in the NL with only 1931 excluded. However, many of the races were only marginal with the 1934 NL race won by the Cardinals on the strength of a 33-12 record down the stretch taking the highest score at 29.9 and ranking 43rd.

James ranks the races of the 1940s and so here is his list alongside our ranking.


James  Year  Lg  Score  Rank  Teams  Winner
1      1940  AL  46.0    16   3      Detroit Tigers (90-64)
2      1944  AL  21.0    80   2      St. Louis Browns (89-65)
3      1948  AL  72.0     3   3      Cleveland Indians (97-58)
4      1946  NL  32.9    35   2      St. Louis Cardinals (98-58)
5      1949  NL  37.0    30   2      Brooklyn Dodgers (97-57)
6      1949  AL  37.0    29   2      New York Yankees (97-57)
7      1942  NL  50.9    15   2      St. Louis Cardinals (106-48)
8      1941  NL  36.5    31   2      Brooklyn Dodgers (100-54)
9      1945  AL  18.0    86   2      Detroit Tigers (88-65)
10     1945  NL  29.9    43   2      Chicago Cubs (98-56)
-      1947  NL   9.8   121   2      Brooklyn Dodgers (94-60)


The only race from the 1940s that James doesn't include is the 1947 NL race which ranks 121st on our list in which the Dodgers overcame the Braves at midseason and held off the Cardinals, winning by a margin of five games. As James notes, the NL races of the 1940s were dominated by the Dodgers and Cardinals while in the AL the races were more diverse.

This compilation by decade also reinforces the notion that modern races garner lower race scores overall as not only the average Race Score has declined but also the number of races that have positive scores has fallen from around 57% before divisional play to 44% after.

But since our Race Score gives extra weight to races with multiple teams with good records, this trend can also be attributed to an increasing competitive balance over time. As shown in the graph below for the AL from 1901-2005, the standard deviation in winning percentage has noticeably declined over time (albeit with a number of bumps along the way and a small upturn in the past five years) as the dotted linear trend line indicates. As more teams are bunched closer together, it is statistically less likely that two or more teams will break away from the pack and therefore score very highly in the Race Score metric.
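For readers who want to see the balance measure concretely, here is a minimal sketch of the standard deviation of winning percentage for a single league season. The two six-team leagues are made-up illustrations (not real standings), and `balance` is a hypothetical helper name rather than anything from the book.

```python
from statistics import pstdev


def balance(records):
    """Population standard deviation of winning percentage for one league season.

    `records` is a list of (wins, losses) pairs; a lower value means the
    teams are bunched closer together.
    """
    return pstdev([w / (w + l) for w, l in records])


# Hypothetical leagues for illustration only (wins and losses balance out).
spread_league = [(110, 44), (95, 59), (80, 74), (74, 80), (59, 95), (44, 110)]
bunched_league = [(85, 69), (81, 73), (78, 76), (76, 78), (73, 81), (69, 85)]

print(f"spread:  {balance(spread_league):.3f}")
print(f"bunched: {balance(bunched_league):.3f}")
```

The bunched league produces the smaller figure, which is the pattern the trend line describes: the tighter the pack, the harder it is for two or more teams to separate from it.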



Just why competitive balance has generally increased with time is another story. The most accepted notion, popularized by the late paleontologist and baseball fan Stephen Jay Gould in the context of the disappearance of the .400 hitter, rests upon two pillars. First, as knowledge about how to play the game has improved and become standardized, it has become more difficult for players and hence teams to take advantage of their less skilled competitors. Second, the general level of play has increased due to better athletes produced through a larger population from which the best players are chosen, better diet and training, and better technology, all of which moves the game closer to the limits of human ability, providing less space for variation. In the end that leaves great players and great teams, in Gould's words*, less "space for taking advantage of the suboptimality of others".


* The 1996 book Full House: The Spread of Excellence from Plato to Darwin by Stephen Jay Gould contains an extended discussion of Gould's argument. Also see my column "Schrodinger's Bat: The Myth of the Golden Age".

Friday, July 04, 2008

Like Peas in a Pod

More outtakes from The Great Pennant Race Abstract...



The 1950 NL race (ranked 54th) is certainly the more famous of the two races in 1950. That season the Phillies jumped out to a big lead and still held leads of 9 games over the Dodgers and 7.5 games over the Boston Braves as late as the morning of September 19th. The Dodgers roared back, winning 13 of 16 while the Phillies won just 3 of 12, to put the Dodgers one game back with one to play on October 1st. Tied at one into the tenth, Dick Sisler hit a three-run homer off of a tiring Don Newcombe (Sisler hit Newcombe's 127th pitch of the afternoon) to defeat Brooklyn 4-1 and finally secure the pennant for the Phillies.

As great as that race was, the AL race of 1950 takes the second spot in our rankings. This is the case since the Yankees, Tigers, Red Sox, and Indians were all very good teams and all in the race at the beginning of September. All four teams would win 92 or more games and finish within six games of each other. The Yankees were helped by bringing up rookie southpaw Whitey Ford in late June (9-1, 2.81 ERA in 112 IP) with Joe DiMaggio making a late season comeback. By contrast the Tigers were hurt by the injury to Virgil Trucks and the Red Sox by the fractured elbow of Ted Williams sustained in the All-Star game while the Indians were swept in a September series by the lowly St. Louis Browns to knock them out of the race.

As with 1950, the 1964 NL race (Race #5 ranked 7th) is the more famous of the two races for that season but the 1964 AL race ranks just above it at number six. That race featured three teams with 97 or more wins including the Yankees (their last pennant until 1976), White Sox, and Orioles, all of whom finished within two games of one another.

1964 was Yogi Berra's lone season as Yankee skipper in the 1960s (he would also manage the team in 1984 and the beginning of 1985) and the Bronx Bombers found themselves uncustomarily struggling, four and a half games out on August 29th and trailing both the Sox and Orioles. Then they caught fire. Most attribute the turnaround of the Yanks to the famous "harmonica incident" where utility infielder Phil Linz, "assisted" by Mickey Mantle, was reprimanded and fined by Berra for playing the harmonica on the team bus following a four game sweep at the hands of the White Sox on August 20th. While that makes for a good story, it should be noted that following the incident the Yanks immediately dropped two games to the Red Sox and won just 7 of their next 13 before reeling off 23 wins in their final 30 games (including an 11-game winning streak from September 16-26) to finish a game ahead of the White Sox and take the pennant*. No, the turnaround can more likely be attributed to the recall in August of Mel Stottlemyre, who would go on to win 9 games, and the purchase on September 5th of Pedro Ramos from the Indians to shore up the bullpen. Ramos would pitch 21 2/3 innings, giving up 13 hits while striking out 21 and walking not a batter down the stretch.

The natural corollary to the stories of 1950 and 1964 is to rank the years with the greatest total Race Scores and so here are the top 20 seasons where it could be argued that baseball fans enjoyed the best pennant races.


Rank Year Races Score
1 1908 2 142.7
2 1964 2 132.7
3 1950 2 104.8
4 1928 2 90.2
5 1915 2 84.0
6 1980 3 79.8
7 1916 2 78.8
8 1977 2 78.4
9 1924 2 76.7
10 2004 3 76.5
11 1962 2 76.3
12 1985 4 75.1
13 1949 2 74.0
14 1948 1 72.0
15 1920 1 71.2
16 1978 3 71.2
17 1982 4 70.8
18 2007 4 69.4
19 1993 2 62.9
20 1909 2 62.2
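The season totals in this table are just the sum of each year's individual Race Scores. As a rough sketch, the snippet below rebuilds three of them from scores listed in the "Ranking the Races" post; `season_totals` is a hypothetical helper, and only a handful of rows are included.

```python
from collections import defaultdict

# (year, league, Race Score) rows copied from the top-100 list
# in the "Ranking the Races" post.
races = [
    (1908, "NL", 90.2), (1908, "AL", 52.5),
    (1928, "NL", 51.7), (1928, "AL", 38.5),
    (1915, "FL", 42.5), (1915, "AL", 41.5),
]


def season_totals(rows):
    """Sum Race Scores by year and return (year, total) pairs, best first."""
    totals = defaultdict(float)
    for year, _league, score in rows:
        totals[year] += score
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


for year, total in season_totals(races):
    print(year, round(total, 1))
```

Because the published scores are rounded to one decimal, a total rebuilt this way can differ from the table by a tenth of a point for some seasons.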


Special mention should be made here of 1981, whose eight "races" totaled a score of 71.2, which would have tied for 15th place. The first half races scored a 41.5 while the second half was at 29.7. The best of those was the first half AL West, which placed 101st overall and which saw the A's finish 1.5 games ahead of the Rangers and two and a half over the White Sox. Of course, neither the fans nor the players understood that the games completed before the strike would have such consequences on the postseason and so it is difficult to construe these as true races.

1908 takes the top spot as the less famous AL race takes 13th in our rankings. Detroit, Cleveland, and Chicago battled it out and finished within a game and a half of each other. The Naps (as the franchise was then known in honor of their player-manager Nap Lajoie) won 16 of 18 to edge in front of Detroit in late September, punctuated by Addie Joss's perfect game on October 2nd against the White Sox, whose hurler Ed Walsh himself struck out 15. The Tigers, however, would take the pennant by a half game on the final day with a win over Chicago. A controversy ensued because the Tigers were not required to make up a rainout, causing the powers that be to establish a new rule requiring all ties and rainouts affecting a pennant race to be replayed.

Well, sort of.

The 1938 season was interrupted for several days in the wake of the strongest hurricane to hit New England in recorded history and that took an estimated 600 lives. Perhaps coincidentally or perhaps not, after play resumed on September 22nd the Cubs went on to win ten in a row on their way to the NL pennant (discussed below). What is not coincidental, however, is that on September 18th the approaching hurricane caused both the Cubs and Pirates to play tie games. Due to the hurricane the games were not able to be replayed and under the rules of the time the games were not allowed to be played after the last scheduled game of the season. The rule was changed in 1951 in the AL and 1955 in the NL making 1938 the last season in which un-played games affected the outcome of a race.

Of interest here as well is the 1915 season, in which the Federal League race (ranked 21st) edges out the AL race (ranked 22nd) 42.5 to 41.5 and the two together rate the season as the 5th best. In the AL the Red Sox won 101 games on the pitching prowess of Babe Ruth and Smokey Joe Wood and edged out the Tigers, who themselves won 100 times, by 2.5 games. But in the Federal League something happened that had never happened before and didn't happen again until 2001: the two teams at the top finished in a tie by the traditional method of measuring games behind.


Team Name G W L T PCT GB RS RA
Chicago Whales 155 86 66 3 0.5658 - 640 538
St. Louis Terriers 159 87 67 5 0.5649 - 633 527
Pittsburgh Rebels 156 86 67 3 0.5621 0.5 592 524


The Whales, led by player-manager Joe Tinker, edged out the St. Louis Terriers and aging star pitcher Eddie Plank by .0009 as the winner was decided on percentage points since the league did not have a rule for the playing of tie breakers. 1915 was the second and final season of the Federal League as a settlement ensued whereby the Federal League owners of the Chicago and St. Louis franchises purchased the Cubs and Browns, with the happy result that what would become Wrigley Field was brought into the NL.
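To see how a tie in games behind can coexist with a gap in percentage points, here is a small sketch that recomputes the PCT and GB columns from the wins and losses in the table above. The function names are hypothetical; the games-behind formula is the traditional one (half the sum of the win gap and the loss gap), and ties are left out of the percentage, which matches the PCT column shown.

```python
def win_pct(wins, losses):
    """Winning percentage with tie games ignored."""
    return wins / (wins + losses)


def games_behind(leader, team):
    """Traditional games behind: average of the win gap and the loss gap."""
    leader_wins, leader_losses = leader
    team_wins, team_losses = team
    return ((leader_wins - team_wins) + (team_losses - leader_losses)) / 2


# (wins, losses) from the 1915 Federal League table above.
standings = {
    "Chicago Whales": (86, 66),
    "St. Louis Terriers": (87, 67),
    "Pittsburgh Rebels": (86, 67),
}

leader = standings["Chicago Whales"]
for name, (wins, losses) in standings.items():
    pct = win_pct(wins, losses)
    gb = games_behind(leader, (wins, losses))
    print(f"{name:20s} {pct:.4f} {gb:.1f}")
```

The Terriers had one more win and one more loss than the Whales, so the two gaps cancel in the games-behind figure while the extra loss drags their percentage just below Chicago's.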

In 2001 the NL Central (ranked 67th) duplicated the feat of the Federal League when the Astros and Cardinals finished with identical 93-69 records. Of course, the addition of the Wild Card in 1995 has typically made the playing of tie breakers unnecessary, although the tie-breaker between the Rockies and Padres last season for the Wild Card was a great end to a season in which the 2007 NL West battle ranked 36th (32.7). The unhappy result of a first-place tie left unplayed was duplicated in both the 2005 AL East (ranked 51st) and the 2006 NL West (ranked 107th).

Since divisional play began in 1969 the best overall set of races can be said to be 1985 where all four races earned Race Scores greater than zero. In particular the AL East (ranked 45th) and the NL East (ranked 53rd) were excellent. In the AL East the Blue Jays, led by their outfield of Jesse Barfield, Lloyd Moseby, and George Bell, captured their first flag winning 99 games and edging out the Yankees by beating them on the season's penultimate day 5-1. The AL West race was no slouch either as the Royals slipped past the Angels by winning three of four head-to-head matchups in the season's final weekend. In the NL East, the Cardinals edged the Mets by three games on the strength of a running attack that featured 314 stolen bases. In a 2005 article yours truly calculated that the version of "Whitey Ball" employed in 1985 contributed just over 30 runs to the Cardinals offense, a total that translates to about three wins and exactly their margin over the Mets.


* The White Sox eventually finished second on the strength of their pitching and the Orioles third on the performance of MVP Brooks Robinson but both teams were hurt by losses to poor teams down the stretch. The Sox dropped five of seven in one stretch to Washington, Cleveland, and Minnesota and the Orioles split a four game set with Kansas City and two of three to Minnesota in the final weeks.

Sunday, June 29, 2008

Ranking the Races

This post continues The Great Pennant Race Abstract series started several weeks back and references the races discussed in the book It Ain't Over 'Til It's Over: The Baseball Prospectus Pennant Race Book.



Using the definition in the introduction there have been 312 races (counting the four 1981 races twice because of the split season's two halves as well as the Federal League's two races) beginning with the 1901 season. Not all of them, or for that matter a majority of them, have resulted in the kind of drama and excitement discussed in many of the chapters of this book. And it's difficult if not impossible to quantify what makes up a great race but of course that's exactly our task here. I'll kick off this abstract by ranking the top 100 pennant races of all time.

Analyst Jim Albright, who over the years has been the most prolific analyst of Japanese baseball and writes for BaseballGuru.com, once developed a system for ranking the greatest Japanese pennant races. With a few tweaks, that's the system employed here.

Simply put and following Albright's lead, a great pennant race can be defined as one that contains three components: 1) it is close, 2) it is between good teams, and 3) the more teams involved the greater the excitement.

The first component speaks for itself but the second is a bit more controversial. While some may argue that the 2006 NL Central race was a great race, the limping Cardinals losing seven in a row from September 20-26, almost blowing a 7 game lead with 13 to play, and especially winning the division with just 83 wins, starts to take on a more comical look than it does one characterized by great baseball. The Cardinals did go on to take the distinction of the team with the fewest wins and lowest winning percentage (.516) ever to win the World Series (the 1987 Twins at 85 wins and a .525 winning percentage were next) but their regular season race just doesn't rise to the aesthetic level of a "Great Pennant Race". The same can be said for the 1973 National League East race as discussed in Race #8, where three teams finished within three and a half games and another at five games behind the Mets but where none of those teams finished above .500. The 1984 AL West race detailed in Race #9 is yet another that falls into this category and the list goes on.

The third component should also not engender much controversy as it's obvious that a five team scramble as in the 1964 NL race or the four team jostle in the AL in 1950 does much to add to the drama as the number of what if scenarios and outcomes multiplies.

The methodology therefore comprises four simple steps to calculate a "Race Score" where the higher the score the greater the pennant race.

  • First, subtract the number of losses from the number of wins for each team in the race (excluding the winning team). Teams with better records will record higher numbers, consistent with our second component mentioned above. Although this technique does not capture the dynamic nature of the race, it turns out that the position at which teams end is arguably the best determiner of the "closeness" of the race. A more dynamic approach ranks more highly those races where teams hung around within striking distance but never really challenged the front-runner, but doesn't do as well with races where a furious August or September comeback brings a team back into contention.


  • Second, raise the number of games behind each team finished to the power of 1.65 and subtract it from the result of the first step. This has the effect of combining our first and second components since teams with better records who finish fewer games behind will receive higher scores. A team like the 2006 Astros, who finished 1.5 games behind with a record of 82-80, receives a score of 0.048 while the 1927 Cardinals, finishing an equal distance behind but with 92 wins, receive a score of 29.05. (Albright originally squared the number of games behind but I found that raising it to a slightly lower power allows us to consider more races and be a little more forgiving with regard to the second component.)


  • Next, the teams with negative scores are eliminated and the totals summed up for each race.


  • Finally, add a bonus for the number of teams (excluding the winning team) in the race. For one team simply multiply the Race Score by 1, for two teams give a 10% bonus and multiply by 1.1, for three teams it's 20% at 1.2, four teams 30%, and so on. (Originally Albright gave bonuses in increments of 20%, 40%, 60% etc. but this pushed multi-team races too far to the top for my taste since good races like the 1942 NL race between the Dodgers and Cardinals and the 1993 NL West race between the Braves and Giants would otherwise fall precipitously in the rankings.)
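The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Trailer` type and the function names are hypothetical, the bonus is assumed to count only the trailing teams that survive the negative-score cut, and the 61 losses for the 1927 Cardinals are inferred from the 29.05 figure rather than stated in the text.

```python
from dataclasses import dataclass
from typing import Sequence

GB_EXPONENT = 1.65  # exponent quoted in the text (Albright originally used 2)


@dataclass
class Trailer:
    """A non-winning team in a race (hypothetical helper type)."""
    wins: int
    losses: int
    games_behind: float


def team_score(team: Trailer) -> float:
    """Steps 1 and 2: (wins - losses) minus games_behind ** 1.65."""
    return (team.wins - team.losses) - team.games_behind ** GB_EXPONENT


def race_score(trailers: Sequence[Trailer]) -> float:
    """Steps 3 and 4: drop negative team scores, sum the rest, apply the bonus."""
    kept = [score for score in map(team_score, trailers) if score >= 0]
    if not kept:
        return 0.0
    # Assumed reading of the bonus: 1 team x1.0, 2 teams x1.1, 3 teams x1.2, ...
    bonus = 1.0 + 0.1 * (len(kept) - 1)
    return sum(kept) * bonus


# The two single-team examples quoted in the second step.
astros_2006 = Trailer(wins=82, losses=80, games_behind=1.5)
cardinals_1927 = Trailer(wins=92, losses=61, games_behind=1.5)  # losses inferred
print(round(team_score(astros_2006), 3))
print(round(team_score(cardinals_1927), 2))
```

With these inputs the two printed values come out to 0.048 and 29.05, matching the figures in the text. Swapping `GB_EXPONENT` for 2 or changing the 0.1 bonus increment to 0.2 gives a feel for how Albright's original settings would reshuffle the list.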


What that leaves us with are 161 of the 312 races or just over 50% that garner a positive Race Score. Without further ado then, here are the top 100 pennant races of all time with those discussed in detail in this book both bolded and italicized.


    Rank Year Lg Div Score Teams Winner
    1 1908 NL 90.2 3 Chicago Cubs (99-55)
    2 1950 AL 77.8 4 New York Yankees (98-56)
    3 1948 AL 72.0 3 Cleveland Indians (97-58)
    4 1920 AL 71.2 3 Cleveland Indians (98-56)
    5 1962 NL 70.5 3 San Francisco Giants (103-62)
    6 1964 AL 68.0 3 New York Yankees (99-63)
    7 1964 NL 64.6 4 St. Louis Cardinals (93-69)

    8 1977 AL East 62.6 3 New York Yankees (100-62)
    9 1927 NL 61.5 3 Pittsburgh Pirates (94-60)
    10 1956 NL 59.2 3 Brooklyn Dodgers (93-61)
    11 1967 AL 57.4 4 Boston Red Sox (92-70)
    12 1924 NL 53.8 3 New York Giants (93-60)
    13 1908 AL 52.5 3 Detroit Tigers (90-63)
    14 1928 NL 51.7 3 St. Louis Cardinals (95-59)
    15 1942 NL 50.9 2 St. Louis Cardinals (106-48)
    16 1940 AL 46.0 3 Detroit Tigers (90-64)
    17 1916 NL 44.7 3 Brooklyn Robins (94-60)
    18 1955 AL 43.6 3 New York Yankees (96-58)
    19 1993 NL West 43.0 2 Atlanta Braves (104-58)
    20 1966 NL 42.8 3 Los Angeles Dodgers (95-67)
    21 1915 FL 42.5 3 Chicago Whales (86-66)
    22 1915 AL 41.5 2 Boston Red Sox (101-50)
    23 1988 AL East 41.4 5 Boston Red Sox (89-73)
    24 1978 AL East 39.7 3 New York Yankees (100-63)
    25 1904 AL 39.4 3 Boston Americans (95-59)
    26 1907 AL 38.9 3 Detroit Tigers (92-58)
    27 1928 AL 38.5 2 New York Yankees (101-53)
    28 1906 AL 37.0 3 Chicago White Sox (93-58)
    29 1949 AL 37.0 2 New York Yankees (97-57)
    30 1949 NL 37.0 2 Brooklyn Dodgers (97-57)
    31 1941 NL 36.5 2 Brooklyn Dodgers (100-54)
    32 1951 NL 36.0 2 New York Giants (98-59)
    33 1916 AL 34.1 3 Boston Red Sox (91-63)
    34 1909 NL 33.1 2 Pittsburgh Pirates (110-42)
    35 1946 NL 32.9 2 St. Louis Cardinals (98-58)
    36 2007 NL West 32.7 3 Arizona Diamondbacks (90-72)
    37 1980 AL East 31.9 2 New York Yankees (103-59)
    38 2004 AL West 31.8 3 Anaheim Angels (92-70)
    39 1930 NL 31.5 3 St. Louis Cardinals (92-62)
    40 1922 AL 31.0 2 New York Yankees (94-60)
    41 1980 NL West 30.9 3 Houston Astros (93-70)
    42 2002 NL West 30.0 3 Arizona Diamondbacks (98-64)
    43 1945 NL 29.9 2 Chicago Cubs (98-56)
    44 1934 NL 29.9 2 St. Louis Cardinals (95-58)
    45 1985 AL East 29.9 2 Toronto Blue Jays (99-62)
    46 1909 AL 29.1 2 Detroit Tigers (98-54)
    47 1905 AL 28.9 2 Philadelphia Athletics (92-56)
    48 1952 AL 28.9 2 New York Yankees (95-59)
    49 1987 NL East 28.6 3 St. Louis Cardinals (95-67)
    50 1935 NL 28.2 2 Chicago Cubs (100-54)
    51 2005 AL East 28.0 2 New York Yankees (95-67)
    52 2004 AL East 27.9 2 New York Yankees (101-61)
    53 1985 NL East 27.9 2 St. Louis Cardinals (101-61)
    54 1950 NL 27.1 3 Philadelphia Phillies (91-63)
    55 1999 NL Central 27.0 2 Houston Astros (97-65)
    56 2006 AL Central 27.0 2 Minnesota Twins (96-66)
    57 1997 AL East 26.9 2 Baltimore Orioles (98-64)
    58 1987 AL East 26.9 2 Detroit Tigers (98-64)
    59 1979 NL East 26.9 2 Pittsburgh Pirates (98-64)
    60 2002 AL West 26.2 2 Oakland Athletics (103-59)
    61 1937 NL 25.9 2 New York Giants (95-57)
    62 2000 NL East 25.0 2 Atlanta Braves (95-67)
    63 1982 AL East 25.0 2 Milwaukee Brewers (95-67)
    64 1965 NL 24.9 2 Los Angeles Dodgers (97-65)
    65 1974 NL West 24.2 2 Los Angeles Dodgers (102-60)
    66 1982 NL West 24.0 3 Atlanta Braves (89-73)
    67 2001 NL Central 24.0 2 Houston Astros (93-69)
    68 1991 NL West 23.0 2 Atlanta Braves (94-68)
    69 1935 AL 22.9 2 Detroit Tigers (93-58)
    70 1924 AL 22.9 2 Washington Senators (92-62)
    71 2007 AL East 22.9 2 Boston Red Sox (96-66)
    72 1938 NL 22.7 3 Chicago Cubs (89-63)
    73 1918 AL 22.7 3 Boston Red Sox (75-51)
    74 1914 FL 22.1 3 Indianapolis Hoosiers (88-65)
    75 1921 AL 22.0 2 New York Yankees (98-55)
    76 1926 NL 21.9 3 St. Louis Cardinals (89-65)
    77 1919 AL 21.1 2 Chicago White Sox (88-52)
    78 1973 NL West 21.1 2 Cincinnati Reds (99-63)
    79 1954 AL 21.1 2 Cleveland Indians (111-43)
    80 1944 AL 21.0 2 St. Louis Browns (89-65)
    81 1993 NL East 19.9 2 Philadelphia Phillies (97-65)
    82 1969 NL West 19.8 3 Atlanta Braves (93-69)
    83 2000 AL West 19.7 2 Oakland Athletics (91-70)
    84 1939 NL 19.0 2 Cincinnati Reds (97-57)
    85 1978 NL West 18.5 2 Los Angeles Dodgers (95-67)
    86 1945 AL 18.0 2 Detroit Tigers (88-65)
    87 1952 NL 18.0 2 Brooklyn Dodgers (96-57)
    88 2003 AL West 17.9 2 Oakland Athletics (96-66)
    89 1951 AL 17.8 2 New York Yankees (98-56)
    90 1921 NL 17.2 2 New York Giants (94-59)
    91 1980 NL East 17.0 2 Philadelphia Phillies (91-71)
    92 1985 AL West 17.0 2 Kansas City Royals (91-71)
    93 1996 NL West 17.0 2 San Diego Padres (91-71)
    94 2004 NL West 16.9 2 Los Angeles Dodgers (93-69)
    95 1959 NL 16.5 3 Los Angeles Dodgers (88-68)
    96 1999 AL East 16.2 2 New York Yankees (98-64)
    97 1923 NL 16.0 2 New York Giants (95-58)
    98 1926 AL 15.9 2 New York Yankees (91-63)
    99 1954 NL 15.8 2 New York Giants (97-57)
    100 1977 NL East 15.8 2 Philadelphia Phillies (101-61)


    Nine of the thirteen races discussed in this book make the top 100 with the 1908 race (Race #4) taking the top spot by a fairly wide margin and three others finishing in the top eleven. The 1972 AL East (Race #7) finished 103rd, the 2003 NL Central (Race #6) placed 104th. That leaves only the 1973 NL East and 1984 AL West completely out of the 157 races that finished with positive Race Scores. In case you're wondering, the 2006 NL Central captured the 161st and final spot with a Race Score that rounds to 0.0.

    Some of you will no doubt quibble with this list and indeed some may detect a chronological bias which will be discussed later. Regardless of the methodology no list would be perfect and this list is offered more as a secondary look than as a definitive ranking. Arguing passionately about the minutiae of the game is one of the many aspects of baseball that we love as fans. Let the debate begin.

    Wednesday, June 18, 2008

    The Great Pennant Race Abstract


    Yes, it's a little early to get all excited about the coming pennant races but this is a topic I've been meaning to visit ever since the Baseball Prospectus book, It Ain't Over 'Til It's Over: The Baseball Prospectus Pennant Race Book, came out in paperback a few months ago. In any case, I contributed to that book in the appendix titled "The Great Pennant Race Abstract" by creating a series of graphs that highlighted each of the thirteen pennant races that were discussed in the chapters of the book.

    The original vision for the abstract was a little more grand and included a series of mini-essays highlighting aspects of other pennant races not discussed in detail in the book. While I did in fact pen that longer version of the abstract that stretched to over 12,000 words, it couldn't be accommodated in the book. So for the next few days I'll publish those mini-essays here beginning today with the introduction to the abstract. These are as they were originally written with the exception of updating them to include the 2007 season. Hopefully, you'll find them entertaining and it will spur you to check out the book if you haven't already. As is the case with the other books published by BP, this one combines good baseball writing with the kind of analysis you typically read in the work of Nate Silver, Joe Sheehan, Christina Kahrl et al. over on the web site.

    As for myself, I'm a little biased to his turn of mind I suppose but Silver's chapter on the 1944 American League race featuring the St. Louis Browns ("The Home Front") is probably my favorite as it combines the narrative of the Browns' first and only AL pennant with the effect of the war on baseball, ending with a counterfactual 1944 race based on an estimate of how much talent each team lost and how it was replaced (hint: the Browns got off relatively scot-free, enabling them to take the crown).

    So without further ado, here's the introduction of the Great Pennant Race Abstract...



    Historian Jules Tygiel has argued that the men who shaped baseball in the 1850s and 1860s fashioned it in their own image through the embrace of the "modern, rational, scientific, worldview that had grown prevalent in mid-nineteenth century America."* Consistent with that world view the chaos of various versions of "town ball" was replaced by the fixed boundaries of field, team size, and game length as baseball exploded in popularity immediately before and after the Civil War.

    Embedded in that desire for rationalization was the felt need to faithfully record the events of the game, hence the first box score, then termed an "abstract", appearing in the New York Morning News on October 22, 1845. From those humble beginnings quantification took root and with the pioneering Henry Chadwick leading the way, baseball and numbers were forever intertwined.

    Such is our legacy as baseball fans.

    That legacy has been exercised, some would say with a vengeance, again and again throughout this book. Our authors have taken you on a journey through the ins and outs of thirteen of the greatest pennant races in the history of baseball. These were selected using Clay Davenport's methodology described in the introduction. But the mind of the baseball fan, obsessed as it is with quantification, probably won't rest there. Is there an alternate way to rank the races? What about all the races that didn't make the list of thirteen? How do they stack up? What does the distribution of great races look like over time? What are their numeric oddities and highlights?

    Look no further for in this abstract I'll present a series of topics brimming with analysis and information nuggets to satiate you the reader and fan. Each mini-essay touches on a theme embedded in one or more pennant races, which for our purposes here are defined as the American, National, and Federal League regular season races (including tie-breakers) beginning in 1901 and extending through the divisional races (thereby also termed pennant races) of 2007 and excluding 1994 where no post season teams were named and hence where there could be said to have been no race. Enjoy.


    * Past Time: Baseball As History by Jules Tygiel. Oxford University Press, New York, 2000. ISBN 0195089588.

    Sunday, June 08, 2008

    Crazy for Crazy '08


    “So grandly contested were both [pennant races], so great the excitement, so tense the interest, that in the last month of the season the entire nation became absorbed in the thrilling and nerve racking struggle, and even the Presidential campaign was almost completely overshadowed.”
    - Sporting Life, October 17, 1908

    Before my attention and allegiance shifted due to recent and happy events, I was very pleased to receive Cait Murphy’s Crazy ’08: How a Cast of Cranks, Rogues, Boneheads, and Magnates Created the Greatest Year in Baseball History as a Christmas present. Of course as a lifelong Cubs fan my main interest was in reliving and hopefully foreshadowing a time when, in the words of one Washington sportswriter of the time, they “were grizzlies these Cubs, Ursine Colossi who towered high and frowningly and refused to reckon on anything but victory.” And for Cubs fans perhaps there is something special in the symmetry of the centennial of the Cubs’ last World Series victory as this year’s edition took the league’s best record into June – a feat that more than one source reminds us was last accomplished by the franchise in, yes, you guessed it, 1908. It remains to be seen, however, whether Lou Piniella’s Cubs will be able to say as 1908’s manager Frank Chance (known at the time as the “Peerless Leader” or simply “P.L.” for short) did, with that air of arrogance and without sounding ridiculous, “Who ever heard of the Cubs losing a game they had to have?”

    But even with my attention somewhat diverted, I shouldn’t have been surprised that in this book Murphy, an assistant managing editor at Fortune magazine, goes so far beyond the Cubs, the Merkle game and its aftermath, that any baseball fan or even history buff, will find it entertaining and a joy to read. Although the book focuses on the National League race it should not be forgotten that the American League race was almost its equal and Murphy devotes a chapter (“That Other Race”) to it as well.

    The book follows a mostly chronological course beginning with the events of the 1907-1908 offseason. From the now all-too-familiar inaction in the face of the growing problem of gambling to moves like the St. Louis Browns signing the enigmatic southpaw Rube Waddell to rules changes including the sabermetrically questionable adoption of the modern sacrifice fly rule, and a rule prohibiting pitchers from soiling one of the half dozen or so new balls that enter play each game, Murphy does a fine job of providing context to the season and the times by periodically recalling events from the recent past.

    From a baseball perspective her description of the playing conditions in the chapter “The Hot Stove League” is excellent, recapping the evolution of the game on the field in all three primary dimensions and generating one of my favorite lines in the discussion on defense where Murphy quite correctly notes that baseball “is Darwinian in its results but Newtonian in its processes.” Those Darwinian processes, already well established in 1908 and applying their mode of selection, led to the development of relief pitchers, pinch hitters and runners, base coaches, platooning, defensive positioning and strategies, and much more. What accompanied them was a march towards standardization that worked together to contribute to a gradual perfecting of the craft of baseball that we modern fans are the happy beneficiaries of. In the end, she concludes that while there are many things the modern fan (“crank” or “bug” as they were called then) would find strange including whiskey in the stands and the occasional player smoking on the field, the game in 1908 would be entirely recognizable (hot dogs and “Take Me Out to the Ballgame” which made its debut in 1908 to name a couple) in a way that other major sports with shorter pedigrees would not be.

    At the same time she argues that although in 1908 baseball is already big business and commands an air of respectability that it lacked just a few years before, the 1908 season – with the Merkle game and its aftermath including riots, legal wrangling and at least one death, acting as a catalyst – is when “baseball itself makes its turn into the modern era.” One sign of this new era is that 1908 was the final season for Pittsburgh’s Exposition Park (the site of which sits just east of present PNC Park on the banks of the Allegheny) and Philadelphia’s Baker Bowl, the former being replaced by Forbes Field and the latter by Shibe Park, the first fireproof park made of steel and concrete and built in French Renaissance style for a cool $457,000. Other owners were quick to follow with both Charles Comiskey and Charlie Ebbets buying up land that would eventually host their namesakes.

    Along the way the baseball that follows is also nicely set up through opening chapters on the Giants (“Land of the Giants”) and the Cubs (“Origins of a Dynasty”). In them Murphy takes a look back at how each of the primary combatants in the ’08 race was built (the Giants not so fairly, it turns out, in a seedy story of destroying the Orioles and using the Reds concocted by John Brush, Andrew Freedman, and John McGraw), interspersed with fascinating profiles of McGraw, Frank Chance, and Johnny Evers. By the time the fourth chapter, titled “Opening Days”, rolls around the reader is well positioned to enjoy the drama that follows.

    Off the field the mood of the country and the times is set by the inclusion of six “Time-Outs” or sidebars that periodically appear at the ends of chapters. For example, “Chicago on the Make” closes out the chapter on the building of the Cubs and details the evolution of the city and its leaders in dealing with corruption at various levels that had become rampant by the turn of the century. In other time-outs Murphy recounts the grisly affair of one of America’s first female serial killers, Belle Gunness, the Doubleday myth, the position and prospects of African-American ballplayers, the scare of early twentieth century anarchism, and finally an entertaining list of the things that some players did in 1908 to “court good luck and drive away hoodoos” (“hoodoos” being the term then in vogue and denoting curses and bad luck). Each is fascinating and provides just enough additional context to give the reader a feel for the place of the game in the first decade of the twentieth century.

    But of course the main thrust of the book is the narrative of the 1908 National League season, and here Murphy does a fine job by breaking the season down into six chapters, with two other chapters devoted specifically to Merkle games one and two. The latter chapter comes complete with a timeline beginning at dawn and running until game time that serves to build anticipation of the events that follow. In the earlier chapters recounting the ups and downs of baseball’s long season, rather than focus only on the Giants and Cubs, Murphy also takes the time to highlight key moments and performers of other teams, including Pirates shortstop Honus Wagner, who in 1908 had his finest season (.354/.415/.542) while his team fell just short in what became a three-way race after a furious run that saw the Bucs win 13 of 14 before losing to the Cubs at the West Side Grounds 5-2 on October 4th amidst a little controversy. Here we also find vignettes featuring Ty Cobb, Nap Lajoie, Hal Chase, Rube Waddell, and Cy Young among others, not to mention other actors in the season’s ultimate drama such as Mordecai “Three Finger” Brown, Roger Bresnahan, Joe Tinker, “Turkey” Mike Donlin, Jimmy Sheckard, Merkle of course, and “Giant Killer” Jack Pfiester, who is handed the ball in both Merkle games. And even though the story of the Merkle games, and to a lesser extent the season itself, has been told countless times, I’d rather not spoil any more of it since every fresh reading brings a new perspective and Murphy adds plenty of detail that I had either forgotten or had never known. As a final treat and one that fittingly puts a bookend not only on the season but on the personalities that defined the era, Murphy includes an epilogue that tracks the destinies of the major players, managers, and magnates after that special season.

    For me, one of the supreme pleasures of being a baseball fan is the way the game connects the past with the present, not only through its numbers, but through its places, stories, and the way that its seminal events are embedded in our culture. Baseball fans, and not just those rooting for the denizens of Wrigley Field, would be well served to remind themselves of how those connections were built and in a sense to maintain them by reading about one special season on its 100th anniversary.

    Monday, March 17, 2008

    Baseball's Trifecta

    This article originally appeared on Baseball Prospectus on September 28, 2006.



    September 28, 2006

    Schrodinger's Bat: Baseball's Trifecta
    by Dan Fox

    "I think if you come to the ballpark and you see Carl hit a triple, you've had a pretty good day. It's pretty much a signature play for him, because when he hits the ball down the line, or in the gap, he's thinking three. He never thinks two. He breaks triple. He wants triple, he takes triple."

    --Devil Rays manager Joe Maddon after Carl Crawford's triple on September 24.

    "Hey, big mouth, how do you spell triple?"

    --Shoeless Joe Jackson, to a heckling Cleveland fan who taunted him by asking if he could spell "illiterate." This was his response after hitting a triple.

    In the bottom of the sixth inning of last Sunday's Yankees/Devil Rays game, Carl Crawford pulled Mike Myers' 1-0 slider into the gap in right-center. The ball skidded past Bobby Abreu, and by the time he retrieved it and hit the cutoff man, both runners had scored and Crawford had coasted into third. It was his 15th triple of the season.

    As much as I disdain more or less arbitrary statistical milestones, the hit did draw some attention, since it made Crawford the first player in 76 years to hit at least 15 triples in three straight seasons. In that year, 1930, no fewer than three players were finishing a run of three or more years with 15 triples or more:


    Player              1930  1929  1928  1927  1926
    ------------------------------------------------
    Earle Combs           22    15    21    23
    Paul Waner            18    15    19    18    22
    Charlie Gehringer     15    19    16


    By comparison, Crawford hit 19 triples in 2004, 15 last year, and now 15 this season. When asked after the game why he thought it had been so long since a player accomplished the feat, Crawford replied, "There are fast guys in the game who can hit, so I have no clue why guys haven't done it. That's not a stat that you go out and try to do every year. That's a stat that just happens."

    Crawford's achievement and his comment provide a springboard for this week's column, where we'll discuss triples and their accompanying historical trends.

    Historically Speaking

    The simple and somewhat tautological answer to Crawford's consternation regarding the lack of triples is that the triple has become increasingly rare over time. And just as a rising tide lifts all boats, a low tide grounds them. The following graph shows the number of triples per 500 at-bats plus walks for each year from 1901 through 2005:



    Notice that, as it did for offense in general, the robust environment of 1930 marked the high point for triples, with 6.8 triples hit per 500 AB+BB. The rate dropped immediately thereafter, to 5.7 in 1931 and 1932, and it never again reached as high as 5.3. It now seems to have stabilized at around 2.5.

    If there are fewer triples being hit, then it becomes less likely that an individual player will be able to hit 15 in three consecutive seasons. For example, a player who hits 15 triples would have a rate of 12.5 per 500 AB+BB. In 1930, a player who hit triples at 1.8 times the rate of the average player would end up with 15 triples, and eleven of the 73 players with 500 or more AB+BB hit 15 or more triples in 1930. In 2005, however, a 15-triple player would have to hit triples at a rate more than five times that of the average player, and just two players (Crawford and Jose Reyes, who hit 17) out of 140 with 500 or more AB+BB could do that. It should be noted that 2005 was a comparatively good year for three-baggers: since 1992, there have been ten seasons in which no player has hit 15 triples. In contrast, when 10-15% of the players hit 15 or more triples every year, there is a very good chance that one or more of those players will repeat for three consecutive years.
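    The arithmetic in the paragraph above can be checked with a few lines of Python. This is only a sketch: the 600 AB+BB season length is an assumption inferred from the 12.5 figure (15 triples x 500 / 600), the league rates are the rounded 6.8 and 2.5 quoted in the text, and the function and variable names are made up for illustration.

```python
def rate_per_500(triples, ab_plus_bb):
    """Triples scaled to a 500 AB+BB basis."""
    return triples * 500 / ab_plus_bb

# Assumption: a full season of about 600 AB+BB, which is what the
# 12.5-per-500 figure for a 15-triple player implies.
SEASON_AB_BB = 600

# League-wide triples per 500 AB+BB, as quoted in the text (rounded).
LEAGUE_RATE = {1930: 6.8, 2005: 2.5}

player_rate = rate_per_500(15, SEASON_AB_BB)

# How many times the league-average rate a 15-triple hitter needs.
multiples = {year: player_rate / rate for year, rate in LEAGUE_RATE.items()}

for year, multiple in multiples.items():
    print(f"{year}: {multiple:.1f}x the average rate")
```

    With the rounded inputs the 2005 multiple comes out at about five; the "more than five times" in the text presumably reflects the unrounded league rate.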

    As an aside, the general reduction in triples makes the performance of Cory Sullivan on April 9 even more of a fluke. In the top of the fifth inning, Sullivan hit two triples in a seven-run outburst that helped the Rockies beat Jake Peavy and the Padres 10-4. Those two triples tied a record held by ten others, although most recently accomplished by the Senators' Gil Coan in 1951.

    The graph also shows the spike in triples that occurred between the years 1974 and 1980. As you can see, triples had been declining steadily since 1930, reaching a low of 2.72 in 1973. From there they began to climb again, reaching a high point of 3.71 in 1977 before gradually declining to settle back down at the 1973 level by 1986. Note that 1977, like 1930, was a relatively big offensive year, with teams scoring 4.47 runs per game. Offensive levels continued to rise throughout the period, and so it can't simply be chalked up to more hits resulting in more triples. But since there weren't new parks being introduced, and expansion occurred in the middle of the spike and not at its start, it's not obvious what might have caused the outburst.

    At first blush, one might posit that there was a general trend towards valuing speed that began in the early to mid-1970s, as young players like Ron LeFlore, Tim Raines, Willie Wilson, and Omar Moreno began to establish themselves. The increase in stolen bases, however, is more gradual than that for triples, and actually began around 1959 (with the "Go-Go Sox" and Maury Wills playing a large part) with a steeper increase in the mid-1970s that peaked in another year that was good for offense, 1987, as shown in the following graph:



    While all of this is interesting, it kind of tiptoes around the answer that Crawford is looking for (he probably doesn't really care, but play along). Although there doesn't appear to be consensus among the analytical community, the following are the theories most often discussed as to why the triple has become relatively rare:


    • Better Fielders. One of the more interesting questions is to consider how the game has changed as the players have become more athletic. Clearly the speed, strength, size and athletic ability of the average professional baseball player in 2005 exceed that of one in 1920. The question is, how does this affect the game and the evaluation of performance? This was recently touched on by Phil Birnbaum on his Sabermetric Research blog, and was the subject of a thought-provoking chapter by Nate Silver in Baseball Between the Numbers.

      As an example of such an effect, the late Stephen Jay Gould argued that a rising level of play inching closer to the "right-wall" of human ability coupled with stabilization of the game itself have conspired to decrease the variability in seasonal batting averages, making it far more difficult to hit .400 now than in years past. The epitome of Gould's argument is that Tony Gwynn had less opportunity than Ty Cobb to exploit the inferiority of others.

      Something like this may be happening with triples as well. The theory is that as fielders have become bigger, faster, and boast better throwing arms, would-be triple hitters have had a more difficult time exploiting their opponents, and thus rack up fewer three-baggers. In addition, the standardization of positioning (including the idea that outfielders played shallower in the past) and cutoffs have added to the difficulty. Although baserunners have also become faster, this theory would argue that the improvement in fielding ability and techniques has outstripped the increase in baserunner speed.

    • Park Configuration. This is a corollary to the first theory. Early in the century, ballpark dimensions were far less standardized than today. For example, the Huntington Avenue Grounds where the Red Sox played from 1901-1911 featured a left-center field fence 440 feet away, and a centerfield wall 530 feet from home plate from 1901-1907, and then at 635 feet starting in 1908. Similarly, the center field fence at Forbes Field was 462 feet away in 1909, and at the Polo Grounds, center field ranged from 430 feet in 1931 to 505 feet in 1949. Don't forget that while these and other ballparks in the two eight-team leagues had one or more long distances, they also had lots of corners and edges that made for unpredictable caroms. All of this adds up to situations which surely allowed hitters more opportunity to leg out triples.

      Over time, standard dimensions (335/375/400/375/335) made their way into the game, diminishing the opportunity for strange bounces and balls rolling towards distant fences with outfielders in hot pursuit. For my money, the combination of this and the first cause probably explains the lion's share of the overall historical trend.

    • Risk Aversion. As mentioned previously, as offensive levels rise, the relative importance of stolen bases decreases. The same reason causes triples to decline in value; the marginal benefit of stretching a double into a triple is lessened as the probability of scoring from second base increases. A quick look at Run Expectancy Matrices from various years would bear this out as well as the graph presented in my column on Win Expectancy. The argument, then, is that as offensive levels have risen over time, triples have decreased as a result of their lessening strategic importance.

      While the premise of this theory is certainly true, one doubts whether calculations like this are taken into account either consciously or subconsciously. More problematic, however, is the fact that runs per game have not increased over time, therefore cutting the legs out from under this theory. Contrary to the steadily downward-sloping line in the first graph, run scoring was actually higher throughout the 1920s and into the early 1930s than at any other time, and after diminishing to reach its low point in 1968 (3.42 runs per game per team), it has steadily increased since then:



    • Player Aging. No discussion of triples would be complete without at least a brief look at the effect of age. As Clay Davenport noted in his essay "Graying the Game" in Baseball Prospectus 2002, and subsequently reinforced by Nate Silver last season, the player population is aging, and has been for quite some time. This has an impact on triples, since older players lose foot speed and don't hit as many as younger ones do. The graph below shows triples per 500 AB+BB for all players since 1901, player-seasons from 1901-1935 and seasons from 1936-2005:



      Even at a time when triples were much more common (the orange line), triples peaked at age 22 and steadily declined through age 40. You'll also notice that the slope of the line for players in the first part of the 20th century is not quite so steep as it is for those since. I also find it very interesting that the slope of the line from ages 22 through 35 for all players is very nearly straight, indicating an extremely uniform decrease with age.

      Clearly, players don't hit as many triples as they get older, but the general aging of the player population cannot account for the overall decrease in triples. Just considering players 25 years old or younger, those who played since 1936 hit triples at a rate of 3.72 per 500 AB+BB, while those who played before 1936 hit them at a rate of 6.73. Keep in mind that it's also very likely that the average speed of players 25 years old and younger in the major leagues today is greater than that in the Deadball Era, meaning that other factors such as fielding prowess and changing park configurations are much more important to the overall trend.


    "How do you spell triple?"

    The triple is often called the most exciting play in baseball, and for good reason. There is no play that involves as many players, lasts as long, and concludes so often with a bang-bang crescendo. As we've seen, there are a variety of reasons that have conspired to make it a much rarer event today than it was in days past. These include increased standardization and ability on defense, less variability in park dimensions, risk aversion, and perhaps an aging player population. Whatever the combination and relative importance of these different causes, rather than wring our hands at its disappearance, let's instead appreciate the feat for its increased difficulty and marvel at those, like Carl Crawford, who can do it with regularity.

    Tuesday, January 22, 2008

    The Catch

    Saw a link to this photo come across the SABR listserv with the author wondering whether this really does depict "The Catch" made by Willie Mays in game one of the 1954 World Series. I hadn't seen this photo before and if anyone has any comments on it I'll pass them along.



    And for those interested in the background of "The Catch", here's a snippet from the Ken Burns documentary featuring George Will and Bob Costas.

    Saturday, January 12, 2008

    Strike Zones, Trilobites, and a Vicious Cycle

    Last week I ran the first in a series of three columns I wrote on hit batsmen. Today it's time for the second in the series originally published in May of 2006. Enjoy.




    May 11, 2006
    Schrodinger's Bat: Strike Zones, Trilobites, and a Vicious Cycle
    by Dan Fox

    "If they knocked two of our guys down, I'd get four. You have to protect your hitters."
    --Don Drysdale

    "I hated to bat against Drysdale. After he hit you he'd come around, look at the bruise on your arm and say, 'Do you want me to sign it?'"
    -- Mickey Mantle

    In our last installment of Schrödinger’s Bat we began an investigation of hit batsmen by looking at the big-picture trends in the rate of hit batsmen since 1901. That exploration led to summarizing various theories that have been proposed over the years to explain the fluctuation of rates, including the physical hazard theory, the offensive context theory, the intimidation theory, the expansion theory, the new strike zone theory, and finally the aluminum theory. From among that group, we can say that the last one seemed to make sense for the recent upward trend that began circa 1985.

    Although I promised that this week we’d scrutinize the differences in hit batsmen rates since the introduction of the designated hitter in 1973, and discuss the theories proposed to explain it, last week’s column generated such a large volume of email that I thought it would be worth spending one more column on the big picture before moving on to the DH era.

    Big Picture Trends Redux
    Let’s start off by addressing a few of the more prevalent reader questions regarding the bevy of big picture trends discussed last week. Indicative of the questions received was this one from reader Marc Stone, where Marc touches on two aspects of HBP trends that the article overlooked.

    Nice job, Dan, but you left out one very useful comparison: how do changes in HBP compare to changes in BB rates and, to a lesser extent, K rates and pitches per PA.

    Reader Ryan Tippetts echoed the second part of that question by noting:

    My immediate thought, specifically regarding recent upward trends, was the modern trend of increased pitches per AB. Might it be as simple as because a batter sees more pitches he has more opportunities to be hit by a pitch?

    Thanks to Ryan and Marc, and to all the other readers who had similar comments. I have to admit that neither looking at walk and strikeout rates nor at pitches per plate appearance in comparison with the rate of hit batsmen had occurred to me. But of course all three suggestions make a lot of sense:


    • If pitchers are walking more batters at the same time they’re hitting more of them, that may be indicative of worse control (the “wildness theory”).

    • If strikeouts are strongly correlated with hit batsmen, then perhaps a more aggressive hitting style (the “free swinger theory”), or the intimidation of the HBP, or even changes in the strike zone are playing a role.

    • If pitchers are throwing more pitches overall, it does indeed provide more opportunity for hitters to get plunked (the “opportunity theory”) which in the end may be all that is required.


    To see whether the wildness or free swinger theories shed any light on the question of changes in HBP rates over time, we can add unintentional walks and strikeouts per 1,000 plate appearances for each league to the graph we showed last week:



    What you’ll notice is that up until around 1970, there appears to be some correlation between walk rate and HBP rate. Unfortunately, the correlation is the inverse of that which the wildness theory would predict. As walk rates increased from around 1920 through the late 1940s the rate of hit batsmen fell. As walk rates declined, the frequency with which batters were hit increased.

    In other words, one might be inclined to conclude that there is a more or less constant rate at which pitchers put batters on for free via the HBP or unintentional walk, at least based on the graph from 1901 through 1970. While that’s an attractive idea, and akin to the offensive context theory discussed last week, you can’t simply add the two rates, since hit batsmen are so much less frequent than walks--as evidenced by the fact that in order to get both on the graph, the scale of HBP is per 1,000 PA while that for walks is per 100 PA. As a result, the number of runners that pitchers put on for free is driven almost entirely by the number of walks.

    In any case, there appears to be no correlation over the past 35 years, as walk rates have been fairly steady, while the number of hit batsmen has increased dramatically.

    On the other hand, the free-swinger theory appears more promising. Strikeout rate does correlate pretty strongly with the HBP rate since around 1950, and in the 1910-1925 period as well. In fact, from 1950 through 2005 the correlation coefficients are a very healthy .72 and .69 for the American and National Leagues respectively, which can be interpreted to mean that strikeout rates explain around 50% (.70²) of the variation in HBP rates (or vice versa).
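    For readers who want to see where the "around 50%" comes from, here is a minimal Python sketch. The share of variation explained is the square of the correlation coefficient, so .72 and .69 work out to roughly 52% and 48%. The pearson_r helper and the toy series are illustrative assumptions, not the league data behind the graph.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Coefficients quoted in the text for 1950 through 2005.
reported_r = {"AL": 0.72, "NL": 0.69}

# Squaring r gives the share of variation the two rates have in common.
shared_variation = {lg: r ** 2 for lg, r in reported_r.items()}

for lg, share in shared_variation.items():
    print(f"{lg}: r = {reported_r[lg]:.2f}, r squared = {share:.2f}")

# Toy check with made-up numbers (not real rates): a perfectly
# linear pair of series should give r very close to 1.
toy_r = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
```

    Squaring throws away the sign, so the same calculation applies whether the two rates move together or in opposite directions.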

    But as every statistics professor drums into the heads of his students, correlation is not necessarily causation, and before 1950 the correlation is much weaker--in fact, for the preceding 25 years the two rates were moving in opposite directions. As a result, one might argue that the free-swinger theory holds since 1950 because the normative hitting style became more aggressive, resulting in hitters diving over the plate more frequently, which in turn results in more hit batsmen. Under this interpretation, during the 1970-1984 period, free swinging was less in vogue, and pitchers reacted with fewer brushback pitches, resulting in fewer HBP.

    An alternative theory noted by reader JMHawkins that would fit the same set of facts holds that an expanding strike zone, especially on the outside corner, forces hitters to stand closer to the plate and dive over it more frequently, resulting in more batters being hit. The expanded zone also happens to induce more strikeouts, so strikeout rate and HBP rate aren’t causally related, but both are related to this third factor. There is undisputed evidence that the strike zone expanded in 1963, and anecdotal evidence that the low outside corner became an increasingly rewarding target for pitchers in the last 20 years or so. As umpires reined in the zone after the redefinition in 1969 and the increased scrutiny around 2001, both strikeouts and hit batsmen fell. This “fluctuating strike zone theory” then explains why strikeout and HBP rate seem to mirror each other.

    In either case, we’d still need a theory to account for the preceding 25 years, when strikeouts rose and hit batsmen fell, although under the above theory it appears that those 25 years from 1925 to around 1950 are the exception and not the rule.

    To be honest, I was initially most hopeful about the opportunity theory. It's pretty well known that the number of pitches per plate appearance has been on the rise, so it makes intuitive sense, but when we try to look at this theory, we run into the problem that we don’t have complete play-by-play data--and hence pitch counts--for most of baseball's history. Despite the recent and very welcome additions to the work being done at Retrosheet we are still missing the vast majority of the data required to complete the picture from 1901 through 2005; the 49 seasons that Retrosheet provides are often missing pitch sequence data.

    Some alert readers (aka, the real stat geeks) may also be thinking that perhaps we could use pitch count estimators in order to estimate the number of pitches, and hence the rate at which batters are hit per pitch. Unfortunately, the basic estimators that are in use rely on constant multipliers for strikeouts and walks to estimate the number of pitches, and we’ve already taken those into account in the graph above. More complex estimators rely on estimates of balls-in-play rate (the percentage of pitches on which balls are put into play, which varies by league and year), which we don’t have historically. There are other factors that could also influence the result which models have difficulty capturing.

    However, we can look at data we do have, and that's as far back as 1988. You’ll recall that during the 1988-2005 period HBP rates have more than doubled. What we find, however, is that during that time the number of pitches per plate appearance has risen only around 5%. So it doesn’t look like the opportunity theory explains at least the most recent upward trend.


    Year P/PA
    1988 3.60
    1989 3.63
    1990 3.64
    1991 3.71
    1992 3.68
    1993 3.68
    1994 3.75
    1995 3.75
    1996 3.75
    1997 3.76
    1998 3.70
    2000 3.75
    2001 3.72
    2002 3.73
    2003 3.74
    2004 3.76
    2005 3.73
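
    For anyone who wants to check the size of that increase, here is a minimal sketch that computes the percentage change directly from the table above. The helper name is illustrative, and 1999 is left out because the table does not list it.

```python
# Pitches per plate appearance, copied from the table above (1999 is not listed).
p_per_pa = {
    1988: 3.60, 1989: 3.63, 1990: 3.64, 1991: 3.71, 1992: 3.68, 1993: 3.68,
    1994: 3.75, 1995: 3.75, 1996: 3.75, 1997: 3.76, 1998: 3.70, 2000: 3.75,
    2001: 3.72, 2002: 3.73, 2003: 3.74, 2004: 3.76, 2005: 3.73,
}

def pct_change(start, end):
    """Percentage change from start to end."""
    return (end - start) / start * 100.0

first_to_last = pct_change(p_per_pa[1988], p_per_pa[2005])
first_to_peak = pct_change(p_per_pa[1988], max(p_per_pa.values()))

print(f"1988 to 2005: {first_to_last:+.1f}%")
print(f"1988 to peak: {first_to_peak:+.1f}%")
```

    Measured either to 2005 or to the peak, the rise stays in the low single digits, nowhere near the doubling seen in HBP rates.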


    What do Trilobites and Jason Kendall Have in Common?
    Although the free-swinger and fluctuating strike zone theories (or some combination thereof) provide some insight, and the opportunity and wildness theories perhaps less so, the most often cited theory by readers not discussed in last week’s column is the “body armor theory.” A succinct explanation was provided by reader Jeff Bullington:

    This would only affect the recent rise, but what about the increased use of body armor? Would this be the 'contra-intimidation theory'?

    As Jeff noted, this is the polar opposite of the intimidation theory and holds that as hitters began to wear more and more protective gear, they’ve been less afraid of getting hit, allowing them to stand closer to the plate and be more aggressive about hanging in. It follows logically that pitchers would respond by upping the ante in an effort to move batters off the plate, and reclaim their rightful territory.

    This idea is akin to the evolutionary arms race between predator and prey, whereby one species evolves stronger protection in response to selection pressure from predators as has been speculated for trilobites, which in turn leads to selection pressure on predators to evolve accordingly.

    As arguments go, this is a particularly difficult one to measure quantitatively. What we can certainly see is that the use of protective gear--such as hard elbow and shin pads--has increased in the past 20 years. One only has to look at the protection worn by Craig Biggio, or Jason Kendall and consider his recent run-in with John Lackey to understand how that protection might affect the game. It’s probably not a coincidence that coming into 2006, Biggio's 273 HBPs rank second all-time, and Kendall ranks 8th with 197.

    That said, in 2002 Major League Baseball began enforcing rules that limited the use of protective gear to players with medical exemptions, such as the one employed by Barry Bonds, which allows him to wear his elbow armor. The rules also limited the size of the various pads and devices worn.

    Whether coincidentally or not, the recent Kendall incident notwithstanding, the rate of hit batsmen has stabilized since that time. This was also immediately after the rate had reached its apogee in 2001, when the AL set its all-time record in hit batsmen per 1,000 plate appearances and the NL its highest total since 1901.


    Year AL NL
    2001 10.67 9.92
    2002 9.90 9.17
    2003 10.21 9.86
    2004 10.40 9.60
    2005 9.52 10.05


    We can also note that although helmets have been mandatory for MLB players since 1956, ear flaps have only been enforced for players who reached the majors after 1983. Ear flaps do coincide with the recent upward trend, and although one can imagine there would be an attendant psychological boost for the hitter, it’s more difficult to believe that this relatively minor change would have had that large of an immediate impact. After all, players already in the league were allowed to use the old-style helmets, so the change was gradually phased in, and the head is the part of the body hit with the least frequency.

    But this does provide the opportunity to sneak in a quick trivia question: Who was the last player to wear a helmet without an earflap in a game and in what year? (Wait for it, we'll get to the answer at the bottom of the column.)

    So, whether or not body armor and the introduction of the ear flap are responsible for the twenty-year upward trend in HBP rates, an argument can be made that the crackdown on body armor has played a role in retarding the arms race.

    A Vicious Circle?
    Finally, reader Jake Slemp wrote to say that whatever the cause of an increasing or decreasing trend in hit batsmen, it would likely be self-sustaining and reinforcing. His reasoning:

    After all, hit batsmen beget more hit batsmen within the same game, which often beget still more in subsequent games between the two teams…which beget more in those games, etc.


    In other words, even a small increase in hit batsmen might form a feedback loop based on retaliation. This situation is often described in economic terms as a virtuous (if the results are favorable) or a vicious (if they are negative) circle, where each cycle continues the trend in the current direction until stopped by some outside force.

    To look at this “vicious circle theory,” we can use play-by-play data for 2001 through 2005 to examine the distribution of games by the number of hit batsmen. We can then compare the actual distribution with what would be expected if the hit batsmen were distributed randomly (in a binomial distribution) given the overall rate of HBP and the average number of plate appearances per game. What we find when we do so is as follows:


    HBP Games Expected
    7 1 0
    6 1 1
    5 10 10
    4 118 71
    3 455 394
    2 1626 1610
    1 3980 4325
    0 5953 5732
    1+ 6191 6412
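
    The "Expected" column comes from a binomial model of the kind described above. A minimal sketch of that calculation follows. The total number of games is the sum of the table's zero-HBP and one-or-more rows, but the plate appearances per game and the HBP rate are assumed round numbers rather than the values used in the column, so the output will only roughly resemble the table.

```python
from math import comb

def expected_games(total_games, pa_per_game, hbp_rate, max_hbp):
    """Expected number of games with k = 0..max_hbp hit batsmen under a binomial model."""
    expected = []
    for k in range(max_hbp + 1):
        prob = comb(pa_per_game, k) * hbp_rate**k * (1 - hbp_rate) ** (pa_per_game - k)
        expected.append(total_games * prob)
    return expected

TOTAL_GAMES = 12144   # 5953 games with no HBP plus 6191 with at least one (from the table)
PA_PER_GAME = 77      # assumption: plate appearances per game, both teams combined
HBP_RATE = 0.0099     # assumption: HBP per plate appearance (about 9.9 per 1,000)

for k, games in enumerate(expected_games(TOTAL_GAMES, PA_PER_GAME, HBP_RATE, 7)):
    print(f"{k} HBP: {games:8.1f} expected games")
```

    Each plate appearance is treated as an independent trial with the same chance of a hit batsman, which is exactly the assumption that retaliation would violate.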


    As you can see, the combined number of games in which zero through two batters are hit is pretty much in line with what would be expected. However, we do see that the frequency of games with three and especially four batters hit surpasses the numbers you'd expect, and there are fewer games with a single batter hit than expected. And of course this list provides the opportunity for a second trivia question: What teams were involved in the lone seven hit batsmen game of the past five years? (Again, answer appears at the bottom.)

    What this confirms is that retaliation is a likely factor in hit batsmen. Games where we would otherwise expect two batters to be hit can quickly turn into games where three or four are hit. We already knew that intuitively, but what we need to know is whether or not increased retaliation is responsible for the increasing number of hit batsmen.

    To look at this, we can calculate the expected number of games with various numbers of hit batsmen over four successive periods, starting in 1985.


    Actual vs Expected 1985-1989 1990-1994 1995-2000* 2001-2005
    5+ 850% 246% 322% 104%
    3 - 4 162% 125% 119% 123%
    0 - 2 100% 100% 99% 99%

    * Does not include 1997-1999.
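
    To show how the percentages in a table like this are derived, the sketch below groups the 2001-2005 game counts from the earlier table into the same three bins and divides actual by expected. The function name is illustrative. Because the published expected counts are rounded to whole games, the smallest bin (5+) will not match the table exactly.

```python
# Games by number of hit batsmen, 2001-2005: {hbp: (actual, expected)}, from the earlier table.
games = {
    0: (5953, 5732), 1: (3980, 4325), 2: (1626, 1610), 3: (455, 394),
    4: (118, 71), 5: (10, 10), 6: (1, 1), 7: (1, 0),
}

def actual_vs_expected(table, low, high):
    """Actual games as a percentage of expected games for low <= HBP <= high."""
    actual = sum(a for k, (a, _) in table.items() if low <= k <= high)
    expected = sum(e for k, (_, e) in table.items() if low <= k <= high)
    return 100.0 * actual / expected

for label, low, high in [("5+", 5, 7), ("3 - 4", 3, 4), ("0 - 2", 0, 2)]:
    print(f"{label}: {actual_vs_expected(games, low, high):.0f}%")
```

    The 3 - 4 and 0 - 2 bins reproduce the 123% and 99% shown in the 2001-2005 column.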

    As we saw with the 2001-2005 period, in all periods there are just about the expected number of games with zero, one, or two HBP. However, there are always more games than expected with three or four batters hit, and lots more with five or more hit.

    While this confirms that retaliation within games is probably a persistent feature of hit batsmen, it doesn’t appear as if blatant retaliation has increased over the past twenty years. Keep in mind, the HBP rate has doubled during that time frame. If anything, it would appear there are slightly fewer beanball wars now than in the past, perhaps as a result of the double-warning rule put into effect in 1994. Note that this conclusion holds even if you assume that the increase in games with three or more hit batsmen is completely due to wildness (after all, it’s certainly true that when a pitcher hits one batter he’s more likely to hit another simply due to control problems).

    What this doesn’t rule out is the idea that teams now employ a more subtle form of retaliation, whereby they will wait to take revenge in a subsequent series, and where the retaliation doesn’t escalate out of control. As a result, it would be possible that retaliation and escalation are to blame for the recent increase in hit batsmen, but it seems unlikely.

    However, even if retaliation is not the cause of the increasing rate of hit batsmen, the body armor theory may provide the starting point for the vicious circle that was interrupted by the new rules, starting in 2002.

    Error on the Side of Caution
    If nothing else, I hope that we’ve highlighted that in an activity as complex as baseball, there are usually many factors that contribute to the big-picture trends that we see. That’s true for hit batsmen as well as the more visible trends, like the offensive upsurge of the last dozen years or so. If there is a lesson to be learned here, it’s probably that we should all be more cautious of simple explanations and easy answers.

    Let’s wrap up with a couple of corrections from last week.

    First, when discussing the expansion theory I noted that expansion would have a tendency to dilute talent in both leagues. While that’s true to some extent, I was reminded by our own Christina Kahrl that actually the 1992 expansion draft was the first time players from both leagues were available in an expansion draft. Prior to that, for example in 1977, the expansion teams could only choose unprotected players from their own league. And in that 1992 draft, AL teams were able to protect more players than NL teams; it was not until the 1997 draft that all teams were able to protect the same number of players.

    Second, I noted last week that Ray Chapman was the only professional player ever fatally injured in a game. Reader Bill Johnson pointed out that Chapman was the only major-leaguer to be fatally injured by a beanball. Several minor leaguers were killed in the 1950s and 1960s including Otis Johnson in 1951.
    ---

    Okay, so you waited, here are a couple of answers. For the first trivia question, Tim Raines never wore an earflap in a 23-year career that spanned from 1979 through 2002. As quoted in an MLB article documenting it, he did not wear one because, being a switch hitter, he didn’t want to carry two helmets.

    The answer to the second question: On June 7, 2001, the A’s visited Anaheim to take on the Angels. In that game Jason Giambi was hit by Scott Schoeneweis following a first-inning home run by Frank Menechino. In the third inning, Schoeneweis then hit Menechino (one wonders if accidentally) and later in the inning also hit Olmedo Saenz. Barry Zito subsequently hit Tim Salmon in the 6th. Almost certainly not coincidentally, Schoeneweis again hit Menechino leading off the 8th. Later in that same inning, Mike Holtz entered the game and promptly plunked Eric Chavez for good measure. And just to round things out Scott Spiezio was hit by Mark Guthrie in the bottom of the 8th. Ouch.

    Wednesday, January 09, 2008

    Beautiful Theories and Ugly Facts

    Another golden oldie from the Baseball Prospectus archives originally published on May 4, 2006



    Schrodinger's Bat: Beautiful Theories and Ugly Facts
    by Dan Fox
    May 4, 2006

    “The great tragedy of Science--the slaying of a beautiful hypothesis by an ugly fact.”

    --British biologist Thomas H. Huxley (1825-1895)

    On April 22nd, Rockies setup man Jose Mesa drilled Giants shortstop Omar Vizquel in the back with his first pitch. The next day, Giants starter Matt Morris hit both Matt Holliday and Eli Marrero in the first eight pitches he threw and was tossed from the game, along with manager Felipe Alou and pitching coach Dave Righetti. That was followed by the customary warnings to both teams, in observance of the practice that Major League Baseball adopted in 1994.

    Later in the game, Jeff Francis hit Steve Finley and was not ejected, much to the consternation of what was left of the Giants coaching staff. Of course, under the double warning rule, the umpires still have discretion over whether to eject a pitcher after the warnings have been issued; a discretion that yours truly thinks is not exercised nearly as often as it should be. Finally, Ray King plunked Vizquel again in the 8th, and was ejected along with Rockies skipper Clint Hurdle.

    The Mesa/Vizquel feud dates back to 1998, when the two were still teammates with the Indians and Vizquel celebrated a spring training home run off of Mesa by doing a cartwheel afterwards. Things went downhill after the 2002 publication of Vizquel’s book Omar! My Life On and Off the Field, wherein Vizquel said of Mesa’s performance in Game Seven of the 1997 World Series:

    “The eyes of the world were focused on every move we made. Unfortunately, Jose's own eyes were vacant. Completely empty. Nobody home. You could almost see right through him. Not long after I looked into his vacant eyes, he blew the save and the Marlins tied the game.”

    Well, at least no one can accuse Vizquel of being the model teammate.

    Mesa then vowed to hit Vizquel every time he faced him, and he did exactly that on June 12, 2002, in the 9th inning of a 7-3 game when Mesa was pitching for the Phillies. And he hit him the next time the two faced each other, which was two Saturdays ago in Denver.

    Mesa is now appealing a four-game suspension handed down by Bob Watson. I kid you not, Rockies GM Dan O’Dowd said on the Rockies radio pre-game show on April 29th that he was surprised Mesa was suspended, and that he didn’t think Mesa was throwing at Vizquel. I know GMs like to stand by their players, but really…

    Putting the emotions and politics aside, of the more than 14,600 games that have been played since the beginning of the 2000 season, the April 23rd game marks the 138th time that four or more batters have been hit in the same game. Pondering that fact led me to take up the topic of hit batsmen in this week’s column.

    A Pair of Trends
    To lead off, it’s always good to have a historical perspective. In that vein, I offer the following graph that shows the number of hit batsmen per 1,000 plate appearances in both the American and National Leagues since 1901.



    There are several interesting aspects to this graph that lead us to ask two primary questions.

    First, you’ll notice that the number of hit batsmen has fluctuated fairly widely over time, from a high of 10.67 per 1,000 plate appearances in the American League in 2001 to a low of 2.82 in the American League in 1947. The rate at which batters were hit decreased steadily from the turn of last century through the late 1940s, and then increased for the next twenty years to a peak in 1968. It then decreased again until the early 1980s, but from 1985 it rose quickly through 2001, to a rate where it has since leveled off.
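
    For readers who want the unit spelled out, here is a small sketch of the rate used throughout, along with the spread between the two extremes just quoted. The league totals in the last line are hypothetical round numbers chosen only to show the conversion.

```python
def hbp_per_1000_pa(hit_batsmen, plate_appearances):
    """Hit batsmen per 1,000 plate appearances."""
    return 1000.0 * hit_batsmen / plate_appearances

# Extremes quoted in the text.
al_high_2001 = 10.67
al_low_1947 = 2.82

print(f"High-to-low ratio: {al_high_2001 / al_low_1947:.1f}x")

# Hypothetical league totals, only to show the unit conversion.
print(f"{hbp_per_1000_pa(900, 90000):.2f} HBP per 1,000 PA")
```

    By that measure, batters in the 2001 AL were hit nearly four times as often as those in the 1947 AL.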

    We humans love causal explanations for apparent trends like this, so the first question that comes to mind is: just what is it that can explain these changes over time?

    Secondly, as you can see, batters have historically been hit at slightly different rates in the two leagues, with the American League seeing more hit batsmen from 1909 through 1928, and the National League then doing so until 1950. The leagues then traded the title back and forth until 1970 when the AL would lead for more than 20 years until the strike-shortened 1994 season. Since that time the back and forth has returned, with the AL leading seven times and the NL five. The second question then is: what are we to make of these differences between the leagues?

    In the remainder of this week’s column we’ll tackle the first question related to the overall historical trends, and leave the second--which deals with league differences--for next week.

    The Big Picture Trend
    There have been a number of theories proposed attempting to explain the historical trends we see in the rate of hit batsmen. Let’s look at them.

    On August 16, 1920 Carl Mays of the Yankees hit Ray Chapman of the Indians in the head with a pitch. The next day, Chapman died and became the only professional player ever fatally injured in a game. Although Mays was vilified in some quarters, dirty balls were also held responsible; as a result, umpires began to replace balls that had been dirtied much more often in-game.

    At first reflection, any baseball fan might assume that this tragic event would have had an immediate impact on the way the game was played, with the result being that more pitchers were afraid to throw inside, which would reduce the number of hit batsmen. Additionally, fewer soiled balls in play would theoretically allow for their being spotted more easily by hitters, which might allow them to duck, dive, or dodge the inside pitch. In either case, we’ll call this the “physical hazard” theory to explain the reduction in hit batsmen.

    While it’s a nice theory, you can see from the graph that the longer trend in the reduction of batters hit had been operative in the American League since 1911, and in the National League stretching all the way back to 1901. In fact, contrary to the theory that the Chapman beaning may have had a dampening effect, a closer examination of the period between 1919 and 1925 reveals that hit batsmen per 1,000 plate appearances in the American League actually went up from the year following the beaning (1921) through 1923 before resuming their downward trend, while the National League rate dipped in 1921 and then ticked back up in 1922.


    Year AL NL
    1919 6.80 6.28
    1920 6.49 5.76
    1921 6.76 5.12
    1922 7.22 5.62
    1923 7.35 5.62
    1924 6.94 4.99
    1925 5.67 4.90
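
    The year-to-year movement is easier to see as differences. This sketch computes the change from the previous season for each league using the figures in the table above; the function name is illustrative.

```python
# HBP per 1,000 PA around the Chapman beaning, from the table above: year -> (AL, NL).
rates = {
    1919: (6.80, 6.28), 1920: (6.49, 5.76), 1921: (6.76, 5.12), 1922: (7.22, 5.62),
    1923: (7.35, 5.62), 1924: (6.94, 4.99), 1925: (5.67, 4.90),
}

def yearly_changes(table):
    """Return {year: (AL change, NL change)} relative to the previous year."""
    years = sorted(table)
    changes = {}
    for prev, cur in zip(years, years[1:]):
        changes[cur] = tuple(round(c - p, 2) for p, c in zip(table[prev], table[cur]))
    return changes

for year, (al, nl) in yearly_changes(rates).items():
    print(f"{year}: AL {al:+.2f}  NL {nl:+.2f}")
```

    The AL shows positive changes in 1921, 1922, and 1923, while the NL falls in 1921 before rebounding in 1922.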


    So the physical hazard theory seems to have little validity. From this, one might then reason that if that monumental event didn’t signal a change then it’s unlikely that any other isolated incident or play would have, either.

    So what about a broader theory that takes into account a cost/benefit valuation of hitting batters? For example, it could be the case that pitchers adjusted their frequency of hitting opposing batters based on their recognizing the costs of doing so. In times where runs are scarce, hitting a batter would cost relatively more than when runs are plentiful, since there is a greater probability that the batter would have been put out had they not been hit. The result is that there would be fewer hit batsmen in depressed offensive environments, and more in inflated environments. Sounds like a reasonable idea and we’ll dub it the “offensive context theory.”

    We can test this theory by taking a look at the cost of hitting a batter in terms of the Win Expectancy Framework (WX) for both the American and National Leagues since 1901. The framework allows us to estimate how much a hit by pitch is worth in terms of wins and we can then graph the results for both leagues.



    As you might have guessed, the increase in Win Expectancy for each hit batsman was high in the Deadball Era at over 3%, and then decreased from the early 1920s until the late 1930s as offensive levels rose, reaching a low point just over 2.6%. The values then began to climb again, reaching over 3% in the 1960s, and after a brief spike in 1989 fell as offensive levels rose again.

    So, does the offensive context theory hold water? If you were to overlay these two graphs you would find little in common. For example, the rate of hit batsmen in the Deadball Era declined steadily, even though the cost remained fairly constant until the offensive explosion of 1920. Offensive levels then began to decline in the late 1930s, making the cost of hitting a batter rise, although we find that hit batsmen rates continued to decline into the late 1940s. And again, as the cost of hitting batters rose in the 1950s and from 1993 on, more batters were being hit. In fact, the WX value of a hit by pitch turns out to have almost zero correlation with the rate at which batters are hit. Another beautiful theory spoiled by some ugly facts.

    Okay, offensive levels don’t seem to drive HBP rates, but what if an increased rate of hitting batters has the effect of depressing offense, and vice versa? We’ll label this the “intimidation theory.” After all, offensive levels rose as batters were being hit less often throughout the 1920s, and run-scoring dropped as batters were being hit more often in the 1960s. Many former players, especially those who had the “pleasure” of facing Don Drysdale and Bob Gibson, tend to favor this theory.

    Unfortunately, the intimidation theory has the same underlying problem as the one that preceded it. While the examples cited in the previous paragraph seem to make sense, the theory fails to explain why hit batsmen declined throughout the Deadball Era, and why in the offensive eras of the 1950s and post-1993 the rate of hitting batters has actually increased.

    Another theory that is popular, and one that we’ll tackle in next week’s column, is that since 1973 and the introduction of the designated hitter, hit batsmen have been on the rise since the pitcher does not himself face the consequences of hitting opposing batters. This is the so-called “moral hazard theory.” A quick glance at the first graph militates against this idea, however, since the HBP rate actually began to decline in 1969, and continued to do so through the first eleven years of the DH. In addition, the rate rose and fell in both leagues, rather than affecting only the AL as you would expect.

    A couple years ago, J.C. Bradbury of the excellent blog Sabernomics along with Doug Drinen studied the issue of HBP differences using play-by-play data. One of the conclusions they came to was that talent dilution as the result of the 1993 expansion draft contributed to the rise in hit batsmen post 1993. The theory is that a greater percentage of pitchers with less experience produce more accidental hit batsmen. At first glance this “expansion theory” makes a lot of sense. Take a look at the following table that lists each expansion event along with the rates the year prior to as well as the first year of the expansion.


    Pre Post Diff
    AL 1960 5.76 AL 1961 5.22 -0.54
    NL 1961 5.48 NL 1962 6.11 +0.63
    AL 1976 5.18 AL 1977 5.42 +0.24
    NL 1992 5.48 NL 1993 6.66 +1.18
    NL 1997 9.02 NL 1998 8.38 -0.64
    AL 1997 7.78 AL 1998 8.77 +0.99
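
    The Diff column is simply the post-expansion rate minus the pre-expansion rate. A short sketch that recomputes it from the table above and counts how many expansions were followed by an increase:

```python
# (league, pre-expansion year, pre rate, first expansion year, post rate), from the table above.
expansions = [
    ("AL", 1960, 5.76, 1961, 5.22),
    ("NL", 1961, 5.48, 1962, 6.11),
    ("AL", 1976, 5.18, 1977, 5.42),
    ("NL", 1992, 5.48, 1993, 6.66),
    ("NL", 1997, 9.02, 1998, 8.38),
    ("AL", 1997, 7.78, 1998, 8.77),
]

diffs = [round(post - pre, 2) for _, _, pre, _, post in expansions]
increases = sum(1 for d in diffs if d > 0)

for (league, _, _, year, _), d in zip(expansions, diffs):
    print(f"{league} {year}: {d:+.2f}")
print(f"{increases} of {len(diffs)} expansions saw the rate rise")
```

    Four of the six rows come out positive, with the 1961 AL and 1998 NL as the exceptions.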


    In all but two instances, the rate of hitting batters went up in the league to which baseball added teams. It should be noted that in the first four expansions the league that did not expand also saw its rate increase, which you might expect since expansion in one league also dilutes talent in the other.

    What this table doesn’t show--though it's captured in the graph--is that the overall trends in each case were not really affected. When expansion came to the AL in 1961 and the NL in 1962 hit batsmen were already on the rise. When the AL expanded in 1977 the rates were declining and continued to do so after 1977. In both 1993 and 1998 the rates had already been increasing since 1985, and so while expansion may have egged on the increase, it clearly wasn’t the only factor. In other words, expansion did not signal a change in direction of trends that were already underway. As a result, it doesn’t appear that the expansion theory can be invoked as a general explanation and in any case can’t be invoked to shed any light on the trends prior to 1961 when both leagues had eight teams.

    Finally, there have been articles in the popular press over the past few years that argue that a confluence of factors is responsible for the increasing rate at which batters are being brushed back. For example, a 2003 article from USA Today argued that a 2000 directive from Major League Baseball to change how umpires called strikes (in order to conform more closely to the rule-book definition) was the primary culprit. The “new strike zone theory” contends that adhering to the traditional definition has resulted in calling more strikes on the inside corner, and that pitchers are taking advantage of the fact, with hitters being plunked more often as they dive out over the plate in an attempt to hit what used to be strikes off the outside corner. Unfortunately for the new strike zone theory (at least as a single explanation), the increase in batters being plunked can be traced to almost 15 years before the “new” strike zone was implemented.

    In addition, if you’re looking for single causes, one might imagine that the double-warning rule instituted in 1994 would have a dampening effect on hit batsmen. After a warning, pitchers might be wary of throwing at or near guys when they would almost certainly be ejected. However, although the rate went down slightly in 1994 in the AL, it did not in the NL, and after that continued its upward trend.

    Another factor mentioned in the article, however, appears to be more promising. The article speculates that a generation of pitchers accustomed to pitching to hitters with aluminum bats don’t go inside as often, since doing so is less effective when hitters can still fist a ball on their hands for a hit using a bat that doesn’t shatter. As a result of this “aluminum theory,” hitters have adjusted to looking for pitches over the outside corner, and therefore dive at the ball and stand closer to the plate. When this style of hitting is coupled with pitchers who, at the professional level, finally do try and pitch inside but do poorly at it, you end up with lots more batters being hit.

    What is satisfying about this theory is that it accounts for the recent rise in HBP rates in both leagues and seems to have timing on its side. Although the first patent for a metal bat was granted in 1924, Worth didn’t introduce the first aluminum bat until 1970, and it wasn’t until the late 1970s that bats by Worth (and, especially, Easton) significantly increased the popularity of aluminum bats. Seeing the rates begin to climb five to ten years later would seem to therefore be in line.

    Systemic Theories
    In the end, theories like the aluminum bat theory are the kinds of systemic explanations that seem to be needed to explain shifts in the game such as those related to hit batsmen. Instead of looking for single incidents such as the physical hazard or strike zone theories, or very subtle causes like the offensive context or intimidation theories, what we should probably be looking for are systematic changes in how the game is played, changes that may even originate well before players reach the professional level. While I don’t have any immediate answers for the forty-year decline in the first part of last century, or the increase during the following twenty years, I think those lines of inquiry will prove to be more promising, and the theories they produce less likely to be the victim of a few inconvenient facts.