Dan Agonistes: 01/01/2007

Tuesday, January 30, 2007

Plunking Explosion

Steve Treder over at The Hardball Times has a good article today on hit batsmen and how their frequency has changed over time. He reviews many of the arguments that I discussed in my series last summer...

Schrodinger's Bat: Beautiful Theories and Ugly Facts

Schrodinger's Bat: Strike Zones, Trilobites, and a Vicious Cycle

Schrodinger's Bat: The Moral Hazards of the Hit Batsmen

As a conservative I appreciate Steve's use of the Law of Unintended Consequences in positing that the introduction of batting helmets in the early 1960s and the adoption of the "zero-brushback-tolerance protocol" of the 1990s ironically may have contributed to the increasing rate at which hitters are plunked. This argument would also hold for the increased ability of players to wear body armor, thereby leading to a kind of arms race in which hitters stand closer and pitchers try to back them off.

It struck me, however, in light of my columns the past two weeks, that the increasing size of major league hitters may also play a role all on its own, especially in the past 30 years. Larger hitters would likely be less afraid of being hit and observational evidence tells me that hitters today do less to avoid being hit than did hitters in the past. This struck home to me as I watched the footage from the 1954 World Series the other night.

I also noted that J.C. Bradbury mentions that his new book, which should be available in mid-March, discusses how the distribution of talent in baseball has affected the HBP rate. I'm looking forward to giving it a read.

Update: Baseball Musings has a little post on this subject and two interesting graphs. To me, the first illustrates that the rate of HBP has affected both low and high ERA pitchers roughly equally over the course of baseball history although in the last few years it seems to have diverged. The second graph is an illustration of how inferior pitchers now pick up more innings than they did in the past. It should be cautioned, however, that the increasing ERA of the leagues as a whole will cause some of this as the sub 5.00 ERA group shrinks and the +5.00 ERA group grows. Historically the +5.00 group would be very small simply because pitchers with ERAs that high would be so far from the mean.

Friday, January 26, 2007

Mid-Course Corrections

So here is something I don't understand (I know that will come as no surprise to many). In researching the prevalence of in-season managerial changes for teams that made it to the postseason I noticed that for playoff teams and other teams alike there is an interesting curve that looks as follows:

In-season managerial changes hovered at around 10% of teams in the first three decades (1900s-1920s) of the 20th century before increasing to around 15% over the next three decades (1930s-1950s) and then exploding to over 20% for the next three decades (1960s-1980s). Since then the frequency of changes has dropped again to around 13%.

Why the upward trend over much of baseball history? I thought at first it might be because it seemed to work, but outside of the 1932 and 1938 Cubs, who changed managers midstream and caught fire on their way to the pennant, there aren't similar cases until the summer of 1978 when Yankees manager Billy Martin resigned and was replaced by Bob Lemon. And then why the downward trend over the past two decades? Is a newfound realization that the influence of managers is circumscribed responsible for the downward trend? Is it the more substantial financial investment in managers by front offices that makes them less willing to pull the plug during the season? Is there something else going on? It's a mystery to me.

Incidentally, over the course of history 15.6% of teams changed managers during the season - 11.9% for teams that reached the postseason. I would have thought the latter percentage would have been lower, but then again it is sometimes the case that a team going nowhere installs a new manager (the 1989 Blue Jays, who fired Jimy Williams after a 12-24 start and hired Cito Gaston, who led them to a 77-49 finish, come to mind) and takes off for whatever reason.

Thursday, January 25, 2007

Triple Play

My column this week titled "A Triple Redux" on BP focuses on the relationship between triples and body mass index (BMI) over the history of baseball. The topic was suggested to me by several readers who noted that in my analysis of triples titled "Baseball's Trifecta" published back in September I looked at several of the theories that may be responsible for the decline in triples including more highly skilled and athletic outfielders, park configurations, risk aversion in a changing run environment, and the aging of the player population.

Several readers pointed out that I didn't look at the changing bulk of players over time as a possible explanation. The long and the short of it is that controlling for the distribution of the changing BMI in the player population does not substantially lessen the steep curve that tracks the decline of triples. Bigger players hit fewer triples and from the 1980s through 2006 the percentage of players with higher BMIs has risen substantially (weight training and the spectre of steroids), but simply not enough to offset the background decline in the triple rate.

What we're left with, then, is the likelihood that the decline in triples is a consequence of both the standardization of the game and baseball's ever-increasing level of play.

On the Merits of Probability

Some have called the Dodgers/Padres game on September 18, 2006 "the game of the century" for its unparalleled excitement and finish in the midst of a pennant race. Down 9-5 entering the bottom of the ninth inning, the Dodgers' Jeff Kent, J.D. Drew, Russell Martin, and Marlon Anderson hit consecutive homeruns off of first Jon Adkins and then Trevor Hoffman to knot the game at 9-all. In the top of the 10th the Padres once again struck for a run on a Josh Bard single to take the lead. But in the bottom of the 10th with Kenny Lofton aboard and nobody out, comeback player of the year Nomar Garciaparra sent the Dodgers faithful home with an 11-10 victory powered by his homerun to left field. Oh, and it put the Dodgers in first place to boot.

The following graph shows the Win Expectancy (WX) for the Dodgers during the game and highlights some of the key plays along the way. The table that follows includes each play and how that play either increased or decreased the WX for the Dodgers.

                                                                              Score
Inning Outs Batter               Event Text            Start     End    Diff  LA  SD
  1     0   Dave Roberts         43/G                  0.500   0.522   0.022   0   0
  1     1   Brian Giles          K                     0.522   0.538   0.016   0   0
  1     2   Adrian Gonzalez      S8/L                  0.538   0.527  -0.012   0   0
  1     2   Mike Piazza          D8/L.1-H              0.527   0.419  -0.108   0   1
  1     2   Russell Branyan      W                     0.419   0.410  -0.009   0   1
  1     2   Mike Cameron         T9/L.2-H;1-H          0.410   0.245  -0.165   0   3
  1     2   Geoff Blum           S9/L.3-H              0.245   0.187  -0.058   0   4
  1     2   Josh Barfield        8/F                   0.187   0.198   0.011   0   4
  1     0   Rafael Furcal        S5/BG                 0.198   0.227   0.029   0   4
  1     0   Kenny Lofton         S8/G.1-2              0.227   0.276   0.049   0   4
  1     0   Nomar Garciaparra    64(1)3/GDP.2-3        0.276   0.185  -0.091   0   4
  1     2   Jeff Kent            D8/F.3-H              0.185   0.252   0.066   1   4
  1     2   J.D. Drew            K                     0.252   0.223  -0.028   1   4
  2     0   Jake Peavy           K                     0.223   0.237   0.014   1   4
  2     1   Dave Roberts         K                     0.237   0.248   0.010   1   4
  2     2   Brian Giles          S7/G                  0.248   0.240  -0.007   1   4
  2     2   Adrian Gonzalez      K                     0.240   0.254   0.014   1   4
  2     0   Russell Martin       13/G                  0.254   0.232  -0.023   1   4
  2     1   Marlon Anderson      HR/9/F                0.232   0.317   0.086   2   4
  2     1   Wilson Betemit       43/G                  0.317   0.300  -0.017   2   4
  2     2   Brad Penny           K                     0.300   0.289  -0.011   2   4
  3     0   Mike Piazza          53/G                  0.289   0.307   0.018   2   4
  3     1   Russell Branyan      K                     0.307   0.320   0.013   2   4
  3     2   Mike Cameron         S7/G                  0.320   0.311  -0.010   2   4
  3     2   Geoff Blum           CS2(26)               0.311   0.329   0.018   2   4
  3     0   Rafael Furcal        HR/8/F                0.329   0.438   0.109   3   4
  3     0   Kenny Lofton         K                     0.438   0.410  -0.028   3   4
  3     1   Nomar Garciaparra    9/F                   0.410   0.390  -0.020   3   4
  3     2   Jeff Kent            D8/L                  0.390   0.417   0.027   3   4
  3     2   J.D. Drew            DGR/7/L.2-H           0.417   0.538   0.121   4   4
  3     2   Russell Martin       1/L                   0.538   0.500  -0.038   4   4
  4     0   Geoff Blum           6/P                   0.500   0.528   0.028   4   4
  4     1   Josh Barfield        E6/TH/G               0.528   0.498  -0.030   4   4
  4     1   Jake Peavy           4/P                   0.498   0.533   0.035   4   4
  4     2   Dave Roberts         CS2(26)               0.533   0.561   0.028   4   4
  4     0   Marlon Anderson      S9/G                  0.561   0.601   0.041   4   4
  4     0   Wilson Betemit       K+SB2                 0.601   0.583  -0.019   4   4
  4     1   Brad Penny           6/L                   0.583   0.541  -0.041   4   4
  4     2   Rafael Furcal        43/G                  0.541   0.500  -0.041   4   4
  5     0   Dave Roberts         K/B                   0.500   0.530   0.030   4   4
  5     1   Brian Giles          9/F                   0.530   0.553   0.022   4   4
  5     2   Adrian Gonzalez      S6/G                  0.553   0.537  -0.016   4   4
  5     2   Mike Piazza          W.1-2                 0.537   0.509  -0.027   4   4
  5     2   Russell Branyan      W.2-3;1-2             0.509   0.472  -0.037   4   4
  5     2   Mike Cameron         9/F                   0.472   0.567   0.095   4   4
  5     0   Kenny Lofton         K                     0.567   0.537  -0.030   4   4
  5     1   Nomar Garciaparra    D8/L                  0.537   0.592   0.055   4   4
  5     1   Jeff Kent            63/G                  0.592   0.547  -0.046   4   4
  5     2   J.D. Drew            IW                    0.547   0.558   0.011   4   4
  5     2   Russell Martin       5(2)/FO/G.1-2         0.558   0.500  -0.058   4   4
  6     0   Geoff Blum           D9/L                  0.500   0.409  -0.091   4   4
  6     0   Josh Barfield        K                     0.409   0.472   0.063   4   4
  6     1   Terrmel Sledge       43/G.2-3              0.472   0.517   0.045   4   4
  6     2   Dave Roberts         K                     0.517   0.576   0.059   4   4
  6     0   Marlon Anderson      S9/L                  0.576   0.625   0.049   4   4
  6     0   Wilson Betemit       W.1-2                 0.625   0.699   0.073   4   4
  6     0   Oscar Robles         FC1/SAC/BG.2-3;1-2    0.699   0.787   0.088   4   4
  6     0   Rafael Furcal        42(3)/FO/G.2-3;1-2    0.787   0.706  -0.081   4   4
  6     1   Kenny Lofton         12(3)3/GDP            0.706   0.500  -0.206   4   4
  7     0   Brian Giles          E5/G                  0.500   0.443  -0.057   4   4
  7     0   Adrian Gonzalez      3/SAC/BG.1-2          0.443   0.466   0.023   4   4
  7     1   Mike Piazza          IW                    0.466   0.441  -0.025   4   4
  7     1   Josh Bard            54(1)3/GDP            0.441   0.590   0.148   4   4
  7     0   Nomar Garciaparra    7/L                   0.589   0.551  -0.039   4   4
  7     1   Jeff Kent            S8/L                  0.551   0.591   0.040   4   4
  7     1   J.D. Drew            6(1)/FO/G             0.591   0.541  -0.050   4   4
  7     2   Russell Martin       13/G                  0.541   0.500  -0.041   4   4
  8     0   Mike Cameron         9/L                   0.500   0.548   0.048   4   4
  8     1   Geoff Blum           W                     0.548   0.499  -0.049   4   4
  8     1   Josh Barfield        D9/L.1-H;B-3(TH)      0.499   0.197  -0.301   4   5
  8     1   Todd Walker          S8/L.3-H              0.197   0.137  -0.061   4   6
  8     1   Dave Roberts         K+SB2                 0.137   0.146   0.009   4   6
  8     2   Brian Giles          WP.2-3                0.146   0.143  -0.003   4   6
  8     2   Brian Giles          9/F                   0.143   0.167   0.024   4   6
  8     0   Marlon Anderson      T9/L                  0.167   0.300   0.133   4   6
  8     0   Wilson Betemit       S8/G.3-H              0.300   0.405   0.105   5   6
  8     0   Olmedo Saenz         K                     0.405   0.316  -0.089   5   6
  8     1   Rafael Furcal        7/F                   0.316   0.234  -0.082   5   6
  8     2   Kenny Lofton         D9/L.1-3              0.234   0.334   0.100   5   6
  8     2   Nomar Garciaparra    K                     0.334   0.166  -0.168   5   6
  9     0   Adrian Gonzalez      S7/L                  0.166   0.142  -0.024   5   6
  9     0   Manny Alexander      14/SAC/BG.1-2         0.142   0.150   0.008   5   6
  9     1   Josh Bard            D8/F.2-3              0.150   0.104  -0.046   5   6
  9     1   Mike Cameron         IW                    0.104   0.103  -0.001   5   6
  9     1   Geoff Blum           WP.3-H;2-3;1-2        0.103   0.047  -0.055   5   7
  9     1   Geoff Blum           8/SF.3-H;2-3          0.047   0.036  -0.011   5   8
  9     2   Josh Barfield        S9/L.3-H              0.036   0.018  -0.018   5   9
  9     2   Jack Cust            3/G                   0.018   0.019   0.002   5   9
  9     0   Jeff Kent            HR/8/F                0.019   0.043   0.023   6   9
  9     0   J.D. Drew            HR/9/F                0.043   0.094   0.051   7   9
  9     0   Russell Martin       HR/7/F                0.094   0.206   0.112   8   9
  9     0   Marlon Anderson      HR/9/F                0.206   0.642   0.436   9   9
  9     0   Julio Lugo           8/F                   0.642   0.583  -0.059   9   9
  9     1   Andre Ethier         6/P                   0.583   0.536  -0.048   9   9
  9     2   Rafael Furcal        9/F                   0.536   0.500  -0.036   9   9
  10    0   Dave Roberts         8/L                   0.500   0.560   0.060   9   9
  10    1   Brian Giles          D7/L                  0.560   0.442  -0.118   9   9
  10    1   Adrian Gonzalez      IW                    0.442   0.421  -0.021   9   9
  10    1   Paul McAnulty        8/F                   0.421   0.523   0.102   9   9
  10    2   Josh Bard            S9/L.2-H;1-3          0.523   0.167  -0.356   9  10
  10    2   Mike Cameron         W.1-2                 0.167   0.155  -0.012   9  10
  10    2   Geoff Blum           9/F                   0.155   0.206   0.051   9  10
  10    0   Kenny Lofton         W                     0.206   0.338   0.132   9  10
  10    0   Nomar Garciaparra    HR/7/F.1-H            0.338   1.000   0.662  11  10
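
The Diff column above is simply End minus Start, always measured from the Dodgers' side. As a minimal sketch of how those differences get credited to batters, here are four rows copied from the bottom of the ninth. The tuple layout and the function name credit_batters are my own for illustration, and because the table is rounded to three places a recomputed difference can be off by .001 from the printed Diff (Kent's .019 to .043 prints as .023).

```python
from collections import defaultdict

# (batter, batting_team, wx_start, wx_end) copied from the table above.
# WX is always the Dodgers' (LA) win expectancy.
plays = [
    ("Jeff Kent", "LA", 0.019, 0.043),
    ("J.D. Drew", "LA", 0.043, 0.094),
    ("Russell Martin", "LA", 0.094, 0.206),
    ("Marlon Anderson", "LA", 0.206, 0.642),
]

def credit_batters(plays, perspective="LA"):
    """Sum the change in win expectancy credited to each batter."""
    totals = defaultdict(float)
    for batter, team, start, end in plays:
        diff = end - start
        # A play that helps the Dodgers hurts a Padres batter, so flip the sign.
        totals[batter] += diff if team == perspective else -diff
    return dict(totals)

for batter, wxa in credit_batters(plays).items():
    print(f"{batter:16s} {wxa:+.3f}")
```

Feeding every row of the table through the same function would give each hitter's total for the game, with the opposing pitcher receiving the mirror image of each credit.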

In the aftermath of that game fellow Baseball Prospectus author Will Carroll and I engaged in a dialogue on the merits and usefulness of measures such as Win Probability Added (WPA) and Win Expectancy Added (WXA). And so for your reading pleasure think of this as a primer on the subject as Will raises legitimate criticisms and throws me a few bones along the way...

[WCarroll] Ok, this WPA thing is beyond me, Dan. I realize that I'm the guy in the group that can't do the complex math, but this reeks of the type of things that statheads get hated for. On the one hand, it reduces "clutch" to mathematical terms, which seems counter to most orthodox analysis and on the other, it makes timing more important than skill. If we look at the amazing Dodgers game this week, Marlon Anderson comes out as the hero. I realize he went five for five with a pair of homers, but why was his homer - the fourth in sequence - any more important than the first one? Jeff Kent's homer came four runs down and started this thing, but he gets almost no credit. And what about the fact that two of the homeruns came off of a superior pitcher in Trevor Hoffman? How does that work?

[DFox] Oh Will. Statheads love to get hated for stuff like this and so I doubt they’ll lose much sleep over it. But seriously, all of the objections you cite are perfectly legitimate. First, though, let’s keep in mind what Win Probability Added (or Win Expectancy Added, which is slightly different although those differences are not important at the moment) is trying to do. At its core it is a technique that’s been around for more than 30 years that simply attempts to quantify how far a player’s actions in a particular game push his team toward a win or a loss. The assignment of those probabilities, and this is the core of all the objections, is based on a matrix that indicates just how probable it is that a team will win given a series of specific game states taking into account outs, inning, score and so on. By crediting (or debiting) the player for a change in the game state we credit them with a certain amount of WPA or WXA (granted, the simplistic way that most folks do this today is to assign all the credit to the pitcher and batter while leaving out the rest of the defense entirely).

Now, because the probabilities used take into account the inning and score primarily, it will always be the case that an event that ties the game or puts the team ahead will have a much larger magnitude than the same event that occurs in a different context. That’s just the nature of the technique.

[WCarroll] So you're admitting this is flawed? The idea that the timing of a play has as much to do with the sequence is flawed to me. Now, while the likelihood of the back-to-back-to-back-to-back homers occurring is so low as to be near zero and not worth calculating, it still seems to me that the sequence of events is ignored here. Anderson's homer has reduced value without Kent's or Martin's, and not accounting for that seems to call the technique into question.

[DFox] In the case of Kent versus Anderson, Kent’s homerun came at a time when the Dodgers had just a 1.9% chance of winning, being down by four runs in the bottom of the ninth as they were. Although he hit the homerun to make it 9-6, the Dodgers still had just a 4.3% chance of winning and therefore we credit Kent with a 2.3% change or .023 in WXA terms. Anderson, on the other hand, hit his second homerun when the Dodgers had a 20.6% chance of winning (the two intervening homers raising the odds by just over 5 and 11 percentage points respectively) and pushed them to 64.2%, thereby earning a WXA of .436. Of course, Anderson’s homerun wasn’t more important in the big sense of contributing to the win, but it was the event that pushed the Dodgers over the top in terms of their odds of winning the game.

[WCarroll] I can see where the math is going, but Kent's home run is so necessary to the process that it seems he should get more credit than just making a four run game into a three run game with a swing of the bat. The event driven model doesn't account well for the actual nature of the game. Did J.D. Drew’s home run cause the pitching change and if so, where is that factored in?

[DFox] I would certainly agree with you that Kent’s homerun was absolutely necessary to the process. However, it can’t really be argued that immediately after the homerun the Dodgers had a greatly improved chance of winning. They didn’t. They still were down three runs in the bottom of the ninth with no runners on base and the technique credits him appropriately. But this, I think, gets to the heart of one of the objections you raised in your initial question regarding two of the homers coming against a tougher pitcher than the other two. The way in which WPA and WXA are calculated does take into account a good portion of the context in which the play was made (and WXA takes in more by including the run environment in a theoretical framework) – but not all of it. While you could conceivably adjust the probability of winning for each batter/pitcher matchup along with a host of other variables including defensive personnel and positioning, weather, tendencies of the manager, and what the batter had for breakfast, the ability to use the technique would drop sharply. In the end these methods provide a model of the game and like all models are imperfect. It’s a bit of a balancing act between greater precision on the one hand and usability on the other.

As to the question of the pitching change, anyone watching the game could see that the Drew homerun on the back of the Kent dinger “caused” the pitching change. But again, that is not factored into the equation since the models most folks use don’t take into account differences in batter/pitcher matchups nor the relative strength of the respective team bullpens. For example, although the Dodgers had a 9.4% chance of winning after Drew’s homerun, one could argue that with Hoffman still available in the pen their chances were actually smaller than that.

[WCarroll] Everyone's looking for a quantification of clutch and in this analysis, I'm not convinced that the methodology does anything more than make nice graphs and flawed conclusions.

[DFox] But it does allow us to make pretty graphs and that should count for something shouldn’t it? :)

Although originally the Mills Brothers who pioneered this concept (albeit in a slightly different form) in a 1970 book titled Player Win Averages: A Computer Guide to Winning Baseball Players had intended for it to be used to measure clutch ability, this doesn’t do it for the simple reason that it doesn’t correct for the quality, or leverage, of a player’s opportunities. Anderson and Kent did not have the same level of opportunity to accumulate WXA in this game. I think of it more as measuring the total contribution of a player towards winning and losing given the situations they were placed in and so your caution against flawed conclusions is well taken. Of course the more games you aggregate the more the quality of the opportunities tends to even out leveling the playing field (for most hitters anyway, for relief pitchers and especially setup men and closers it’s a different story).

Part of the confusion I think comes from our terms. I’ll admit to having written in the past that we can use WX to quantify "clutch performances" but that is a different thing than “clutch ability”. The former is an acknowledgement that an individual play like Anderson’s second homerun was a very important play and WX can be used to get a feel for how improbable those "clutch" performances are. The latter, however, is about the inherent ability of a player to perform above his normal level when the game is on the line.

To measure clutch ability what analysts have done as you know is look for differences in performance in situations termed clutch and non clutch and see if those differences persist across seasons or careers. What they’ve found in analysis like that in The Book is that there may indeed be a small clutch ability but that ability is basically drowned out by the normal variability inherent in the game. In other words, it may be there but it doesn’t matter much. Now, if it had been shown that there is a wide variance in ability between players in clutch situations then WPA and WXA would be much more useful in measuring that ability all other things (like opportunities given larger sample sizes) being equal. But since that is not the case WXA is obviously not going to be able to capture it.

[WCarroll] Sure. I'm not going to disagree with the theory and I'm certainly not going to critique the math, but what this amounts to is a grand equation that seems to be Game Winning RBI. Back to back to back to back is improbable, sure, but let's look at the situation. What if the hitter before Anderson gets out? Hits a double? How does that change things from this standpoint? In one case, they win in a different, slightly less unique way (or tie, that is) and in another, Anderson is penalized for something he has absolutely no control over.

Forget the red herring of clutch. What we have here is a measure of timing, of coincidence, and we have disconnected talent from the discussion. Anderson is not a better player than any of the other three. He's been on a hot streak since coming west, but few would dispute that he may have been the weakest player on Little's lineup card. For one night, he hits well -- 5 for 5 is nice -- and hits in an interesting way, but he's still the weakest player in the lineup. He just has a better story to tell his grandchildren some day.

[DFox] I think your intuition about WPA or WXA being a fancy form of GWRBI makes for a good analogy. I’m certainly not saying that using WXA for a single game allows you to make a case that Marlon Anderson or Neifi Perez for that matter are actually better players than J.D. Drew or Brandon Inge. And you’re completely on track that if the hitter before Anderson makes an out, or does anything but hit a homerun, it changes the potential WXA for Anderson’s plate appearance. And yes, Anderson has no control over it and as mentioned previously the performance analysis community in general doesn’t think that Anderson has much control over how he performs (relative to his true talent level) when in the situation dictated by the previous hitters in the inning and in the game.

In the end WXA is a measure of timing and coincidence and when taken in very small doses (one very exciting game for example), it is largely disconnected from talent. That’s why when taken over the course of a season or career, if you rank players by their WXA Drew will still beat Anderson and Inge will still beat Perez (actually, almost everyone who’s ever played will beat Perez). That said, I believe it’s still an interesting analytical tool, not for projecting or evaluating talent, but for looking at the flow of individual games and crediting players over the long haul. For example, I wouldn’t be worried about using seasonal WXA as a tool for input into the discussion of MVPs, Rookie of the Year, or Cy Young awards. In fact, to me, WXA would provide a good mix of who’s the best player and who was the most valuable (leaving questions of team quality aside since WXA doesn’t account for that).

Wednesday, January 24, 2007

SABR and the Humidor

For those in the Denver area there will be a meeting of the Rocky Mountain SABR chapter on Saturday, February 10th from 9:30AM until Noon at the Breckenridge Brewery - 220 Blake Street in downtown Denver near Coors Field. This will be the annual "Hot Stove" meeting but the topic of discussion will be the humidor, with speakers Walter Sylvester, an Assistant in the Baseball Operations department for the Rockies, and Dave Dresen presenting on the topic of "Physics of Baseball at Altitude". Should be a great time, so if you're interested in perhaps joining SABR feel free to come by.

If you're interested in the topic of the humidor you might enjoy these...

Schrodinger's Bat: Swing and Miss

Schrodinger's Bat: More Humidity

Of Humidors and Humidity

Coors Field Fun Facts

Saturday, January 20, 2007

The Power of Squares

Nice article by Dave Studeman over at Baseball Analysts on Pythagoras, run estimation and Bill James. I especially liked the following:

"The power of two is everywhere in life. E=MC squared, after all. When you move closer to a light, cutting the distance in half, the light doesn't become twice as bright...So when Bill James discovered that the nature of runs to winning is squared, it seemed as though something essential and fundamental had been discovered."

Another example of this phenomenon is the inverse-square law of gravitation which Newton published in his Principia but which was first hinted at by Ismael Bullialdus and known (or guessed at) in some form to the likes of Christopher Wren, Edmond Halley, and Robert Hooke as told in James Gleick's wonderful biography of Isaac Newton titled Isaac Newton.

For more thoughts on run estimation see:

Run Estimation for the Masses
A Closer Look at Run Estimation

Thursday, January 18, 2007

Myths and Excellence

My column this week on Baseball Prospectus, published this morning, is titled "The Myth of the Golden Age" and explores both the reasons for and the evidence of an increasing level of play over time. In addition to reviewing the arguments the late Stephen Jay Gould used in his 1996 book Full House: The Spread of Excellence from Plato to Darwin, I take a quick look at how the hitting of pitchers relative to position players (inspired by the comments of fellow SABR member Stew Thornley) has changed over the course of time and how it is arguably a demonstration of increasing excellence under an evolutionary model.

One of the interesting aspects of that discussion involves how the slope of the relative OPS of pitchers (defined very loosely as players who appear in more than one game as a pitcher in a given year) seems to have changed after World War II. The following two graphs illustrate this change using linear trend lines.

You'll notice that the slope of the line in the first graph is over twice that of the second. This supports nicely the research by Gould on decreasing variation in batting average and Nate Silver's research in Baseball Between the Numbers that shows the game stabilizing after 1940.
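
Comparing the slopes of two linear trend lines amounts to an ordinary least-squares fit on each era separately. A rough sketch of that calculation follows. The function name trend_slope, the 1946 cutoff, and both series are my own assumptions: the numbers are made-up placeholders chosen only to make the arithmetic visible, not the data behind the graphs.

```python
def trend_slope(points):
    """Ordinary least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx

# Placeholder series, NOT real data: a steeper decline before 1946
# and a shallower one afterwards.
prewar = [(year, 0.70 - 0.004 * (year - 1900)) for year in range(1900, 1946)]
postwar = [(year, 0.52 - 0.0015 * (year - 1946)) for year in range(1946, 2007)]

ratio = trend_slope(prewar) / trend_slope(postwar)
print(f"pre-war slope is {ratio:.2f} times the post-war slope")
```

With real yearly relative OPS values in place of the placeholders, a ratio above 2 would correspond to the "over twice" comparison described above.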

Monday, January 15, 2007

Umpire Stats

I was also alerted to this interesting analysis of umpires based on data from BP. This analysis is just for 2006.

In and of itself this doesn't tell us much since you would expect there to be some spread here due to randomness. Nor is it surprising that, for example, some of the same umpires show up in the high SO/9 quadrant as well as the high percentage of called strikes quadrant since they are clearly related. It would be interesting to see whether there is any trend that holds over from year to year (for example for Randy Marsh who has a very low % of called balls and a high % of called strikes) and then to quantify the effect if any. The author notes that he is looking into this so stay tuned.

And while the two measures that are shown are clearly related it could be that some umps are more hesitant to ring batters up or make a ball four call and so that could explain why an umpire appears to be a "hitter's ump" in one graph and a "pitcher's ump" in another (again Randy Marsh is an example). Interesting stuff.

Teams and Leagues on the Bases

This week on BP in my column I focused on team baserunning in 2006 where the Angels led the pack with 13.64 theoretical runs picked up on the bases aggregated from EqAAR, EqGAR, EqHAR, and EqSBR. On the opposite end of the spectrum the White Sox were at -21.77 runs and did especially poorly in advancing on hits where almost everybody (save Scott Podsednik and Pablo Ozuna) did terribly.

I also took a look at the differences between the leagues and it appears the NL is indeed the better baserunning league even after controlling for the poor baserunning of both designated hitters in the AL and pitchers in the NL. This result tends to negate the idea that the AL outfields are simply better at depressing runner advancement since NL runners also do better in advancing on the ground.

Thursday, January 11, 2007

The Hook Part II

In a recent post I provided a list of how frequently teams enjoyed the platoon advantage on defense when changing pitchers. Tangotiger asks...

"Your list for pitchers is 62%. Of course, when a closer is brought in, the manager is not looking at the platoon advantage. As well, in blowouts, the manager is not looking for platoon advantages.

Can you break up your list based on whether the pitcher is the "ace" or not, and whether the score is within 4 runs or not?"

Yes.

Below is the same table but this time only when the team's closer (defined as the player who had the most saves in 2006) is not the pitcher being brought in and when the run differential is 3 or fewer runs.


Team   Changes   Adv     Pct
SEA        259   199   76.8%
DET        219   167   76.3%
CHA        217   163   75.1%
SLN        231   166   71.9%
CLE        202   144   71.3%
PIT        332   235   70.8%
TOR        290   204   70.3%
BAL        260   182   70.0%
NYA        264   184   69.7%
CHN        290   200   69.0%
COL        296   201   67.9%
CIN        256   173   67.6%
KCA        275   183   66.5%
MIN        217   144   66.4%
SFN        274   181   66.1%
HOU        274   180   65.7%
ATL        317   208   65.6%
MIL        255   167   65.5%
TEX        253   164   64.8%
PHI        288   185   64.2%
ANA        186   119   64.0%
OAK        247   156   63.2%
SDN        295   183   62.0%
WAS        280   171   61.1%
FLO        230   139   60.4%
BOS        245   147   60.0%
NYN        246   145   58.9%
ARI        286   164   57.3%
TBA        291   166   57.0%
LAN        237   130   54.9%

Total     7812  5150   65.9%

As you can see it doesn't change the numbers that greatly and lifts the aggregate from 62% to 66%. As Tangotiger points out, the offense takes the platoon advantage 78% of the time when pinch hitting. That disparity is to be expected since the pitcher must actually pitch to one hitter and because these numbers include instances where the offensive team then used a pinch hitter, which in many cases they do for the express purpose of gaining the advantage. If the cases where a pinch hitter was used are excluded the overall percentage goes up to 69% with Seattle climbing to 82.3% and the Dodgers still at the bottom at 58.8%.
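The filter behind the table above (non-closer, margin of 3 runs or fewer) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the pitcher identifiers, and the shape of the saves data are hypothetical and are not taken from the script that actually produced the numbers.

```python
def find_closer(saves_by_pitcher):
    """Return the pitcher with the most saves (the post's definition of closer).

    saves_by_pitcher is a hypothetical {pitcher_id: saves} mapping for one team.
    """
    return max(saves_by_pitcher, key=saves_by_pitcher.get)


def is_counted_change(pitcher_id, closer_id, run_diff, max_margin=3):
    """Keep a pitching change only if the incoming pitcher is not the closer
    and the score is within max_margin runs in either direction."""
    return pitcher_id != closer_id and abs(run_diff) <= max_margin


# Hypothetical team with two relievers.
closer = find_closer({"p1": 30, "p2": 4})
print(is_counted_change("p2", closer, run_diff=-2))  # non-closer, close game
print(is_counted_change("p1", closer, run_diff=0))   # closer, so excluded
```

As a sanity check on the arithmetic, `round(100 * 5150 / 7812, 1)` gives the 65.9% shown in the Total row.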

When the game is tighter managers are also more likely to try and gain the platoon advantage (non closer).


Tied   65.5%
1 Run  67.0%
2 Run  65.9%
3 Run  64.6%
4 Run  64.0%
5 Run  62.1%
6 Run  55.3%
>6     53.3%
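A breakdown like the one above amounts to bucketing each pitching change by the run margin and computing the share of changes with the platoon advantage in each bucket. The sketch below shows one way to do that; the record layout (a pair of run differential and a True/False advantage flag) is an assumption made for illustration, and the sample records are made up rather than real 2006 data.

```python
from collections import Counter


def margin_bucket(run_diff):
    """Map a run differential to the labels used in the table above."""
    margin = abs(run_diff)
    if margin == 0:
        return "Tied"
    if margin > 6:
        return ">6"
    return f"{margin} Run"


def pct_by_bucket(records, bucket_fn):
    """records: iterable of (value, had_advantage) pairs.

    Returns {bucket_label: percentage of changes with the platoon advantage}.
    """
    total, adv = Counter(), Counter()
    for value, had_advantage in records:
        bucket = bucket_fn(value)
        total[bucket] += 1
        adv[bucket] += int(had_advantage)
    return {b: round(100.0 * adv[b] / total[b], 1) for b in total}


# Made-up sample: two tied-game changes, one 1-run change, one blowout.
sample = [(0, True), (0, False), (-1, True), (9, False)]
print(pct_by_bucket(sample, margin_bucket))
```

Passing a different bucketing function (by inning, say) would reuse the same counting logic.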

Perhaps counterintuitively, managers do not seem to display the same tendency based on the inning of the pitching change (given 3 runs or fewer differential and non-closer).


Inning 4     69.4%
Inning 5     69.0%
Inning 6     69.4%
Inning 7     67.9%
Inning 8     66.6%
Inning 9     58.4%
Inning 10+   58.1%

This is probably due in large part to the fact that the manager has only so many pitchers at his disposal and so his first reliever (likely used in the 6th or 7th inning) will be the most likely to enjoy the advantage. The manager will then be forced to use the pitchers that remain as the game goes later and so can't be so choosy about the matchup he'll get.

Tuesday, January 09, 2007

Ripken and Gwynn

Well, the votes have been cast and both Cal Ripken Jr. and Tony Gwynn enter the Hall of Fame (not unanimously as revealed yesterday). And as expected Mark McGwire received less than a third of the vote based on the "wait and see" approach as applied to him specifically, which I applaud. Steve Garvey was in his final year of eligibility and all those below 5% will be removed from the ballot as well.

Hard to believe someone actually cast votes for Jose Canseco, Ken Caminiti, Dante Bichette et al. but then again...

Jim Rice (64.8% in 2006), Andre Dawson (61%), and Bert Blyleven (53.3%) all fell in the voting although Goose Gossage (64.6% in 2006) has gained strength, presumably on the back of last year's induction of Bruce Sutter.

For me, of those on this list and not elected I would like to see Blyleven, Gossage, and McGwire (if nothing further develops in a few years) in the Hall but the HOF is not usually something I get too fired up about either way.


2007 BBWAA Hall of Fame Voting Results
Name            Votes   % of Votes
Cal Ripken Jr.      537     98.5
Tony Gwynn          532     97.6
Rich Gossage        388     71.2
Jim Rice            346     63.5
Andre Dawson        309     56.7
Bert Blyleven       260     47.7
Lee Smith           217     39.8
Jack Morris         202     37.1
Mark McGwire        128     23.5
Tommy John          125     22.9
Steve Garvey        115     21.1
D.Concepcion         74     13.6
Alan Trammell        73     13.4
Dave Parker          62     11.4
Don Mattingly        54      9.9
Dale Murphy          50      9.2
Harold Baines        29      5.3
Orel Hershiser       24      4.4
Albert Belle         19      3.5
Paul O'Neill         12      2.2
Bret Saberhagen       7      1.3
Jose Canseco          6      1.1
Tony Fernandez        4      0.7
Dante Bichette        3      0.6
Eric Davis            3      0.6
Bobby Bonilla         2      0.4
Ken Caminiti          2      0.4
Jay Buhner            1      0.2
Scott Brosius         0      0.0
Wally Joyner          0      0.0
Devon White           0      0.0
Bobby Witt            0      0.0
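The percentages in the table are just votes divided by ballots cast, and the two cutoffs that matter are the election threshold and the 5% floor mentioned above. Here is a small sketch of that arithmetic. Two things in it are assumptions not stated in the post: the ballot count of 545, which is inferred from the table (537 votes at 98.5%), and the 75% election threshold, which is the standard BBWAA rule. The sketch also ignores the limit on years of eligibility, so it would not flag a case like Garvey's.

```python
BALLOTS = 545          # assumption: inferred from the table, not stated in the post
ELECT_PCT = 75.0       # standard BBWAA election threshold (not stated in the post)
DROP_PCT = 5.0         # candidates below this are removed from the ballot


def vote_pct(votes, ballots=BALLOTS):
    """Percentage of ballots naming the candidate, rounded as in the table."""
    return round(100.0 * votes / ballots, 1)


def status(votes, ballots=BALLOTS):
    """Classify a candidate using only the two percentage cutoffs."""
    pct = 100.0 * votes / ballots
    if pct >= ELECT_PCT:
        return "elected"
    if pct < DROP_PCT:
        return "dropped"
    return "remains"


for name, votes in [("Cal Ripken Jr.", 537), ("Rich Gossage", 388),
                    ("Harold Baines", 29), ("Orel Hershiser", 24)]:
    print(name, vote_pct(votes), status(votes))
```

Under these assumptions election takes 409 votes, which leaves Gossage 21 short.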

Monday, January 08, 2007

The Future of Data Collection

In my post on the Year in Review I noted that I'm looking forward to my third season as a stats stringer for MLB.com. To that piece of info Tangotiger asked whether the stringers would be using stopwatches to record data items like hang time in order to more accurately measure batted balls for purposes of defensive evaluation.

Before I had a chance to ask, Tango took matters into his own hands and had an interesting email conversation with the Director of Stats at MLBAM.

One tidbit here, as many have guessed, is that there will likely eventually be a subscription or premium service to get access to this data in a more useable format. In addition, in relation to my column last week on camera angles he had this to say regarding the Enhanced Gameday system used in the 2006 postseason and which I wrote about here.

"As an aside, what’s been amazing to me about this program is what we’ve learned from the data we captured last season. That is, we found out that what we thought we understood about pitch movement has been, for lack of a better word, wrong. Think about how most fans observe pitches: on TV, through the center field camera. However, think about the challenges of accurately judging the pitch this way: you’re trying to follow a 4-inch wide ball from a distance of 400 or more feet, scaled down onto a 27-inch TV screen or 17-inch computer monitor, or whatever your viewing screen might be. And don’t forget that the camera is offset from center by an unknown amount that varies in each ballpark. This creates massive scaling errors in the human mind… for instance, we discovered that in many cases, a pitch that looks like it just missed the black may actually have been 8 to 10 inches outside."

This is fundamentally the reason why other camera angles or even enhanced computer images like Gameday would be wonderful to have. While the centerfield angle may give us the most information about the pitch in real time, that information is not very accurate.

Pitching Change Platoon Advantages

As some readers may be aware the Cubs had a really really bad season on their way to losing 96 games. They gave up 834 runs, good for second worst in the NL, in no small part as a result of 60 starts being given to pitchers who had never seen big league time before. By comparison Houston gave 44 starts to newbies, Florida 40, and Tampa Bay 34 while Cincinnati had none at all.

While that may bode well for the future as the Cubs will now have some experience at the AAA level (Sean Marshall and Carlos Marmol particularly) they can draw from, it resulted in 2006 in Dusty Baker making 542 pitching changes, easily outpacing the previous record set by the Giants in 2004.

All of those pitching changes got me to wondering how frequently a manager tries to maintain the platoon advantage when making a pitching change. In other words, while The Bill James Handbook publishes the numbers for how often a manager maintains the platoon advantage when making out his lineups, I've never seen the numbers for pitching changes. I wrote and ran a simple script to count each pitching change and determine when the defense had the platoon advantage. The results are shown in the table below and as you can see the percentage varies from the mid 50s to the low 70s. Obviously, these numbers are heavily influenced by roster construction and the effectiveness of the pitchers the manager has to work with at any given time. It's not surprising to me to see the Cardinals near the top although I would have expected the Rockies to be up there as well as Clint Hurdle seems to like using LOOGYs even when they are manifestly ineffective (Ray King).


Team     Changes PlatoonAdv    Pct
SEA          429        309   72.0%
CHA          398        282   70.9%
DET          390        261   66.9%
CIN          475        317   66.7%
SLN          468        312   66.7%
CHN          542        360   66.4%
NYA          488        314   64.3%
TEX          489        313   64.0%
KCA          473        302   63.8%
BAL          472        300   63.6%
OAK          444        282   63.5%
MIL          427        271   63.5%
PIT          504        319   63.3%
MIN          421        265   62.9%
HOU          497        312   62.8%
SFN          438        271   61.9%
ANA          380        235   61.8%
CLE          376        232   61.7%
ATL          522        321   61.5%
TOR          481        292   60.7%
COL          498        302   60.6%
WAS          515        306   59.4%
SDN          475        276   58.1%
BOS          454        262   57.7%
TBA          444        256   57.7%
FLO          435        249   57.2%
PHI          500        286   57.2%
ARI          461        256   55.5%
LAN          454        252   55.5%
NYN          473        252   53.3%
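For readers who want to try something similar, the counting itself is simple once each pitching change is reduced to the throwing hand of the incoming pitcher and the side the first batter hits from. The sketch below is not the script used for the table; the record layout and all names are hypothetical, and it assumes switch hitters have already been resolved to the side they actually bat from against that pitcher.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class PitchingChange:
    # Hypothetical record layout; the real input format is not described in the post.
    team: str          # fielding team making the change
    pitcher_hand: str  # "L" or "R"
    batter_side: str   # "L" or "R", side the first batter faced hits from


def has_platoon_advantage(change):
    """The defense holds the advantage when pitcher and batter are same-sided."""
    return change.pitcher_hand == change.batter_side


def platoon_table(changes):
    """Return {team: (changes, advantaged, pct)} ordered by pct, highest first."""
    totals = defaultdict(lambda: [0, 0])
    for ch in changes:
        totals[ch.team][0] += 1
        if has_platoon_advantage(ch):
            totals[ch.team][1] += 1
    rows = {
        team: (n, adv, round(100.0 * adv / n, 1))
        for team, (n, adv) in totals.items()
    }
    return dict(sorted(rows.items(), key=lambda kv: kv[1][2], reverse=True))


# Made-up sample records, not real 2006 data.
sample = [
    PitchingChange("SEA", "L", "L"),
    PitchingChange("SEA", "R", "L"),
    PitchingChange("NYN", "R", "L"),
]
for team, (n, adv, pct) in platoon_table(sample).items():
    print(f"{team}  {n:>7}  {adv:>9}  {pct:>5}%")
```

The Pct column in the table is the same calculation: `round(100 * 309 / 429, 1)` gives Seattle's 72.0.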

Thursday, January 04, 2007

Wish List 2007

My column today was a wish list for 2007 and beyond that includes some of my pet peeves like the rigid television broadcasting of the game, doubleheaders, interleague play, "this time it counts", "small-ball wins in the post-season", etc.

For some of the research on baseball and television I used Peter Morris' A Game of Inches: The Stories Behind the Innovations That Shaped Baseball Volume 2: The Game Behind the Scenes published in 2006. I'll have to admit I wasn't aware of the first volume either until I saw this one while perusing a Barnes & Noble over Christmas break. Any baseball fan will want to add both volumes to their library as they contain loads of interesting tidbits on everything from the evolution (not invention) of the pitcher's mound to the first recorded wave on October 15, 1866.

MLB 2K6

I think I mentioned in a blog post earlier this year that I had purchased an XBox 360 and bought a copy of MLB 2K6 as soon as it was available (you can get it now for $19.95). I've now played two full seasons in Franchise mode (using the default settings) as the Cubs and wanted to share the results.

In season one (2006) I finished second in the NL Central behind the Cardinals. Although my team won 93 games the Cardinals won 105 and I was never really in the race. I was hampered by injuries to Todd Walker (60 games) and Ronny Cedeno and underperformance by Derrek Lee and Aramis Ramirez. I was able to flip Greg Maddux and a couple throw-ins for Ben Sheets and J.J. Hardy although Hardy ended up in the minors. I also upgraded by acquiring Wily Mo Pena to play left field. Carlos Zambrano was the brightest spot as he won the Cy Young award.

For the first two-thirds of the season I simulated the games and only set the typical lineups, rotation, and playing time and then let the computer manage each game. I did notice that once I started managing the games myself, the record improved dramatically but too little too late. The computer manager makes poor decisions frequently. When managing the games, however, there is apparently no way to warm up a pitcher when not on defense and so it makes pinch hitting more difficult on occasion. You also don't have the option of setting the defense in manager mode as you do when actually playing the game. You also don't have the option of skipping to the end of the game when it gets out of hand. Another problem which manifested itself in both seasons was that although roster expansion on September 1st is a part of the game, the roster screen wouldn't allow me to bring up any minor leaguers and so I was stuck with 25 players through the end of the season. The trade deadline worked as expected and the options for finding and offering players are pretty decent.

After the season is over you go through a five round draft period and free agent signings before beginning the next season. Players that you haven't extended before the end of the season go into the free agent pool. I've heard from others that there is a bug where the next season will start with a schedule containing only 10 games but I haven't seen that in either of my seasons. The free agent period lasts "10 days" and allows you to make offers and see how the market is progressing. What I've noticed in both trying to re-sign my own players and sign free agents is that when a player says he'll sign with your team for x dollars over y years he's not kidding. I've tried numerous times to offer slightly more money in exchange for fewer years (it seems almost everybody wants a 3-5 year contract) but to no avail. As a result I ended up signing a couple of free agents who would take 2-year contracts. I should have mentioned that before the season you're given a budget with which to work and for the Cubs that was around $75M for the 2006 season. If you reach various milestones you'll receive additional money the following season.

There's also an interesting player morale system whereby each player has a rating that can be boosted by changing his batting order position or adjusting his playing time. Generally it seems that aside from these two variables the aggressiveness settings of the manager also play into how the player feels as well as whether he'll sign with your team as a free agent. You can call team meetings to try and affect the morale but I haven't messed around with this feature that much.

One of the features I liked very much is the Inside Edge scouting reports. In Franchise mode you can use some of your budget to purchase reports for entire teams, hitters, or pitchers and of course they give you a slight advantage. I made sure to purchase them for my primary division rivals and of course it's interesting just to look through them since they're based on actual data. Those scouting reports translate into the live action mode as well and when pitching suggest pitches and location and when hitting reveal likely zones for the upcoming pitch.

When I started my second season things started to go haywire. I was able to trade Ben Sheets and Glendon Rusch for Derek Jeter to shore up my hole at shortstop and pick up Scott Kazmir but otherwise started the season with roughly the same roster as the year before. This time, however, the gaming engine seemed to allow my pitchers to dominate at the same time my hitting took off. I was also able to swing a deal for Jason Bay and Craig Miller at mid season by sending Jacque Jones and Michael Barrett to Pittsburgh. The end result of the pitching dominance was that my team went 120-42 with Zambrano and Mark Prior shutting down the rest of the league (throwing three no-hitters between them) and finishing 1-2 in the Cy Young balloting with Prior first this time. Zambrano went 33-4 with a 1.10 ERA and Prior 29-3, 1.05. Both pitched around 300 innings with Prior striking out 481 batters. The third and fourth starters, Jerome Williams and Scott Kazmir, both won 19 games with ERAs in the low 2s and struck out over 200 batters each. On the offensive side Derrek Lee hit 42 home runs and drove in 137 while hitting .327, and Ramirez hit 27 home runs, Pena 29, and Bay 24. That wasn't the strangest part however. Juan Pierre, who I tried to trade in the offseason, hit .342 with 78 walks and stole 172 bases in 198 attempts. As a result he won the MVP with Lee coming in second. Weird.

In the playoffs I was ousted in the first round by the Mets losing two extra-inning games but did pick up $3M to work with in the following season.

In perusing the other teams' statistics it's clear that the rest of the league must have hit something like .240 while my team hit .285. As I mentioned I didn't mess around with any of the settings and wanted to see how the game played out of the box. It'll be interesting to see if the trend continues as I move into the '08 season or if I'll need to start adjusting the settings.

I've not read too many kind things about the game in general (there was a freeze bug that has some workarounds and a patch) but I haven't really been displeased overall. The game did freeze on me initially but after replacing the entire Xbox unit I've not had it lock since. I'll play a live action game occasionally and the game play is decent with the pitching controls and catcher placement being especially well-done. The game does come with some historical teams that can be unlocked and so once I got the cheat codes I was excited to play the 1969 Mets and 1976 Reds. Much to my disappointment the rosters of those teams are populated with no-name players I assume because of licensing issues (the same reason Barry Bonds does not appear in the game). The game could also use more historical stadiums and I've had trouble trying to play in the World Baseball Classic mode which should allow you to play a team all the way through the tournament.

In the final analysis yes the game is a little buggy but I've certainly enjoyed it.
