Showing posts with label General Sabermetrics. Show all posts

Thursday, January 31, 2008

Yin and Yang

I thought I'd finish off January with a couple of links...

  • Lovin' on Bannister. MLB Trade Rumors did a great interview with Royals pitcher Brian Bannister in three parts. Part 3 is where it gets really good, as Brian reveals that he does his own statistical analysis (we already knew he was a BP reader) and gives us his take on DIPS theory. Several well-known analysts have already started discussing his insights and I'm sure we'll be seeing more in the future. What really encourages me is recognizing the value that thoughtful players like Bannister or Jeff Francis can provide, and it makes me wonder how teams are utilizing those resources in partnering with the analytical resources they have.


  • Not so Much Insight. Observations of the anti-Bannister kind were offered by MLB.com reporter Marty Noble back in early January. In a previous article Noble used RBIs per 100 at bats to compare newly acquired catcher Brian Schneider and departed backstop Paul Lo Duca. In the response I've linked he tries to explain himself, and although his main point (that Schneider and Lo Duca are no longer as different offensively as some people claim) is valid, there's simply no way he can get out of the hole he's dug. He has the right idea, namely that opportunities are important and rate statistics rather than counting stats are key, but of course he fails to select the right kind of opportunities for the comparisons he's going for.

    That said, he dropped two little gems that I couldn't pass up:

    Computers have contributed to a current glut of statistics that, to a degree, distort the picture. We have so many now that we lose focus on what is most important. The objective of the game is to win, and to win a team must outscore its opponent. Nothing, therefore, is more important than runs -- both producing and preventing them.

    To what degree and to which statistics is he referring? Actually, I would argue that by translating traditional statistics into the currency of runs (assuming an accurate weighting), the vast majority of the supposed "glut" of statistics (VORP, BaseRuns, Linear Weights, defensive metrics, base running, etc.) has served to paint a more accurate picture of "what is most important": creating the run differential that leads to winning games.

    That Lo Duca might have had a higher on-base percentage or slugging percentage means less to me than the number of runs he produced. The next time a team wins a game because it produced a higher on-base mark and scored fewer runs than its opponent, please alert me.

    Here I think there are two points of confusion.

    First, it turns out that the very combination of metrics he mentions, on-base percentage plus slugging percentage (OPS), is a very strong predictor of runs produced since it accounts for the key ingredients (getting on base, moving runners, and avoiding outs), whereas things like RBIs per 100 at bats are problematic because they measure only one part of the equation. Additionally, by neither accounting for context nor understanding how other metrics predict offensive output, Noble ends up inverting the relationship between offensive production and the statistics he discusses.

    Second, in his last sentence he stumbles across the problem of scale. It is tautological to say that run differential is a perfect predictor of wins and losses at the level of an individual game. Therefore RBIs and runs scored (at least for the offense) take on primary significance in that context and at that scale, while OBP and SLUG are less predictive. However, once you raise the aggregation level, those counting stats take on less significance in player evaluation because a particular player's role in generating offense is about more than the tallying of the end result (an RBI or run scored). It quickly becomes the case (well before the level of seasons) that OBP+SLUG and other derivative metrics are more indicative of offensive contribution and therefore of wins and losses.

    This confusion of effects at various scales reminds me (not coincidentally because I'm now reading this book) of one of the primary themes in the writing of the late Stephen Jay Gould. He often railed against the position of ultra selectionists or adaptationists who insisted that natural selection was the exclusive driver and shaper of the pattern of life on earth. Gould contended that evolution operated differently at different levels through various mechanisms and that what worked at one level did not necessarily have power at another. For example, he argues that while natural selection works through differential reproductive success to build adaptations at the level of individual organisms (coloring, wings, claws, size, etc.) those adaptations may have little or nothing to do with survival at the higher level of species. In one of his favorite examples he liked to point out that the small size and adaptability of mammals during the age of the dinosaurs was likely the result of the domination by dinosaurs in the niches available to larger animals. However, when the meteor struck it was those "negative" traits that allowed the mammals to survive but doomed the dinosaurs.
    Friday, November 16, 2007

    Breaking News!

    No, it's not Barry Bonds. More than the content, the title of this article from the "world wide leader" struck me as particularly funny...

    Players tend to produce less as they enter mid-to-late 30s

    Really? Do you think? Also liked this quote:

    But studies by the Red Sox's baseball operations staff have shown that
    players are at risk for a drop-off in production as they enter their mid-to-late
    30s.

    But more seriously, this is a reminder that what you may think is common knowledge may not necessarily be so.

    Tuesday, August 21, 2007

    Picking on Pierre

    I just couldn't let this article on the Dodgers' Juan Pierre pass by. It starts off like so:

    What is it about the on-base percentage that a player like Juan Pierre -- who leads the Dodgers in at-bats, runs scored, hits, stolen bases, triples and games played -- gets knocked for not having his higher than .350?

    Pierre has been one of the most consistent players in the Dodgers lineup this season. He plays every day (395 consecutive games, which is the longest active streak in the Majors), makes diving catches in center field on a regular basis and steals second just about every time he gets on base, yet his OBP evidently isn't cutting it.


    Essentially, the author is arguing that acquiring playing time, and thus the opportunity to rack up those counting stats, automatically means you're a good hitter. Omar Moreno, playing in all 162 games for the Pirates in 1980, also led his team in all those categories, including walks. In the end, though, he was 9 runs below average offensively because in addition to accumulating 87 runs scored, 13 triples, and 96 stolen bases, he made 551 outs. Yes, 551. Last year for the Cubs Pierre was also 9 runs below average while playing in all 162 games, and he made 526 outs. This season he's 8 runs below average and has made almost 400 outs.

    While there is certainly a strong link between playing time and offensive performance and being able to stay on the field is in itself valuable, in Pierre's case the perception of performance is apparently what counts.

    This season, Pierre leads the Dodgers with 147 hits. He is fifth in the NL with 45 multi-hit games, he leads the Majors with 14 sacrifice bunts and he's second in the Majors only to Jose Reyes with 50 stolen bases, and yet his OBP supposedly isn't cutting it.

    Well. Multi-hit games are heavily dependent on playing time, sacrifice bunts are nothing to brag about, and while his 50 stolen bases against only 9 caught stealing is very good, historically he's a break-even base stealer at best.

    In 2003 and 2004 with the Marlins Pierre was an above average offensive player to the tune of 10 and 14 runs respectively. In those seasons his OBP was a healthy .361 and .374 (he also had a .378 OBP for the Rockies in 2001 but was still 3 runs below average in the pre-humidor era). The reason of course is that as Pierre himself explains:

    When I'm hitting good, my on-base percentage is high and that's just the way it is. The Dodgers knew that before I came here. It is what it is. I just go out there and play the game, and I don't get caught up in all of this.

    Indeed, in those three seasons his batting averages were .305, .326, and .327. The problem is that what the Dodgers should have paid attention to is that Pierre hadn't cracked .300 since 2004, and going into his age 29 season it wasn't exactly likely he would revert to his form as a 23 through 26 year-old.

    In order to justify his low OBP the author makes much of his ability to disrupt the pitcher and comes up with this quote from Grady Little.

    He's a disruptive force when he's on base. The other team has to be concerned with him regularly and it disrupts the pitcher.

    Unfortunately for the Dodgers there is little evidence for this, and in fact there is some evidence to the contrary: as documented in The Book, "disruptive" baserunners tend to disrupt the batter more than the defense.

    Where the author should have focused perhaps was on Pierre's other contributions on the bases. Since 2000 in my four baserunning metrics he's a positive 27.9 runs making his biggest contribution in advancing on hits to the tune of 18.6 runs. When you add those 27 runs to his total runs above average he comes out 1 run to the good. In other words, offensively over the past almost eight seasons he's been average. Unfortunately, his ledger was heavily stacked in 2003 and 2004 and so in the other six seasons he's been below average.

    On the other side of the coin he's also been a below average defender since 2004 and his lack of arm strength is well known. Contrasted with Omar Moreno, who had a monster year with the glove to the tune of saving 17 runs above average in 1980 and who was an above average defender until his latter days with the Yankees, Pierre doesn't stack up very well.

    Don't get me wrong. When Pierre was with the Cubs I enjoyed watching him play and was a little sad to see him go (but not enough to wish the Cubs had signed him at that price tag of course).

    Finally, the author sums up his point by saying...

    Whether his OBP is at .324 or .350, Pierre will continue to do the small things for the Dodgers. He bunts, he steals bases, he legs out triples and robs balls in the outfield, yet he'll constantly be scrutinized because he doesn't get on base enough -- that's just the way it's going to be.

    And that's just the problem. The things he can do are indeed small things, and when he doesn't get on base those small things simply aren't enough to compensate for the big things like power, which he does not possess.

    He's an exciting player to watch no doubt about it. Just don't pretend that he's a plus offensively when at this point in his career he's clearly not.

    Friday, August 10, 2007

    Ankiel and Bressler

    Back in September of 2005 I wrote a column titled "Rube Bressler Redux?" for The Hardball Times that chronicled the first season of Rick Ankiel's transition from pitcher to hitter. At that time Ankiel had completed a 2005 season that I summarized this way:

    This was the second [his demotion to low-A Quad Cities in late May] stop for Ankiel as he started the year at Double-A Springfield, but did not fare well in his first 60 at-bats, getting off to a 1-for-20 start and hitting around .160 before getting sent down. With the Swing he continued to improve and wound up hitting .270/.368/.514 with 10 doubles and 11 homeruns in 212 plate appearances. His strikeouts were a bit high (37), though he showed a little patience at the plate, collecting 27 walks.

    That good showing in the Midwest League earned him a trip back to Springfield on August 3, and this time it appears he took advantage of it. In the remainder of the season he would hit .300 with 10 homeruns and drive in 28 runs in the 28 games he played, including a 3-for-4 performance with two homeruns and three RBI on the final day of the season. His late season surge even prompted some talk of a September call-up.

    His combined line at Springfield was .243/.295/.515 while overall for the season he hit .259 with 17 doubles, 21 homeruns, and 75 RBIs in 321 at-bats and 85 games. Although he still has a long way to go, I'm sure he and probably the Cardinals viewed this season as a success.

    Late in June Ankiel was asked about making the transition from pitching to the outfield.

    "Not very many people have been successful at it," he replied. "To conquer that quest would be very self-fulfilling."

    Well, after a 2006 season in which he was shelved all year with patellar tendonitis, it appeared that perhaps his window was slipping away. This season, however, he came back at the AAA level and belted 32 homeruns in just over 400 plate appearances before being called up on Thursday and hitting a three-run homer in last night's 5-0 Cardinals victory.

    When I wrote the original article I was interested in how many players had successfully made the transition from full-time pitcher to full-time position player in the history of baseball. It turns out that only five others have ever totaled more than 50 games pitched and 50 games played at other positions in the major leagues. Click on the link above to read about their stories, including that of Rube Bressler for whom the article is titled, but suffice it to say that Ankiel appears as if he'll become the sixth. Whether he goes on to be as successful as any of the others remains to be seen; given his age and plate discipline, that still seems a somewhat remote possibility.

    The interesting thing is that all five of the other players completed their transition before 1940. The final section of the Bressler article details an explanation of just why it is more difficult today to make such transitions than it was in the days of Rube Bressler. Simply put, the argument, first detailed by Stephen J. Gould in an essay discussing the disappearance of the .400 hitter, is that these transitions essentially ended after the war because of the increasing level of play that comes closer to the "right wall" of human ability, coupled with the stabilization of the game itself. In other words, over time baseball players, like other athletes, including sprinters and swimmers, have become better and as the level of play has increased, it has had the side effect of decreasing the variation among players. For players like Bressler and company there was therefore more opportunity to make the transition because good athletes of their ilk could more easily excel beyond the more numerous lesser athletes that populated baseball in the early part of the century.

    The evidence for an increasing level of play was the topic of a column I wrote earlier this year on Baseball Prospectus titled "The Myth of the Golden Age" and in particular one line of evidence fits nicely with Ankiel and Bressler. As described in that column:

    Pitchers are increasingly selected from the amateur ranks based on their extreme right-hand-tail-of-the-distribution excellence in pitching. While there is certainly some athletic and experiential crossover that allows them to hit better than the general population (as evidenced by the best players at early ages being both the best hitters and pitchers), their hitting skill is not selected for in the evolutionary sense and so should remain relatively constant over time. In other words, pitchers simply don't hit as well in the modern game, not because they are not just as skilled (or slightly more so) with the bat as their predecessors, but because the selected skills of all players have increased over time.

    What this all boils down to is that by measuring the relative success of pitchers at the plate we can at the same time, at least indirectly, measure the increasing level of play. The following graph documents that relation using OPS normalized by park and breaks it up by league.



    What I find fascinating about this graph is that it not only shows the increasing difficulty that pitchers have when competing against their peers from the batter's box, it simultaneously gives some information on the relative level of play amongst competing leagues. You'll notice that the American Association and the Union Association of the 1880s and 1890s record higher relative values, ostensibly because the leagues were not as difficult. The same applies to the Federal League (1914-1915) and to the American League relative to the National League from 1901-1920 and again after integration through the early 1970s. Obviously, after the introduction of the designated hitter in 1973 the league differences can't be measured and so the graph doesn't reflect the subsequent time period. However, if one were to plot those just in the NL, the decline would continue to the point where today the relative production is well under .500.

    All of this has conspired to make Rick Ankiel's story even more compelling and so I for one am going to enjoy it while it lasts.

    Friday, July 20, 2007

    Bunting and Izturis

    My column this week extends last week's discussion of bunting for a hit by examining the strategy behind it using run expectancy. I didn't cover all the nuances I'm sure but wanted to make the article a basic introduction to using the break even formula and how that can apply to a strategy. I also answer a few reader questions along the way.
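The break-even idea mentioned above can be sketched in a few lines of Python. This is a minimal illustration rather than the column's actual calculation: the function name and the three run expectancy values are hypothetical placeholders, and real values would come from a run expectancy table for the base/out states involved.

```python
def break_even_rate(re_before: float, re_success: float, re_failure: float) -> float:
    """Success rate at which a play neither adds nor costs expected runs.

    re_before  -- run expectancy if the play is not attempted
    re_success -- run expectancy after a successful attempt
    re_failure -- run expectancy after a failed attempt
    """
    if re_success <= re_failure:
        raise ValueError("success must leave a higher run expectancy than failure")
    # Solve p * re_success + (1 - p) * re_failure = re_before for p.
    return (re_before - re_failure) / (re_success - re_failure)


# Hypothetical values (assumptions for illustration only):
RE_BEFORE = 0.50   # e.g. bases empty, nobody out
RE_SUCCESS = 0.90  # e.g. runner on first, nobody out
RE_FAILURE = 0.27  # e.g. bases empty, one out

rate = break_even_rate(RE_BEFORE, RE_SUCCESS, RE_FAILURE)
print(f"Break-even success rate: {rate:.3f}")
```

With these made-up inputs, an attempt that succeeds more often than the printed rate would raise expected runs, and one that succeeds less often would lower them.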

    By the way, hate to say I told you so but in my August 10th column of last year I noted:

    In picking up Izturis and his $3.2 million contract for this year, $4.25 million for next year [note: the Cubs sent $1.4m to the Pirates in the deal] and club buyout for $300,000 in 2008 (that they'll likely be exercising), the Cubs have elevated the mistake they've been making with reserves to not one but two starting positions.

    The core problem is that in Izturis they now have a player who, at his best in 2004, recorded a WARP1 of 3.5. This was when his batting line exceeded his career marks by +.28/+.36/+.43 and he won a Gold Glove. His more typical seasons in 2003 and 2005 were at 2.6 and 2.0, respectively, with his offensive performances actually below the level of a replacement player. In 2005 he ranked 30th of 31 shortstops with 300 or more plate appearances in VORP at -4.2 (Cristian Guzman was 31st with a whopping -14.9)--a trend he has continued thus far in 2006.

    Much of the recent hype surrounding Izturis has been built on the strength of a great April and May of 2005 when he hit .342 and earned an All-Star selection. In other words, the likely outcome of the deal is to extend the search for the next Ricky Gutierrez a year and a half while enduring Neifi-like production at shortstop to go with above average defense (The Fielding Bible had Izturis at +10, +19, and +4 in 2003-2005 ranking him 7th, 2nd, and 15th respectively and 6th overall during the time period). Of course that's perhaps even optimistic since Hendry and company may also have repeated their Garciaparra move ("fool me once, shame on me…") by obtaining a player fresh off an injury (degenerative arthritis in his right elbow that required Tommy John last September).

    And of course what it also did was relegate Ronny Cedeno first to second base and then to the minor leagues where he is now hitting .369 and showing some power (.562). I'm not convinced Ryan Theriot is the real deal and would like to see Cedeno get another shot this year.

    Friday, June 22, 2007

    Friday Links

    A few things from this week:

  • Datacaster - A little article on MLBAM datacaster Chris Johnson, who scores for the Dodgers and Angels.


  • State of the BABIP - A great overview and state of the research on batting average on balls in play by Derek Jacques.


  • Changing Course - Joe Sheehan takes a look at the practice of firing managers midseason. I wrote a little about this awhile back in wondering why midseason changes have declined in recent years.

    Be that as it may, thirteen times, beginning with the 1932 Cubs, teams have changed managers and gone on to postseason play. The last was the 2004 Astros, who replaced Jimy Williams as the team stood at 44-44. Under Phil Garner they went 48-26 the rest of the way. For Williams it was the second time he had been replaced as the skipper of an eventual postseason team. The first, and the largest difference in terms of winning percentage, was the 1989 Blue Jays. After enduring a 12-24 (.333) start under Williams, General Manager Pat Gillick hired Cito Gaston on May 31st as the interim manager. That interim title was quickly forgotten as the Jays reeled off a 77-49 (.611) record with the help of acquisitions Lee Mazzilli and Mookie Wilson from the Mets, leading to a 20-9 August that saw them pull into a first place tie with the surprising Orioles as the month closed. After holding a slim lead most of the month of September, the Jays hooked up with the Orioles in a three game series at the new SkyDome on the season's final weekend with the Orioles one game back. The Blue Jays took the first two games of the series 2-1 and 4-3 to seal the deal.
    Saturday, June 16, 2007

    Where They Aint Redux

    Responding to Tango's comments, here is the new table with a column added, totals at the bottom, and excluding bunts.


    Year  Type       BIP       H  Non-HR      TB     %H  %Non-HR   SLUG  %Non-HR2
    2003  Fly      36744    9898    5390   27314  26.9%    14.7%  0.743     16.7%
    2004  Fly      37052   10494    5786   28619  28.3%    15.6%  0.772     17.9%
    2005  Fly      37268   10442    5913   28207  28.0%    15.9%  0.757     18.1%
    2006  Fly      37712   10863    6034   29557  28.8%    16.0%  0.784     18.3%
    ----------------------------------------------------------------------------------
    2003  Ground   60783   14355   14353   15687  23.6%    23.6%  0.258     23.6%
    2004  Ground   60212   14267   14267   15623  23.7%    23.7%  0.259     23.7%
    2005  Ground   60373   14092   14092   15388  23.3%    23.3%  0.255     23.3%
    2006  Ground   59912   14367   14367   15690  24.0%    24.0%  0.262     24.0%
    ----------------------------------------------------------------------------------
    2003  Line     25846   18985   18289   26505  73.5%    70.8%  1.025     72.7%
    2004  Line     25663   18951   18208   26495  73.8%    71.0%  1.032     73.1%
    2005  Line     25425   18649   18162   25240  73.3%    71.4%  0.993     72.8%
    2006  Line     25902   19012   18456   26072  73.4%    71.3%  1.007     72.8%
    ----------------------------------------------------------------------------------
    2003  Pop      10853     168     168     207   1.5%     1.5%  0.019      1.5%
    2004  Pop      11007     226     226     268   2.1%     2.1%  0.024      2.1%
    2005  Pop      11123     223     223     258   2.0%     2.0%  0.023      2.0%
    2006  Pop      10656     238     238     309   2.2%     2.2%  0.029      2.2%
    ----------------------------------------------------------------------------------
    2003           30639      15      14      18   0.0%     0.0%  0.001      0.0%
    2004           31657       0       0       0   0.0%     0.0%  0.000      0.0%
    2005           30463       1       0       4   0.0%     0.0%  0.000      0.0%
    2006           31558      47      47      93   0.1%     0.1%  0.003      0.1%
    ----------------------------------------------------------------------------------
    Totals        660848  175293  154233  281554  26.5%    23.3%  0.426     24.1%
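For readers who want to check the derived columns, here is a small Python sketch that recomputes them from the raw counts in one row. The function and key names are my own, and the definition of %Non-HR2 (non-homerun hits divided by balls in play that were not homeruns) is inferred from the numbers in the table rather than stated in the post, so treat it as an assumption.

```python
def batted_ball_rates(bip: int, hits: int, non_hr_hits: int, total_bases: int) -> dict:
    """Recompute the rate columns for one Year/Type row of the table."""
    home_runs = hits - non_hr_hits  # homeruns are the hits not counted in Non-HR
    return {
        "pct_h": hits / bip,                # %H
        "pct_non_hr": non_hr_hits / bip,    # %Non-HR
        "slug": total_bases / bip,          # SLUG
        # Assumed definition of %Non-HR2: remove homeruns from the denominator too.
        "pct_non_hr2": non_hr_hits / (bip - home_runs),
    }


# 2003 fly ball row from the table above.
rates = batted_ball_rates(bip=36744, hits=9898, non_hr_hits=5390, total_bases=27314)
print(
    f"%H {rates['pct_h']:.1%}  %Non-HR {rates['pct_non_hr']:.1%}  "
    f"SLUG {rates['slug']:.3f}  %Non-HR2 {rates['pct_non_hr2']:.1%}"
)
```

The same function can be run over the other rows; the line drive rows are a useful check because they contain enough homeruns for %Non-HR and %Non-HR2 to differ.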

    Friday, June 15, 2007

    Hit 'em Where They Aint

    I was running a few numbers for a colleague and so I thought I'd post them. Below is a table going back to 2003 that shows the number of balls put into play by vector and the numbers and percentages that went for hits and non-homerun hits. Finally we have the slugging percentage earned on those batted balls.


    Year  Type       BIP       H  Non-HR      TB     %H  %Non-HR   SLUG
    2003  Fly      36870    9898    5390   27314  26.8%    14.6%  0.741
    2004  Fly      37168   10494    5786   28619  28.2%    15.6%  0.770
    2005  Fly      37268   10442    5913   28207  28.0%    15.9%  0.757
    2006  Fly      37856   10863    6034   29557  28.7%    15.9%  0.781
    ------------------------------------------------------------------------
    2003  Ground   62253   14979   14977   16312  24.1%    24.1%  0.262
    2004  Ground   61577   14841   14841   16198  24.1%    24.1%  0.263
    2005  Ground   61809   14669   14669   15965  23.7%    23.7%  0.258
    2006  Ground   61131   14910   14910   16233  24.4%    24.4%  0.266
    ------------------------------------------------------------------------
    2003  Line     25852   18986   18290   26506  73.4%    70.7%  1.025
    2004  Line     25666   18951   18208   26495  73.8%    70.9%  1.032
    2005  Line     25425   18649   18162   25240  73.3%    71.4%  0.993
    2006  Line     25909   19013   18457   26073  73.4%    71.2%  1.006
    ------------------------------------------------------------------------
    2003  Pop      11124     173     173     212   1.6%     1.6%  0.019
    2004  Pop      11285     236     236     278   2.1%     2.1%  0.025
    2005  Pop      11370     230     230     265   2.0%     2.0%  0.023
    2006  Pop      10886     239     239     310   2.2%     2.2%  0.028
    ------------------------------------------------------------------------
    2003           30639      15      14      18   0.0%     0.0%  0.001
    2004           31657       0       0       0   0.0%     0.0%  0.000
    2005           30463       1       0       4   0.0%     0.0%  0.000
    2006           31558      47      47      93   0.1%     0.1%  0.003


    So from this you can see that just under 30% of fly balls go for hits, although only about 16% go for non-homerun hits. About a quarter of ground balls and three quarters of line drives fall in, and 1 in 50 popups are hits. The remaining rows, which have no type, are not actually balls in play but rather strikeouts and unclassified plays in the data.

    Monday, May 28, 2007

    The 100 RBI Men

    A reader makes the following observation:

    "Carlos Delgado is currently on pace to get 100 RBIs for the Mets this year, despite sporting an abysmal .234 BA, 306 OBA and 359 SL%. This helps to make the case that RBIs are as much a team stat as an individual one, as the fellas hitting in front of Delgado (Jose Reyes, Carlos Beltran, and David Wright) are pretty adept at getting on base, which allows Delgado to make outs 7 times out of ten and still accumulate a decent number of RBIs. Simply put, he is getting a boatload of opportunities. Delgado also is lucky enough to bat cleanup on a team that is doing very well in the standings, so the manager is unlikely to move him out of the cleanup slot anytime soon."

    Well, assuming Delgado plays in 144 games this season as he did in both 2005 and 2006, he'll wind up with 91 RBI and won't break the 100 mark. And as of tonight Delgado's OPS is 665; taking 2006 league norms for league OPS and the park effect of Shea Stadium, that puts Delgado on pace for a league normalized and park adjusted OPS of 92. In any case, that doesn't directly bear on the question which follows...

    "So, anyways, my question: Who are the worst hitters in MLB history to get 100 RBIs in a season in MLB? And what are their stories? Were these players in similar situations to Delgado, or do they have some other tale to tell?"

    To answer the first part and take a crack at the second, here are the "top" 50 players with 100 or more RBI in a single season, ranked by lowest normalized and park adjusted OPS. There have been 1,543 player seasons with 100 or more RBI since 1901.


    Name              Year   PA    G  RBI  OPS  NOPS/PF
    Joe Carter        1997  668  157  102  683       89
    Vinny Castilla    1999  674  158  102  809       92
    Ruben Sierra      1993  692  158  101  678       93
    Tony Armas        1983  613  145  107  707       94
    Paul O'Neill      2000  628  142  100  760       94
    Ray Pepper        1934  598  148  101  732       94
    Marv Owen         1936  655  154  105  750       96
    Joe Carter        1996  682  157  107  782       96
    Glenn Wright      1927  626  143  105  716       96
    Joe Carter        1990  697  162  115  681       97
    Travis Fryman     1996  688  157  100  766       97
    Joe Randa         2000  665  158  106  781       97
    Jeff Francoeur    2006  686  162  103  742       98
    Jeff Cirillo      2000  684  157  115  869       98
    Tony Batista      2004  650  157  110  728       98
    Torii Hunter      2003  642  154  102  762       99
    Ray Jablonski     1953  640  157  112  735       99
    Joe Pepitone      1964  647  160  100  698      100
    Ernie Banks       1969  629  155  106  725      100
    Carlos Beltran    1999  723  156  108  791      100
    Bill Buckner      1986  681  153  102  733      100
    Bill Brubaker     1936  620  145  102  736      101
    George Bell       1992  670  155  112  712      101
    Garret Anderson   2001  704  161  123  792      101
    Travis Fryman     1997  657  154  102  766      101
    Andres Galarraga  1995  604  143  106  842      101
    Chili Davis       1993  645  153  112  767      101
    Eddie Robinson    1953  685  156  102  735      101
    Wally Pipp        1923  634  144  108  749      101
    Willie McGee      1987  652  153  105  746      101
    Bing Miller       1930  654  154  100  795      101
    Pinky Higgins     1938  603  139  106  794      101
    Butch Hobson      1977  637  159  112  789      101
    Andruw Jones      2001  693  161  104  772      101
    George Kelly      1929  632  147  103  760      101
    Gee Walker        1939  645  149  111  773      101
    Moose Solters     1936  676  152  134  802      101
    Ed Sprague        1996  670  159  101  821      101
    Al Simmons        1924  644  152  102  774      102
    Ruben Sierra      1987  696  158  109  771      102
    Billy Rogell      1934  679  154  100  766      102
    Richie Sexson     1999  525  134  116  818      102
    Vernon Wells      2002  648  159  100  762      102
    Pinky Whitney     1930  662  149  117  849      102
    Pinky Whitney     1928  636  151  103  768      102
    Matt Williams     1997  636  151  105  795      102
    Glenn Wright      1924  662  153  111  744      102


    I know many of you had an inkling that Joe Carter would take the top spot. He also appears at numbers 8 and 10 (and number 53 for his 1987 season, 104 for his 1989 season, 110 for his 1993 season, and 146 for his 1994 season...you get the idea). But given the poor light in which RBI have been cast in recent years, perhaps surprisingly only 17 times (a little over 1%) has a player ever driven in 100 runs while not being at least league average. So while getting to 100 RBI doesn't ensure that the hitter is an elite offensive performer, it is a pretty good proxy. Put another way, one needn't be a great hitter to accrue 100 RBI but great hitters often get to 100 RBI. And so in the absence of better metrics it's not surprising that 100 RBI became shorthand for a great offensive performance. This is illustrated by the fact that the "average" 100 RBI man had a NOPS/PF of 125 and by the histogram below, which shows their distribution:



    Contributing to the idea that RBI equals greatness is the ongoing debate over the significance and prevalence of clutch hitting. A player with a lot of RBI is often automatically assumed to be a clutch performer, as Joe Carter was.

    That said, given that we now have much more granular means (with OPS actually being on the lower end) of estimating the run contribution of individual hitters, that usage should wane some although it may take generational turnover to bring about its demise. For a little deeper perspective on traditional and more modern methods of gauging a player's contribution see chapter 1 of Baseball Between the Numbers.

    But in getting back to the question at hand, perusing the list you see a few factors that certainly play into reaching the century mark:

  • Performance - As mentioned above there is no doubt that in large part getting to 100 RBI requires a strong performance. From the graph above (the pink cumulative line that uses the y-axis on the right) you can see that fully 75% of those who have driven in 100 runs were 15% or more above league average and 60% were 25% or more above average.


  • Park - Vinny Castilla and Jeff Cirillo in the top 20 show that playing in a park where lots of runs are scored certainly helps, and of course by adjusting for park we don't give them any benefit.


  • Era - Twelve of the top 20 players either played in the 1930s or since 1993 which were the two highest scoring eras in modern baseball history. Just like playing in a park where runs are more plentiful allows lesser hitters to drive in more runs, playing in an expanding offensive environment devalues the 100-RBI mark.


  • Teammates - Certainly the reader makes a good point about teammates having to be on the bases. You could probably make a case that Paul O'Neill in 2000 with the Yankees, Marv Owen for the Tigers in 1936, Glenn Wright with the Pirates in 1927, and Bill Buckner with the 1986 Red Sox all fall into this category where the individual was part of a strong offensive team from top to bottom.


  • Lineup Position - It probably comes as no surprise that many of the players on this list, and those who accrue 100 RBI generally, are middle-of-the-order hitters. It probably comes as a bit more of a surprise that, as shown in the graph below, the number three position in the order actually hits with relatively fewer runners on base than does any other lineup position save the leadoff and second spots.



  • Plate Appearances - More generally, the latter two contributing factors as well as this one fall into the category of opportunity. A player has to come to the plate often enough to reach the 100 RBI mark. Probably nothing on this list better exemplifies this than Ruben Sierra's 1993 performance with the Oakland A's. In that season Rickey Henderson hit leadoff for half the year, and Sierra hit third in an AL lineup (which increases the opportunities for the third hitter) and racked up almost 700 plate appearances. On average, the 100 RBI men had 651 plate appearances. Yes, Rudy York did drive in 103 runs in just 417 plate appearances for the 1937 Tigers, but all told just 213 players (14%) have ever driven in 100 runs while not coming to the plate at least 600 times.


    So will Carlos Delgado get to 100 RBI, and if so what does it mean? Speaking only in generalities and knowing only that fact about his performance, we'd have to guess that he was a pretty good hitter. However, that doesn't completely rule out the possibility that his park, era, teammates, lineup position, and playing time all conspired to push him past the 100 RBI barrier.

    Wednesday, April 11, 2007

    Assignment Discovery: Sabermetrics

    I was alerted by a fellow Cubs fan that the program "Statistics and Data Analysis in Sports" will be airing on the Discovery Channel on April 17th. The description of the show on their web site says:

    Using only a calculator, a stat book, and some custom equations, a new generation of baseball statisticians believes that it's possible to predict a player's true value to his team. The results will surprise you.
    It'll be interesting to see if they're really talking about "prediction" or simply quantification after the fact. The former has its limits while the latter is very well understood. I'm also interested in these types of presentations since they often misrepresent and distort subjects that are somewhat technical. I wrote about two depictions of sabermetrics back in November in a column titled "The Numb3rs Game" on BP.

    Saturday, January 20, 2007

    The Power of Squares

    Nice article by Dave Studeman over at Baseball Analysts on Pythagoras, run estimation and Bill James. I especially liked the following:

    "The power of two is everywhere in life. E=MC squared, after all. When you move closer to a light, cutting the distance in half, the light doesn't become twice as bright...So when Bill James discovered that the nature of runs to winning is squared, it seemed as though something essential and fundamental had been discovered."

    Another example of this phenomenon is the inverse-square law of gravitation, which Newton published in his Principia but which was first hinted at by Ismael Bullialdus and known (or guessed at) in some form to the likes of Christopher Wren, Edmond Halley, and Robert Hooke, as told in James Gleick's wonderful biography of Isaac Newton titled Isaac Newton.
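The squared relationship between runs and winning that the quote refers to is usually called the Pythagorean expectation: expected winning percentage equals runs scored squared divided by the sum of runs scored squared and runs allowed squared. A minimal Python sketch follows; the function name is my own and the 800/700 run totals describe a hypothetical team, not any club discussed here.

```python
def pythagorean_win_pct(runs_scored: float, runs_allowed: float, exponent: float = 2.0) -> float:
    """Expected winning percentage from runs scored and runs allowed."""
    if runs_scored <= 0 and runs_allowed <= 0:
        raise ValueError("need a positive run total on at least one side")
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Hypothetical team (assumed numbers): 800 runs scored, 700 allowed.
pct = pythagorean_win_pct(800, 700)
print(f"Expected winning percentage: {pct:.3f}")
print(f"Expected wins over 162 games: {pct * 162:.1f}")
```

The exponent is left as a parameter because later refinements of the formula use values other than exactly 2; the default reproduces the original squared form.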

    For more thoughts on run estimation see:

    Run Estimation for the Masses
    A Closer Look at Run Estimation