Dan Agonistes: General Sabermetrics

Thursday, January 31, 2008

Yin and Yang

I thought I'd finish off January with a couple of links...

Lovin' on Bannister. MLB Trade Rumors did a great interview with Royals pitcher Brian Bannister in three parts. Part 3 is where it gets really good as Brian reveals that he does his own statistical analysis (we already knew he was a BP reader) and gives us his take on DIPS theory. Several well-known analysts have already started discussing his insights and I'm sure we'll be seeing more in the future. What really encourages me is recognizing the value that thoughtful players like Bannister or Jeff Francis can provide, and it makes me wonder how teams are using those players in partnership with the analytical resources they have.

Not so Much Insight. Observations of the anti-Bannister kind were offered by MLB.com reporter Marty Noble back in early January. In a previous article Noble used RBIs per 100 at bats to compare newly acquired catcher Brian Schneider and departed backstop Paul Lo Duca. In the response I've linked he tries to explain himself, and although his main point, that Schneider and Lo Duca are no longer as different offensively as some people claim, is valid, there's simply no way he can get out of the hole he's dug. He has the right idea, namely that opportunities are important and rate statistics rather than counting stats are key, but he fails to select the right kind of opportunities for the comparisons he's going for.

That said, he dropped two little gems that I couldn't pass up:

Computers have contributed to a current glut of statistics that, to a degree, distort the picture. We have so many now that we lose focus on what is most important. The objective of the game is to win, and to win a team must outscore its opponent. Nothing, therefore, is more important than runs -- both producing and preventing them.

To what degree and to which statistics is he referring? Actually, I would argue that by translating traditional statistics into the currency of runs, assuming an accurate weighting, the vast majority of the supposed "glut" of statistics (VORP, BaseRuns, Linear Weights, defensive metrics, base running, etc.) has served to paint a more accurate picture of "what is most important" - creating the run differential that leads to winning games.

That Lo Duca might have had a higher on-base percentage or slugging percentage means less to me than the number of runs he produced. The next time a team wins a game because it produced a higher on-base mark and scored fewer runs than its opponent, please alert me.

Here I think there are two points of confusion.

First, it turns out that the very combination of metrics he mentions, on-base percentage plus slugging percentage (OPS), is a very strong predictor of runs produced since it accounts for the key ingredients (getting on base, moving runners, and avoiding outs). Those are exactly what is missing from things like RBIs per 100 at bats, which measure only one part of the equation. Additionally, by neither accounting for context nor understanding how other metrics predict offensive output, Noble ends up inverting the relationship between offensive production and the statistics he discusses.

Second, in his last sentence he stumbles across the problem of scale. It is tautological to say that run differential is a perfect predictor of wins and losses at the level of an individual game. Therefore RBIs and runs scored (at least for the offense) take on primary significance in that context and at that scale, while OBP and SLUG are less predictive. However, once you raise the aggregation level, those counting stats take on less significance in player evaluation because a particular player's role in generating offense is about more than tallying the end result (an RBI or run scored). It quickly becomes the case, well before the level of seasons, that OBP+SLUG and other derivative metrics are more indicative of offensive contribution and therefore of wins and losses.

This confusion of effects at various scales reminds me (not coincidentally because I'm now reading this book) of one of the primary themes in the writing of the late Stephen Jay Gould. He often railed against the position of ultra selectionists or adaptationists who insisted that natural selection was the exclusive driver and shaper of the pattern of life on earth. Gould contended that evolution operated differently at different levels through various mechanisms and that what worked at one level did not necessarily have power at another. For example, he argues that while natural selection works through differential reproductive success to build adaptations at the level of individual organisms (coloring, wings, claws, size, etc.) those adaptations may have little or nothing to do with survival at the higher level of species. In one of his favorite examples he liked to point out that the small size and adaptability of mammals during the age of the dinosaurs was likely the result of the domination by dinosaurs in the niches available to larger animals. However, when the meteor struck it was those "negative" traits that allowed the mammals to survive but doomed the dinosaurs.

Friday, November 16, 2007

Breaking News!

No, it's not Barry Bonds. More than the content, the title of this article from the "world wide leader" struck me as particularly funny...

Players tend to produce less as they enter mid-to-late 30s

Really? Do you think? Also liked this quote:

But studies by the Red Sox's baseball operations staff have shown that players are at risk for a drop-off in production as they enter their mid-to-late 30s.

But more seriously, this is a reminder that what you may think is common knowledge may not necessarily be so.

Tuesday, August 21, 2007

Picking on Pierre

I just couldn't let this article on the Dodgers' Juan Pierre pass by. It starts off like so:

What is it about the on-base percentage that a player like Juan Pierre -- who leads the Dodgers in at-bats, runs scored, hits, stolen bases, triples and games played -- gets knocked for not having his higher than .350?

Pierre has been one of the most consistent players in the Dodgers lineup this season. He plays every day (395 consecutive games, which is the longest active streak in the Majors), makes diving catches in center field on a regular basis and steals second just about every time he gets on base, yet his OBP evidently isn't cutting it.

Essentially, the author is arguing that acquiring playing time, and thus the opportunity to rack up those counting stats, automatically means you're a good hitter. Omar Moreno, playing in all 162 games for the Pirates in 1980 also led his team in all those categories including walks. In the end though he was 9 runs below average offensively because in addition to accumulating 87 runs scored, 13 triples, and 96 stolen bases, he made 551 outs. Yes, 551. Last year for the Cubs Pierre was also 9 runs below average while playing in all 162 games and made 526 outs. This season he's 8 runs below average and has made almost 400 outs.

While there is certainly a strong link between playing time and offensive performance and being able to stay on the field is in itself valuable, in Pierre's case the perception of performance is apparently what counts.

This season, Pierre leads the Dodgers with 147 hits. He is fifth in the NL with 45 multi-hit games, he leads the Majors with 14 sacrifice bunts and he's second in the Majors only to Jose Reyes with 50 stolen bases, and yet his OBP supposedly isn't cutting it.

Well. Multi-hit games are heavily dependent on playing time, sacrifice bunts are nothing to brag about, and while his 50 stolen bases against only 9 caught stealing is very good, historically he's a break-even base stealer at best.

In 2003 and 2004 with the Marlins Pierre was an above average offensive player to the tune of 10 and 14 runs respectively. In those seasons his OBP was a healthy .361 and .374 (he also had a .378 OBP for the Rockies in 2001 but was still 3 runs below average in the pre-humidor era). The reason, of course, is just what Pierre himself explains:

When I'm hitting good, my on-base percentage is high and that's just the way it is. The Dodgers knew that before I came here. It is what it is. I just go out there and play the game, and I don't get caught up in all of this.

Indeed, in those three seasons his batting averages were .305, .326, and .327. The problem is that what the Dodgers should have paid attention to is that Pierre hadn't cracked .300 since 2004, and going into his age 29 season it wasn't exactly likely he would revert to his form as a 23 through 26 year-old.

In order to justify his low OBP the author makes much of his ability to disrupt the pitcher and comes up with this quote from Grady Little.

He's a disruptive force when he's on base. The other team has to be concerned with him regularly and it disrupts the pitcher.

Unfortunately for the Dodgers there is little evidence for this, and in fact there is some evidence to the contrary: as documented in The Book, "disruptive" baserunners tend to disrupt the batter more than the defense.

Where the author should have focused perhaps was on Pierre's other contributions on the bases. Since 2000 in my four baserunning metrics he's a positive 27.9 runs making his biggest contribution in advancing on hits to the tune of 18.6 runs. When you add those 27 runs to his total runs above average he comes out 1 run to the good. In other words, offensively over the past almost eight seasons he's been average. Unfortunately, his ledger was heavily stacked in 2003 and 2004 and so in the other six seasons he's been below average.

On the other side of the coin he's also been a below average defender since 2004 and his lack of arm strength is well known. Contrasted with Omar Moreno, who had a monster year with the glove to the tune of saving 17 runs over average in 1980 and who was an above average defender until his latter days with the Yankees, Pierre doesn't stack up very well.

Don't get me wrong. When Pierre was with the Cubs I enjoyed watching him play and was a little sad to see him go (but not enough to wish the Cubs had signed him at that price tag of course).

Finally, the author sums up his point by saying...

Whether his OBP is at .324 or .350, Pierre will continue to do the small things for the Dodgers. He bunts, he steals bases, he legs out triples and robs balls in the outfield, yet he'll constantly be scrutinized because he doesn't get on base enough -- that's just the way it's going to be.

And that's just the problem. The things he can do are indeed small things, and when he doesn't get on base those small things simply aren't enough to compensate for the big things like power, which he does not possess.

He's an exciting player to watch no doubt about it. Just don't pretend that he's a plus offensively when at this point in his career he's clearly not.

Friday, August 10, 2007

Ankiel and Bressler

Back in September of 2005 I wrote a column titled "Rube Bressler Redux?" for The Hardball Times that chronicled the first season of Rick Ankiel's transition from pitcher to hitter. At that time Ankiel had completed a 2005 season that I summarized this way:

This was the second [his demotion to low-A Quad Cities in late May] stop for Ankiel as he started the year at Double-A Springfield, but did not fare well in his first 60 at-bats, getting off to a 1-for-20 start and hitting around .160 before getting sent down. With the Swing he continued to improve and wound up hitting .270/.368/.514 with 10 doubles and 11 homeruns in 212 plate appearances. His strikeouts were a bit high (37), though he showed a little patience at the plate, collecting 27 walks.

That good showing in the Midwest League earned him a trip back to Springfield on August 3, and this time it appears he took advantage of it. In the remainder of the season he would hit .300 with 10 homeruns and drive in 28 runs in the 28 games he played, including a 3-for-4 performance with two homeruns and three RBI on the final day of the season. His late season surge even prompted some talk of a September call-up.

His combined line at Springfield was .243/.295/.515 while overall for the season he hit .259 with 17 doubles, 21 homeruns, and 75 RBIs in 321 at-bats and 85 games. Although he still has a long way to go, I'm sure he and probably the Cardinals viewed this season as a success.

Late in June Ankiel was asked about making the transition from pitching to the outfield.

"Not very many people have been successful at it," he replied. "To conquer that quest would be very self-fulfilling."

Well, after a 2006 season in which he was shelved all year with patellar tendonitis, it appeared that perhaps his window was slipping away. This season, however, he came back at the AAA level and belted 32 homeruns in just over 400 plate appearances before being called up on Thursday and hitting a three-run homer in last night's 5-0 Cardinals victory.

When I wrote the original article I was interested in how many players had successfully made the transition from full-time pitcher to full-time position player in the history of baseball. It turns out that only five others have ever totaled more than 50 games pitched and 50 games played at other positions in the major leagues. Click on the link above to read about their stories, including that of Rube Bressler for whom the article is titled, but suffice it to say that Ankiel appears as if he'll become the sixth. Whether he goes on to be as successful as any of the others remains to be seen and given his age and plate discipline still seems somewhat remote.

The interesting thing is that all five of the other players completed their transition before 1940. The final section of the Bressler article details an explanation of just why it is more difficult today to make such transitions than it was in the days of Rube Bressler. Simply put, the argument, first detailed by Stephen J. Gould in an essay discussing the disappearance of the .400 hitter, is that these transitions essentially ended after the war because of the increasing level of play that comes closer to the "right wall" of human ability, coupled with the stabilization of the game itself. In other words, over time baseball players, like other athletes, including sprinters and swimmers, have become better and as the level of play has increased, it has had the side effect of decreasing the variation among players. For players like Bressler and company there was therefore more opportunity to make the transition because good athletes of their ilk could more easily excel beyond the more numerous lesser athletes that populated baseball in the early part of the century.

The evidence for an increasing level of play was the topic of a column I wrote earlier this year on Baseball Prospectus titled "The Myth of the Golden Age" and in particular one line of evidence fits nicely with Ankiel and Bressler. As described in that column:

Pitchers are increasingly selected from the amateur ranks based on their extreme right-hand-tail-of-the-distribution excellence in pitching. While there is certainly some athletic and experiential crossover that allows them to hit better than the general population (as evidenced by the best players at early ages being both the best hitters and pitchers), their hitting skill is not selected for in the evolutionary sense and so should remain relatively constant over time. In other words, pitchers simply don't hit as well in the modern game, not because they are not just as skilled (or slightly more so) with the bat as their predecessors, but because the selected skills of all players have increased over time.

What this all boils down to is that by measuring the relative success of pitchers at the plate we can at the same time, at least indirectly, measure the increasing level of play. The following graph documents that relation using OPS normalized by park and breaks it up by league.

What I find fascinating about this graph is that it not only shows the increasing difficulty that pitchers have when competing against their peers from the batter's box, it simultaneously gives some information on the relative level of play amongst competing leagues. You'll notice that the American Association and the Union Association of the 1880s and 1890s record higher relative values ostensibly because the leagues were not as difficult. The same applies to the Federal League (1914-1915) and American League relative to the National League from 1901-1920 and again after integration through the early 1970s. Obviously, after the introduction of the designated hitter in 1973 the league differences can't be measured and so the graph doesn't reflect the subsequent time period. However, if one were to plot those just in the NL, the decline would continue to the point where today the relative production is well under .500.

All of this has conspired to make Rick Ankiel's story even more compelling and so I for one am going to enjoy it while it lasts.

Friday, July 20, 2007

Bunting and Izturis

My column this week extends last week's discussion of bunting for a hit by examining the strategy behind it using run expectancy. I didn't cover all the nuances I'm sure but wanted to make the article a basic introduction to using the break even formula and how that can apply to a strategy. I also answer a few reader questions along the way.
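For readers who haven't seen the idea, one common form of a break-even calculation can be sketched in a few lines of Python. The function name, the parameter names, and the run expectancy values are all my own placeholders for illustration, not figures or code from the column. The sketch simply asks how often a play must succeed for its expected runs to match those of the alternative.

```python
def break_even_rate(re_alternative, re_success, re_failure):
    """Success rate at which attempting a play neither gains nor loses expected runs.

    re_alternative: run expectancy of not attempting the play (e.g. swinging away)
    re_success:     run expectancy after the play succeeds
    re_failure:     run expectancy after the play fails
    """
    gain = re_success - re_alternative   # runs added by a success
    loss = re_alternative - re_failure   # runs given up by a failure
    return loss / (gain + loss)


# Hypothetical run expectancy values, chosen only to illustrate the arithmetic.
RE_SWING_AWAY = 0.50   # bases empty, nobody out, hitter swings away
RE_BUNT_HIT = 0.90     # runner on first, nobody out
RE_BUNT_OUT = 0.27     # bases empty, one out

rate = break_even_rate(RE_SWING_AWAY, RE_BUNT_HIT, RE_BUNT_OUT)
print(f"The bunt must succeed {rate:.1%} of the time to break even")
```

With real run expectancy tables in place of the placeholders, a hitter who succeeds more often than the break-even rate adds runs by attempting the play and one who succeeds less often costs runs.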

By the way, hate to say I told you so but in my August 10th column of last year I noted:

In picking up Izturis and his $3.2 million contract for this year, $4.25 million for next year [note: the Cubs sent $1.4m to the Pirates in the deal] and club buyout for $300,000 in 2008 (that they'll likely be exercising), the Cubs have elevated the mistake they've been making with reserves to not one but two starting positions.

The core problem is that in Izturis they now have a player who, at his best in 2004, recorded a WARP1 of 3.5. This was when his batting line exceeded his career marks by +.28/+.36/+.43 and he won a Gold Glove. His more typical seasons in 2003 and 2005 were at 2.6 and 2.0, respectively, with his offensive performances actually below the level of a replacement player. In 2005 he ranked 30th of 31 shortstops with 300 or more plate appearances in VORP at -4.2 (Cristian Guzman was 31st with a whopping -14.9)--a trend he has continued thus far in 2006.

Much of the recent hype surrounding Izturis has been built on the strength of a great April and May of 2005 when he hit .342 and earned an All-Star selection. In other words, the likely outcome of the deal is to extend the search for the next Ricky Gutierrez a year and a half while enduring Neifi-like production at shortstop to go with above average defense (The Fielding Bible had Izturis at +10, +19, and +4 in 2003-2005 ranking him 7th, 2nd, and 15th respectively and 6th overall during the time period). Of course that's perhaps even optimistic since Hendry and company may also have repeated their Garciaparra move ("fool me once, shame on me…") by obtaining a player fresh off an injury (degenerative arthritis in his right elbow that required Tommy John last September).

And of course what it also did was relegate Ronny Cedeno first to second base and then to the minor leagues where he is now hitting .369 and showing some power (.562). I'm not convinced Ryan Theriot is the real deal and would like to see Cedeno get another shot this year.

Friday, June 22, 2007

Friday Links

A few things from this week:

Datacaster - A little article on MLBAM datacaster Chris Johnson, who scores for the Dodgers and Angels.

State of the BABIP - A great overview and state of the research on batting average on balls in play by Derek Jacques.

Changing Course - Joe Sheehan takes a look at the practice of firing managers midseason. I wrote a little about this awhile back in wondering why midseason changes have declined in recent years.

Be that as it may, thirteen times, beginning with the 1932 Cubs, teams have changed managers and gone on to postseason play. The most recent was the 2004 Astros, who replaced Jimy Williams when the team stood at 44-44. Under Phil Garner they went 48-26 the rest of the way. For Williams it was the second time he had been replaced as the skipper of an eventual postseason team. The first, and the largest difference in terms of winning percentage, was the 1989 Blue Jays. After enduring a 12-24 (.333) start under Williams, General Manager Pat Gillick hired Cito Gaston on May 31st as the interim manager. That interim title was quickly forgotten as the Jays reeled off a 77-49 (.611) record with the help of acquisitions Lee Mazzilli and Mookie Wilson from the Mets, leading to a 20-9 August that saw them pull into a first place tie with the surprising Orioles as the month closed. After holding a slim lead most of the month of September, the Jays hooked up with the Orioles in a three game series at the new SkyDome on the season's final weekend with the Orioles one game back. The Blue Jays took the first two games of the series 2-1 and 4-3 to seal the deal.

Saturday, June 16, 2007

Where They Aint Redux

Responding to Tango's comments, here is the new table with a column added, totals at the bottom, and excluding bunts.


Year    Type         BIP       H  Non-HR      TB      %H %Non-HR    SLUG  %Non-HR2
2003    Fly        36744    9898    5390   27314   26.9%   14.7%   0.743     16.7%
2004    Fly        37052   10494    5786   28619   28.3%   15.6%   0.772     17.9%
2005    Fly        37268   10442    5913   28207   28.0%   15.9%   0.757     18.1%
2006    Fly        37712   10863    6034   29557   28.8%   16.0%   0.784     18.3%
----------------------------------------------------------------------------------
2003    Ground     60783   14355   14353   15687   23.6%   23.6%   0.258     23.6%
2004    Ground     60212   14267   14267   15623   23.7%   23.7%   0.259     23.7%
2005    Ground     60373   14092   14092   15388   23.3%   23.3%   0.255     23.3%
2006    Ground     59912   14367   14367   15690   24.0%   24.0%   0.262     24.0%
----------------------------------------------------------------------------------
2003    Line       25846   18985   18289   26505   73.5%   70.8%   1.025     72.7%
2004    Line       25663   18951   18208   26495   73.8%   71.0%   1.032     73.1%
2005    Line       25425   18649   18162   25240   73.3%   71.4%   0.993     72.8%
2006    Line       25902   19012   18456   26072   73.4%   71.3%   1.007     72.8%
----------------------------------------------------------------------------------
2003    Pop        10853     168     168     207    1.5%    1.5%   0.019      1.5%
2004    Pop        11007     226     226     268    2.1%    2.1%   0.024      2.1%
2005    Pop        11123     223     223     258    2.0%    2.0%   0.023      2.0%
2006    Pop        10656     238     238     309    2.2%    2.2%   0.029      2.2%
----------------------------------------------------------------------------------
2003               30639      15      14      18    0.0%    0.0%   0.001      0.0%
2004               31657       0       0       0    0.0%    0.0%   0.000      0.0%
2005               30463       1       0       4    0.0%    0.0%   0.000      0.0%
2006               31558      47      47      93    0.1%    0.1%   0.003      0.1%
----------------------------------------------------------------------------------
Totals            660848  175293  154233  281554   26.5%   23.3%   0.426     24.1%
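The post doesn't spell out how the new %Non-HR2 column is defined, but the figures are consistent with dividing non-homerun hits by balls in play after home runs have been removed from the denominator. Here is a minimal Python sketch under that assumption. The function name is mine, and home runs are inferred as H minus Non-HR.

```python
def non_hr_rate_excluding_hr(bip, hits, non_hr_hits):
    """Non-homerun hits per ball in play, with home runs removed from the denominator.

    This is an assumed definition of %Non-HR2, inferred from the table's numbers.
    """
    home_runs = hits - non_hr_hits
    return non_hr_hits / (bip - home_runs)


# (BIP, H, Non-HR) copied from three rows of the table above.
rows = {
    "2003 Fly": (36744, 9898, 5390),
    "2006 Line": (25902, 19012, 18456),
    "Totals": (660848, 175293, 154233),
}

for label, (bip, hits, non_hr_hits) in rows.items():
    rate = non_hr_rate_excluding_hr(bip, hits, non_hr_hits)
    print(f"{label:<10}{rate:.1%}")
```

The three rows come out at 16.7%, 72.8%, and 24.1%, matching the last column of the table.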

Friday, June 15, 2007

Hit 'em Where They Aint

I was running a few numbers for a colleague and so I thought I'd post them. Below is a table going back to 2003 that shows the number of balls put into play by vector and the numbers and percentages that went for hits and non-homerun hits. Finally we have the slugging percentage earned on those batted balls.


Year    Type         BIP       H  Non-HR      TB      %H %Non-HR    SLUG
2003    Fly        36870    9898    5390   27314   26.8%   14.6%   0.741
2004    Fly        37168   10494    5786   28619   28.2%   15.6%   0.770
2005    Fly        37268   10442    5913   28207   28.0%   15.9%   0.757
2006    Fly        37856   10863    6034   29557   28.7%   15.9%   0.781
------------------------------------------------------------------------
2003    Ground     62253   14979   14977   16312   24.1%   24.1%   0.262
2004    Ground     61577   14841   14841   16198   24.1%   24.1%   0.263
2005    Ground     61809   14669   14669   15965   23.7%   23.7%   0.258
2006    Ground     61131   14910   14910   16233   24.4%   24.4%   0.266
------------------------------------------------------------------------
2003    Line       25852   18986   18290   26506   73.4%   70.7%   1.025
2004    Line       25666   18951   18208   26495   73.8%   70.9%   1.032
2005    Line       25425   18649   18162   25240   73.3%   71.4%   0.993
2006    Line       25909   19013   18457   26073   73.4%   71.2%   1.006
------------------------------------------------------------------------
2003    Pop        11124     173     173     212    1.6%    1.6%   0.019
2004    Pop        11285     236     236     278    2.1%    2.1%   0.025
2005    Pop        11370     230     230     265    2.0%    2.0%   0.023
2006    Pop        10886     239     239     310    2.2%    2.2%   0.028
------------------------------------------------------------------------
2003               30639      15      14      18    0.0%    0.0%   0.001
2004               31657       0       0       0    0.0%    0.0%   0.000
2005               30463       1       0       4    0.0%    0.0%   0.000
2006               31558      47      47      93    0.1%    0.1%   0.003

So from this you can see that just under 30% of fly balls go for hits, although only about 16% go for non-homerun hits. About a quarter of ground balls and three quarters of line drives fall in, and 1 in 50 popups are hits. The rows with no type listed are not actually balls in play but rather strikeouts and unclassified plays in the data.
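As a quick check on those rules of thumb, the four seasons can be pooled by batted-ball type. This is just a sketch in Python using the BIP and H columns from the table above. The pooling across seasons and the names used are my own choices, not something the table itself reports.

```python
# (BIP, H) for 2003 through 2006, copied from the table above.
counts = {
    "Fly":    [(36870, 9898), (37168, 10494), (37268, 10442), (37856, 10863)],
    "Ground": [(62253, 14979), (61577, 14841), (61809, 14669), (61131, 14910)],
    "Line":   [(25852, 18986), (25666, 18951), (25425, 18649), (25909, 19013)],
    "Pop":    [(11124, 173), (11285, 236), (11370, 230), (10886, 239)],
}


def pooled_hit_rate(seasons):
    """Total hits divided by total balls in play across all listed seasons."""
    total_bip = sum(bip for bip, _ in seasons)
    total_hits = sum(hits for _, hits in seasons)
    return total_hits / total_bip


rates = {kind: pooled_hit_rate(seasons) for kind, seasons in counts.items()}
for kind, rate in rates.items():
    print(f"{kind:<7}{rate:.1%}")
```

Pooled this way the rates are 28.0% for fly balls, 24.1% for ground balls, 73.5% for line drives, and 2.0% for popups.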

Monday, May 28, 2007

The 100 RBI Men

A reader makes the following observation:

"Carlos Delgado is currently on pace to get 100 RBIs for the Mets this year, despite sporting an abysmal .234 BA, 306 OBA and 359 SL%. This helps to make the case that RBIs are as much a team stat as an individual one, as the fellas hitting in front of Delgado (Jose Reyes, Carlos Beltran, and David Wright) are pretty adept at getting on base, which allows Delgado to make outs 7 times out of ten and still accumulate a decent number of RBIs. Simply put, he is getting a boatload of opportunities. Delgado also is lucky enough to bat cleanup on a team that is doing very well in the standings, so the manager is unlikely to move him out of the cleanup slot anytime soon."

Well, assuming Delgado plays in 144 games this season as he did in both 2005 and 2006, he'll wind up with 91 RBI and won't break the 100 mark. As of tonight Delgado's OPS is 665; taking 2006 norms for league OPS and the park effect of Shea Stadium, that puts him on pace for a league normalized and park adjusted OPS of 92. In any case, that doesn't directly bear on the question which follows...

"So, anyways, my question: Who are the worst hitters in MLB history to get 100 RBIs in a season in MLB? And what are their stories? Were these players in similar situations to Delgado, or do they have some other tale to tell?"

To answer the first part and take a crack at the second, here are the "top" 50 players with 100 or more RBI in a single season with the lowest normalized and park adjusted OPS. There have been 1,543 player seasons of 100 or more RBI since 1901.


Name                      Year      PA       G     RBI     OPS  NOPS/PF
Joe Carter                1997     668     157     102     683       89
Vinny Castilla            1999     674     158     102     809       92
Ruben Sierra              1993     692     158     101     678       93
Tony Armas                1983     613     145     107     707       94
Paul O'Neill              2000     628     142     100     760       94
Ray Pepper                1934     598     148     101     732       94
Marv Owen                 1936     655     154     105     750       96
Joe Carter                1996     682     157     107     782       96
Glenn Wright              1927     626     143     105     716       96
Joe Carter                1990     697     162     115     681       97
Travis Fryman             1996     688     157     100     766       97
Joe Randa                 2000     665     158     106     781       97
Jeff Francoeur            2006     686     162     103     742       98
Jeff Cirillo              2000     684     157     115     869       98
Tony Batista              2004     650     157     110     728       98
Torii Hunter              2003     642     154     102     762       99
Ray Jablonski             1953     640     157     112     735       99
Joe Pepitone              1964     647     160     100     698      100
Ernie Banks               1969     629     155     106     725      100
Carlos Beltran            1999     723     156     108     791      100
Bill Buckner              1986     681     153     102     733      100
Bill Brubaker             1936     620     145     102     736      101
George Bell               1992     670     155     112     712      101
Garret Anderson           2001     704     161     123     792      101
Travis Fryman             1997     657     154     102     766      101
Andres Galarraga          1995     604     143     106     842      101
Chili Davis               1993     645     153     112     767      101
Eddie Robinson            1953     685     156     102     735      101
Wally Pipp                1923     634     144     108     749      101
Willie McGee              1987     652     153     105     746      101
Bing Miller               1930     654     154     100     795      101
Pinky Higgins             1938     603     139     106     794      101
Butch Hobson              1977     637     159     112     789      101
Andruw Jones              2001     693     161     104     772      101
George Kelly              1929     632     147     103     760      101
Gee Walker                1939     645     149     111     773      101
Moose Solters             1936     676     152     134     802      101
Ed Sprague                1996     670     159     101     821      101
Al Simmons                1924     644     152     102     774      102
Ruben Sierra              1987     696     158     109     771      102
Billy Rogell              1934     679     154     100     766      102
Richie Sexson             1999     525     134     116     818      102
Vernon Wells              2002     648     159     100     762      102
Pinky Whitney             1930     662     149     117     849      102
Pinky Whitney             1928     636     151     103     768      102
Matt Williams             1997     636     151     105     795      102
Glenn Wright              1924     662     153     111     744      102

I know many of you had an inkling that Joe Carter would take the top spot. He also appears at numbers 8 and 10 (and number 53 for his 1987, 104 for his 1989 season, 110 for his 1993 season, and number 146 for his 1994 season...you get the idea). But given the poor light in which RBI have been cast in recent years, perhaps surprisingly only 17 times (a little over 1%) has a player ever driven in 100 runs while not being at least league average. So while getting to 100 RBI doesn't ensure that the hitter is an elite offensive performer, it is a pretty good proxy. Put another way, one needn't be a great hitter to accrue 100 RBI but great hitters often get to 100 RBI. And so in the absence of better metrics it's not surprising that 100 RBI became shorthand for a great offensive performance. This is illustrated by the fact that the "average" 100 RBI man had a NOPS/PF of 125 and by the histogram below, which shows their distribution:

Contributing to the idea that RBI equals greatness is the ongoing debate over the significance and prevalence of clutch hitting. A player with a lot of RBI is often automatically assumed to be a clutch performer, as Joe Carter was.

That said, given that we now have much more granular means (with OPS actually being on the lower end) of estimating the run contribution of individual hitters, that usage should wane some although it may take generational turnover to bring about its demise. For a little deeper perspective on traditional and more modern methods of gauging a player's contribution see chapter 1 of Baseball Between the Numbers.

But in getting back to the question at hand, perusing the list you see a few factors that certainly play into reaching the century mark:

Performance - As mentioned above there is no doubt that in large part getting to 100 RBI requires a strong performance. From the graph above (the pink cumulative line that uses the y-axis on the right) you can see that fully 75% of those who have driven in 100 runs were 15% or more above league average and 60% were 25% or more above average.

Park - Vinny Castilla and Jeff Cirillo in the top 20 show that playing in a park where lots of runs are scored certainly helps, and of course by adjusting for park we don't give them any benefit.

Era - Twelve of the top 20 players played either in the 1930s or since 1993, the two highest-scoring eras in modern baseball history. Just as playing in a park where runs are more plentiful allows lesser hitters to drive in more runs, playing in an expanding offensive environment devalues the 100-RBI mark.

Teammates - Certainly the reader makes a good point about teammates having to be on the bases. You could probably make a case that Paul O'Neill in 2000 with the Yankees, Marv Owen for the Tigers in 1936, Glenn Wright with the Pirates in 1927, and Bill Buckner with the 1986 Red Sox all fall into this category, where the individual was part of a strong offensive team from top to bottom.

Lineup Position - It probably comes as no surprise that many of the players on this list, and those who accrue 100 RBI generally, are middle-of-the-order hitters. It probably comes as a bit more of a surprise that, as shown in the graph below, the number three position in the order actually hits with relatively fewer runners on base than does any other lineup position save the leadoff and second spots.

Plate Appearances - More generally, the latter two contributing factors as well as this one fall into the category of opportunity. A player has to come to the plate often enough to reach the 100 RBI mark. Probably nothing in this list better exemplifies these factors than Ruben Sierra's 1993 performance with the Oakland A's. In that season Rickey Henderson hit leadoff for half the year, and Sierra hit third in an AL lineup, which increases the opportunities for the third hitter, and racked up almost 700 plate appearances. On average, the 100 RBI men had 651 plate appearances. Yes, Rudy York did drive in 103 runs in just 417 plate appearances for the 1937 Tigers, but all told just 213 players (14%) have ever driven in 100 runs while not coming to the plate at least 600 times.
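To put the opportunity point in concrete terms, here is a minimal sketch of the arithmetic using only the figures quoted above (York's 103 RBI in 417 plate appearances and the 651 plate appearance average). The function names are my own, chosen for illustration, and nothing here comes from the database behind the list.

```python
def rbi_rate(rbi: int, plate_appearances: int) -> float:
    """RBI per plate appearance actually achieved in a season."""
    if plate_appearances <= 0:
        raise ValueError("plate_appearances must be positive")
    return rbi / plate_appearances


def rate_needed(target_rbi: int, plate_appearances: int) -> float:
    """RBI per plate appearance required to reach target_rbi."""
    if plate_appearances <= 0:
        raise ValueError("plate_appearances must be positive")
    return target_rbi / plate_appearances


# Figures quoted in the post.
york_1937 = rbi_rate(103, 417)         # Rudy York, 1937 Tigers
typical = rate_needed(100, 651)        # average PA of the 100 RBI men
at_600 = rate_needed(100, 600)         # the 600 PA threshold

print(f"York 1937:          {york_1937:.3f} RBI per PA")
print(f"100 RBI in 651 PA:  {typical:.3f} RBI per PA")
print(f"100 RBI in 600 PA:  {at_600:.3f} RBI per PA")
```

York's rate works out to roughly one and a half times what a hitter with the average number of plate appearances needs, which is why seasons like his are so rare.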

So will Carlos Delgado get to 100 RBI, and if so what does it mean? Speaking only in generalities and knowing only that one fact about his performance, we'd have to guess that he was a pretty good hitter. However, that doesn't completely rule out the possibility that his park, era, teammates, lineup position, and playing time all conspired to push him past the 100 RBI barrier.

Wednesday, April 11, 2007

Assignment Discovery: Sabermetrics

I was alerted by a fellow Cubs fan that the program "Statistics and Data Analysis in Sports" will be airing on the Discovery Channel on April 17th. The description of the show on their web site says:

Using only a calculator, a stat book, and some custom equations, a new generation of baseball statisticians believes that it's possible to predict a player's true value to his team. The results will surprise you.

It'll be interesting to see if they're really talking about "prediction" or simply quantification after the fact. The former has its limits while the latter is very well understood. I'm also interested in these types of presentations since they often misrepresent and distort subjects that are somewhat technical. I wrote about two depictions of sabermetrics back in November in a column titled "The Numb3rs Game" on BP.

Saturday, January 20, 2007

The Power of Squares

Nice article by Dave Studeman over at Baseball Analysts on Pythagoras, run estimation and Bill James. I especially liked the following:

"The power of two is everywhere in life. E=MC squared, after all. When you move closer to a light, cutting the distance in half, the light doesn't become twice as bright...So when Bill James discovered that the nature of runs to winning is squared, it seemed as though something essential and fundamental had been discovered."

Another example of this phenomenon is the inverse-square law of gravitation, which Newton published in his Principia but which was first hinted at by Ismael Bullialdus and known (or guessed at) in some form to the likes of Christopher Wren, Edmond Halley, and Robert Hooke, as told in James Gleick's wonderful biography of Isaac Newton titled Isaac Newton.
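For readers who haven't seen it written out, the squared relationship between runs and winning is Bill James's Pythagorean expectation: winning percentage is approximately runs scored squared divided by the sum of runs scored squared and runs allowed squared. Here is a minimal sketch. The function name and the 800/700 run totals are hypothetical, chosen only for illustration, and the exponent is left as a parameter since 2 is the classic value rather than the only one in use.

```python
def pythagorean_win_pct(runs_scored: float, runs_allowed: float,
                        exponent: float = 2.0) -> float:
    """Estimated winning percentage from runs scored and runs allowed."""
    if runs_scored < 0 or runs_allowed < 0:
        raise ValueError("run totals cannot be negative")
    if runs_scored == 0 and runs_allowed == 0:
        raise ValueError("at least one run total must be positive")
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Hypothetical team: 800 runs scored, 700 allowed over a 162-game season.
pct = pythagorean_win_pct(800, 700)
print(f"Expected winning percentage: {pct:.3f}")
print(f"Expected wins in 162 games:  {pct * 162:.1f}")
```

A team that scores exactly as many runs as it allows comes out at .500, as you'd hope.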

For more thoughts on run estimation see:

Run Estimation for the Masses
A Closer Look at Run Estimation

