Friday, February 29, 2008

Chat Transcript 2/29

Thanks to everyone who spent a few minutes chatting this morning. Some really great questions (if a little Yankee heavy) and believe me, if other things weren't pressing I would have preferred to go on for most of the afternoon.

Wednesday, February 27, 2008

Chat 2/29

Just a quick note that I'll be chatting at Baseball Prospectus.com on Friday February 29th at 11 AM Mountain time. I've never chatted on "leap day" before so this should be fun :)

Seriously, if you have questions around my essays in Baseball Prospectus 2008 on baserunning and throwing arms or anything else that interests you, please submit them early as that always provides you with better answers.

Monday, February 25, 2008

WWBD?

In a new column on Baseball Prospectus I ask the question on everyone's mind, "What Would Bacon Do?" (WWBD?).

No, not Kevin Bacon but Sir Francis Bacon, the English philosopher and statesman. The question, of course, is what Bacon would think of Derek Jeter's defensive ability and the recent hubbub surrounding it. Using Bacon's new-fangled inductive method of reasoning and his "idols" construct for categorizing the frailties of human reasoning, we come to a conclusion regarding The Captain. Enjoy.

Sunday, February 24, 2008

Baserunning for the Ages


I've been writing mostly about defense over at Baseball Prospectus and here in the past couple of months and so I thought I'd revisit the topic of baserunning today.

Although in my column I talked about the greatest baserunners of the Retrosheet era and here have extolled the virtues of Tim Raines, Robin Yount and others, I noticed that I've failed in either venue to list the top and bottom single seasons of that era.

So today here are the top and bottom 50 baserunning seasons in terms of total Equivalent Baserunning Runs (EqBRR). And don't forget that an essay describing the system and its metrics, titled "The Tortoise, the Hare, and Juan Pierre", appears in Baseball Prospectus 2008, which, I'm told, is shipping now. Also, I'll be talking about the book with Christina Kahrl and Nate Silver Wednesday, March 5th at 7:30 pm at the Tattered Cover Bookstore in Historic LoDo in Denver.

First, the top 50 seasons.


Name Year EqGAR EqSBR EqAAR EqHAR EqOAR TotOpps EqBRR EqBRRate
Maury Wills 1962 0.8 13.8 2.3 3.6 0.1 848 20.6 1.26
Willie Wilson 1980 0.5 10.0 4.2 3.9 0.5 741 19.1 1.35
Rickey Henderson 1985 2.6 9.8 1.0 3.1 2.3 663 18.8 1.43
Vince Coleman 1986 0.0 11.0 0.3 0.9 4.2 627 16.4 1.26
Willie Wilson 1979 2.1 9.2 0.7 2.8 0.9 563 15.7 1.33
Eric Davis 1986 -0.3 10.2 0.5 1.7 3.2 403 15.3 1.55
Rickey Henderson 1988 2.3 9.0 1.3 -2.0 3.8 650 14.4 1.27
Vince Coleman 1987 2.5 7.1 1.2 2.4 0.8 723 14.1 1.28
Tim Raines Sr 1983 0.9 7.6 1.8 2.7 0.6 762 13.6 1.22
Bobby Bonds 1972 2.0 6.0 1.6 3.6 0.5 517 13.6 1.42
Kenny Lofton 1993 1.5 8.1 -0.1 4.3 -0.5 717 13.3 1.17
Lonnie Smith 1985 2.8 1.4 2.5 2.9 3.1 460 12.8 1.85
Willie Wilson 1987 0.4 7.1 0.6 3.9 0.7 581 12.6 1.24
Juan Pierre 2007 0.3 4.2 1.9 5.0 0.6 666 12.0 1.40
Davey Lopes 1978 0.6 5.8 1.9 1.4 2.1 580 11.8 1.24
Vince Coleman 1985 1.2 10.7 0.7 0.6 -1.4 665 11.8 1.06
Rickey Henderson 1983 0.9 7.8 1.3 0.1 1.6 609 11.7 1.17
Gary Pettis 1985 2.1 7.4 0.7 0.2 1.3 470 11.7 1.31
Johnny Damon 2000 3.1 2.7 -0.7 4.2 2.2 743 11.5 1.33
Ron LeFlore 1979 2.3 5.6 1.3 1.9 0.4 602 11.5 1.30
Davey Lopes 1975 0.4 8.6 -1.3 2.2 1.4 685 11.4 1.14
Barry Larkin 1995 1.3 5.5 1.2 1.3 1.9 526 11.2 1.31
Tim Raines Sr 1985 1.6 9.4 0.8 -0.6 0.0 726 11.2 1.08
Willie Wilson 1983 1.9 7.0 -0.6 1.4 1.5 532 11.2 1.24
Ivan DeJesus 1978 1.0 2.2 1.7 4.5 1.7 625 11.1 1.38
Hanley Ramirez 2006 3.1 1.7 1.3 3.5 1.5 576 11.0 1.50
Joe Morgan 1975 0.8 6.3 0.9 2.0 0.9 618 10.8 1.27
Ron LeFlore 1980 0.1 11.0 0.3 -0.1 -0.5 611 10.7 0.99
Tim Raines Sr 1987 -0.5 7.2 0.3 2.5 0.9 567 10.4 1.17
Ray Durham 1998 1.2 2.0 1.8 3.6 1.8 611 10.4 1.37
Juan Pierre 2005 3.6 1.7 -0.3 2.5 2.9 665 10.4 1.43
Chone Figgins 2005 4.4 0.7 1.9 2.5 0.8 619 10.3 1.43
Marquis Grissom 1992 -0.3 6.5 2.1 0.1 1.8 567 10.2 1.19
Rickey Henderson 1984 0.9 2.4 2.1 3.9 0.9 599 10.2 1.31
Tim Raines Sr 1982 3.3 5.7 -0.2 1.2 0.1 626 10.1 1.25
Tim Raines Sr 1992 0.3 5.3 1.5 3.0 0.1 669 10.1 1.24
Al Wiggins 1983 0.3 4.1 0.9 0.4 4.4 582 10.0 1.29
Bert Campaneris 1969 1.3 5.8 -0.5 2.5 0.8 565 9.9 1.29
Kenny Lofton 1994 1.7 4.5 -1.1 3.4 1.4 574 9.9 1.23
Ron LeFlore 1978 2.3 5.2 0.6 2.3 -0.6 720 9.9 1.20
Chone Figgins 2006 2.1 0.8 1.6 4.9 0.4 577 9.9 1.48
Omar Moreno 1979 1.5 3.8 0.7 3.0 0.8 619 9.8 1.29
Scott Podsednik 2003 1.8 4.6 1.0 3.2 -0.6 542 9.8 1.34
Joe Morgan 1976 0.1 5.5 0.4 2.7 1.0 504 9.7 1.28
Tom Goodwin 2000 1.0 3.9 1.3 2.9 0.5 445 9.7 1.37
Milt Cuyler 1991 0.9 3.1 2.0 0.8 2.8 438 9.7 1.47
Davey Lopes 1976 0.6 7.3 0.5 1.7 -0.5 508 9.7 1.12
Willie Wilson 1984 1.7 5.1 2.1 0.8 -0.1 530 9.7 1.24
Tony Gwynn 1987 0.3 2.0 1.4 3.5 2.3 682 9.6 1.36
Luis Castillo 2007 1.0 0.6 1.1 3.1 3.8 528 9.6 1.51


As you can probably tell, most of these top 50 are driven by high EqSBR scores, with Maury Wills' 1962 season at the top of the heap. There were a few players, though, like Lonnie Smith in 1985, who excelled despite pedestrian work stealing bases. Related to that point, you'll also notice that in the final column we have a rate statistic titled EqBRRate. I explained the idea behind it in the column cited above.

...it turns out that when many folks talk about baserunning, they're not really thinking about discretionary stolen base attempts, but instead that combination of speed, risk taking, and judgment that goes into evaluating situations, and that when aggregated leads us to assert that a particular player is a good baserunner. Since EqSBR is primarily comprised of stolen base attempts (with a few pickoffs thrown in) our new rate statistic, which we'll christen Equivalent Baserunning Rate (EqBRRate), will omit this aspect.

So...we'll define EqBRRate as the ratio of actual or total runs to expected runs contributed across the four remaining metrics. Since both values for individual opportunities consider the context and are weighted appropriately (an EqOAR opportunity has both a lower expected and usually actual run value associated with it than an opportunity of EqHAR) and since we're eliminating EqSBR, both weaknesses discussed above are addressed. To illustrate how this works, let's consider Chone Figgins' 2007 season:


Metric Opps TotRuns ExpRuns
EqOAR 314 0.1 1.2
EqGAR 31 3.2 2.6
EqAAR 44 5.7 4.7
EqHAR 53 10.1 5.6
Total 442 19.0 14.2


Taking the context of the opportunities into account, we would have expected Figgins to net +14.2 runs across the four metrics, but he actually contributed +19.0 runs. When we divide the total by the expected we get a ratio of 1.34, indicating that he contributed, or manufactured if you will, 34 percent more runs than expected.
So as you can see, Smith's 1985 season actually comes out on top with an EqBRRate of 1.85, easily topping the 1986 season of Eric Davis at 1.55. The lowest rate out of the top 50 is Ron LeFlore's 1980 season, where he was actually below 1.00 at 0.99 but made the top 50 by virtue of his 11.0 EqSBR.
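For readers who want to check the arithmetic, here is a minimal Python sketch of the EqBRRate calculation using the Figgins numbers from the small table above. The function and variable names are mine, chosen for illustration; they are not part of the actual system.

```python
# Figgins' 2007 rows from the table above:
# (metric, opportunities, total runs, expected runs).
FIGGINS_2007 = [
    ("EqOAR", 314, 0.1, 1.2),
    ("EqGAR", 31, 3.2, 2.6),
    ("EqAAR", 44, 5.7, 4.7),
    ("EqHAR", 53, 10.1, 5.6),
]


def eq_br_rate(total_runs: float, expected_runs: float) -> float:
    """Ratio of total (actual) runs to expected runs across the four
    non-stolen-base metrics. The function name is illustrative."""
    if expected_runs == 0:
        raise ValueError("expected runs must be non-zero")
    return total_runs / expected_runs


total_opps = sum(row[1] for row in FIGGINS_2007)

# Published totals from the last row of the table. Summing the rounded
# rows gives 19.1 and 14.1 instead, so the published totals are used here.
figgins_rate = eq_br_rate(19.0, 14.2)

print(total_opps, round(figgins_rate, 2))
```

Run as written, this reproduces the 442 opportunities and the 1.34 rate quoted above.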

And now the bottom 50...


Name Year EqGAR EqSBR EqAAR EqHAR EqOAR TotOpps EqBRR EqBRRate
Carlos Delgado 2007 -0.4 -0.2 -2.9 -3.6 -0.4 394 -7.5 0.41
Jose Offerman 1992 0.5 -4.9 -1.3 -1.1 -0.7 505 -7.5 0.86
Eddie Murray 1990 -1.2 -4.2 0.2 -3.5 1.1 521 -7.5 0.82
Scott Cooper 1994 0.1 -2.3 -0.4 -5.2 0.3 306 -7.5 0.45
Rick Wilkins 1996 -0.4 -1.7 -1.3 -4.3 0.2 264 -7.6 0.41
Pat Burrell 2005 -0.5 -0.2 -0.7 -5.6 -0.7 368 -7.6 0.42
Randy Milligan 1991 -0.5 -3.7 -0.5 -2.1 -0.8 414 -7.6 0.67
Manny Trillo 1982 -0.1 -4.1 -1.9 -1.9 0.3 430 -7.6 0.72
Larry Sheets 1988 -0.7 -4.3 -0.8 -1.8 -0.2 286 -7.7 0.58
Bill Freehan 1971 -1.1 -3.1 -1.8 -1.7 -0.1 331 -7.8 0.43
Dave Parker 1977 -0.4 -6.4 -0.9 0.7 -0.8 531 -7.8 0.93
Jorge Posada 2007 -1.3 -0.1 -0.7 -4.2 -1.6 507 -7.8 0.59
Ted Simmons 1975 -1.5 -1.7 0.3 -4.8 -0.2 573 -7.9 0.72
Ed Bailey 1963 0.0 -3.9 -0.4 -3.3 -0.4 231 -7.9 0.35
Gene Richards 1982 -1.3 -4.8 0.0 -2.0 0.1 455 -8.0 0.81
Jim Thome 2004 -0.5 -1.4 -3.2 -3.2 0.3 436 -8.0 0.59
David Justice 1997 -0.7 -3.3 -0.3 -3.0 -0.7 398 -8.0 0.61
Dan Meyer 1977 -0.9 -2.0 -1.3 -4.6 0.7 409 -8.1 0.60
Jim Norris 1977 -0.1 -3.8 0.2 -2.9 -1.5 447 -8.1 0.73
Felix Jose 1991 -1.8 -3.5 -2.5 0.7 -1.0 482 -8.1 0.62
Lance Parrish 1979 0.0 -3.7 -2.6 -1.5 -0.4 361 -8.1 0.62
Steve Garvey 1980 -0.4 -5.5 -0.9 -0.8 -0.6 467 -8.1 0.78
Steve Henderson 1983 0.2 -7.5 0.6 -0.7 -0.8 362 -8.2 0.93
Carlos Delgado 2000 -0.6 -1.4 -1.4 -3.6 -1.1 618 -8.2 0.58
Al Oliver 1977 -0.1 -7.0 -0.2 -0.1 -0.8 474 -8.2 0.88
Tom Foley 1988 -0.1 -4.4 -0.4 -1.6 -1.7 279 -8.3 0.53
Dale Berra 1984 1.2 -2.8 -0.9 -5.2 -0.7 297 -8.4 0.57
Tommy McCraw 1972 -0.5 -5.9 -1.5 -0.5 0.0 299 -8.4 0.76
Joe Cunningham 1959 -0.8 -2.3 0.4 -5.3 -0.4 504 -8.5 0.60
Pete O'Brien 1984 -0.4 -3.1 -1.4 -2.9 -0.6 428 -8.5 0.52
Kenny Lofton 1997 0.4 -9.4 -0.1 0.6 0.0 539 -8.5 1.06
Frank Thomas 2000 -1.0 -1.6 -1.6 -3.4 -1.0 535 -8.6 0.47
Harold Reynolds 1988 -0.7 -7.8 1.0 -1.2 0.0 535 -8.7 0.95
Eddie Yost 1957 0.3 -5.4 -0.9 -2.8 0.0 394 -8.7 0.62
Joe Morgan 1980 -0.2 -2.0 -0.3 -6.5 0.3 434 -8.8 0.55
Clete Boyer 1970 0.6 -1.6 -0.2 -6.9 -0.8 307 -8.8 0.21
Ed Kranepool 1967 -0.8 -1.9 -0.1 -5.7 -0.4 323 -8.8 0.39
Mike Hargrove 1979 -1.8 -2.3 -0.2 -4.2 -0.5 522 -8.8 0.69
Jose Cruz 1977 -0.1 -8.3 0.5 -1.3 0.3 483 -8.9 0.95
Ernie Whitt 1989 -1.4 -1.0 -1.7 -4.0 -0.9 313 -9.0 0.34
Harmon Killebrew 1970 -0.8 -2.1 -0.7 -4.6 -1.0 435 -9.0 0.56
Tom Pagnozzi 1991 -1.5 -4.8 -0.7 -1.5 -0.8 391 -9.3 0.61
Bob Kearney 1984 -0.1 -2.6 -2.0 -3.8 -0.9 290 -9.4 0.29
Ryan Garko 2007 -1.0 -0.4 0.2 -7.7 -0.5 403 -9.4 0.32
Rod Carew 1982 0.1 -5.5 -0.2 -3.2 -0.7 558 -9.5 0.71
Tom Brunansky 1992 -0.6 -3.1 -1.8 -3.4 -0.7 373 -9.6 0.43
Todd Zeile 1997 -0.5 -4.8 0.0 -4.5 -0.4 484 -10.2 0.64
Mike Epstein 1971 -0.2 -2.5 -2.4 -4.4 -1.1 311 -10.6 0.38
Todd Zeile 1992 -0.7 -4.0 -1.1 -5.4 0.0 425 -11.2 0.52
Tony Bernazard 1984 -0.1 -6.9 -0.3 -3.4 -0.6 364 -11.2 0.62


There are some real "ice wagons" here (dead ball era parlance for poor baserunners), of course, although many of these are here because they did so poorly in EqSBR (Tony Bernazard, Jose Cruz, and Kenny Lofton are three notable examples). The Indians' Ryan Garko makes the list with his 2007 campaign primarily because he was thrown out trying to advance on hits five times - all at the plate - on his way to an historic -7.7 runs on EqHAR.

From a pure baserunning perspective Clete Boyer with his 1970 campaign ranks dead last at 0.21 with Bob Kearney in 1984 at 0.29 and Garko close behind. Lofton's 1997 season is the only one with a rate of 1.00 or higher at 1.06 but his EqSBR of -9.4 sinks him.

Thursday, February 21, 2008

Maybe He is but Maybin He Isn't?

Today in my column on Baseball Prospectus I take a look at minor league outfielders using the SFR system. As I've done previously I take a look at the leaders and trailers at all levels before aggregating across all positions and leagues to list the best and worst defenders in the outfield among minor leaguers in 2007.

In addition, I examine Baseball America's top outfield and infield defenders published in their Baseball America Prospect Handbook for 2008 and compare them to what SFR thinks. From an overall perspective the subjective and objective systems largely agree although Marlins top prospect Cameron Maybin is one of the few players on which the two approaches seem to disagree violently. Maybin fared poorly across all hit types in his 70 games in centerfield while at Lakeland and overall including his time in the Eastern League ended up at -15.4 runs on 421 balls fielded. It certainly is the case that the smaller sample sizes in the minor league seasons tend to make the results less reliable than for major leaguers but still, I was somewhat surprised given the glowing assessments by so many. If there are any fans out there who've watched him play in person I'd be interested in hearing what you think.

And as always a spreadsheet is now available with all 4000+ player, position, and league combinations.

Tuesday, February 19, 2008

Baseball Research Journal vol 36

As talked about in a previous post the Baseball Research Journal volume 36 containing the article by Neal Williams and myself on quantifying third base coaches is now available.

One of the articles you may want to check out is "An Analysis of the Gyroball" by Alan Nathan and David Baldwin wherein they do an analysis similar to the one I did back in July in my Schrodinger's Bat column (no subscription required).

For those who are not SABR members copies of the Baseball Research Journal 2007 can be ordered from the University of Nebraska Press, 1-800-755-1105 (U.S. Orders), or http://www.nebraskapress.unl.edu.

Monday, February 18, 2008

The Captain

What everyone knew already but that is now gaining a little mainstream press.

Derek Jeter is not a good defensive shortstop.

The above revelation is a part of the presentation that Shane Jensen and colleagues gave in discussing his Spatial Aggregate Fielding Evaluation system or SAFE at a recent AAAS meeting in Boston.

Of course, as is typical, the popular press (the Popular Science article) gets the details wrong and attributes Defensive Efficiency Ratio (DER) to David Pinto instead of his system called Probabilistic Model of Range (PMR).

Thursday, February 14, 2008

Defending in the Wide Open Spaces

Today in my column on Baseball Prospectus I officially roll out version 1.0 of Simple Fielding Runs (a defensive system based only on Retrosheet-style play by play codes) for the outfield after doing the same for infielders back in January.

For those who read the column a couple weeks ago where I re-did the methodology to follow a more WOWY (with-or-without-you) approach, nothing has really changed with the algorithm, but this time around I break outfield SFR into its underlying components by hit type, develop a rate statistic, take a look at how SFR compares to the Plus/Minus system at the level of teams, incorporate the throwing metric that is discussed in an essay in Baseball Prospectus 2008, and finally create some plots for 2007 outfielders that juxtapose their general defense as rated by SFR and their throwing ability, both using rate statistics.

As a preview of this latter point consider the following plot for 2007 center fielders where SFR rate (per 650 balls fielded) is shown on the y-axis and throwing rate (per 550 opportunities) is shown on the x-axis. Each outfielder is then placed in one of four quadrants. Those in the upper right quadrant are both good fielders and throwers (Alfredo Amezaga), moving clockwise we see good throwers but poor fielders (Bill Hall, Elijah Dukes), poor throwers and poor fielders (Juan Pierre), and finally ending with good fielders but poor throwers in the upper left (Johnny Damon, Nook Logan).
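The quadrant placement described above can be sketched in a few lines of Python. This assumes zero is the dividing line on both axes, which the text implies but does not state, and the sample values are hypothetical placeholders rather than any player's actual rates.

```python
def quadrant(sfr_rate: float, throwing_rate: float) -> str:
    """Place an outfielder in one of the four quadrants of the plot.

    sfr_rate is the y-axis value and throwing_rate the x-axis value.
    Assumption: zero separates "good" from "poor" on both axes.
    """
    fielding = "good fielder" if sfr_rate >= 0 else "poor fielder"
    throwing = "good thrower" if throwing_rate >= 0 else "poor thrower"
    return f"{fielding}, {throwing}"


# Hypothetical (sfr_rate, throwing_rate) pairs, one per quadrant,
# listed clockwise starting from the upper right.
examples = {
    "upper right": (8.0, 3.0),
    "lower right": (-8.0, 3.0),
    "lower left": (-8.0, -3.0),
    "upper left": (8.0, -3.0),
}

for position, (sfr, throw) in examples.items():
    print(position, "->", quadrant(sfr, throw))
```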



The full article contains the plots for all positions.

Oh, and all the major league data from 2003 through 2007 is available in spreadsheet form (and that includes the throwing data).

Sunday, February 10, 2008

Tigers on the Run

A kind soul points out to me that Lee Panas over at Tiger Tales takes a look at the 2007 Tigers in terms of their baserunning. To augment his post I thought I'd offer the entire Tigers team in terms of the five baserunning metrics I created.

                     Opps  EqGAR   Opps  EqSBR   Opps  EqAAR   Opps  EqHAR   Opps  EqOAR   Opps  EqBRR
Curtis Granderson 43 0.6 28 4.1 60 -0.1 74 2.2 426 -0.6 631 6.2
Gary Sheffield 16 -0.2 28 0.7 41 0.6 44 2.3 249 1.5 378 4.9
Ryan Raburn 6 0.1 2 0.3 4 0.4 17 0.6 79 0.7 108 2.1
Omar Infante 11 0.2 5 0.3 13 -0.2 14 0.1 107 0.6 150 1.0
Cameron Maybin 2 -0.2 5 0.8 1 0.0 3 0.4 26 -0.1 37 1.0
Marcus Thames 5 0.0 3 -0.2 11 0.1 15 1.2 97 -0.4 131 0.7
Placido Polanco 25 -0.3 10 0.4 49 -1.9 67 1.3 423 1.3 574 0.6
Timoniel Perez 9 0.1 2 -0.1 5 0.1 10 0.2 58 0.1 84 0.3
Brandon Inge 24 0.7 11 0.4 35 0.1 37 -1.1 257 0.1 364 0.2
Ramon Santiago 4 0.4 4 0.3 5 0.1 8 -0.6 47 -0.2 68 0.0
Mike Maroth 0 0.0 0 0.0 0 0.0 0 0.0 3 0.0 3 0.0
Jeremy Bonderman 0 0.0 0 0.0 0 0.0 0 0.0 6 0.0 6 0.0
Brent Clevlen 0 0.0 0 0.0 0 0.0 0 0.0 7 0.0 7 0.0
Mike Hessman 2 0.0 0 0.0 1 0.0 4 -0.1 20 -0.1 27 -0.2
Neifi Perez 1 0.0 0 0.0 5 0.0 3 -0.1 30 -0.1 39 -0.2
Carlos Guillen 31 -0.5 22 -1.5 42 0.4 40 1.6 285 -0.4 420 -0.5
Craig Monroe 15 -0.6 3 -1.1 11 0.0 30 0.4 115 -0.4 174 -1.6
Magglio Ordonez 39 0.5 4 0.0 49 -1.1 58 -0.8 409 -0.8 559 -2.2
Sean Casey 19 -0.2 4 -0.7 34 0.2 24 -1.5 240 -0.1 321 -2.2
Ivan Rodriguez 23 0.1 4 -1.0 33 0.1 38 -0.9 245 -0.6 343 -2.3
Mike Rabelo 10 -0.5 1 -0.4 9 -0.1 9 -1.0 92 -0.3 121 -2.3

285 0.1 136 2.1 408 -1.2 495 4.1 3221 0.3 4545 5.4


Keep in mind that the data Lee shows is really a subset of the EqHAR (Equivalent Hit Advancement Runs) shown here, but as Lee points out, Gary Sheffield did well in 2007, as did Guillen, Polanco, and Monroe.
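As a quick way to read the Tigers table, the final Opps and EqBRR columns appear to be the sums of the five component columns. That is my inference from the rows themselves, not a stated definition, and because the components are rounded to one decimal some rows will differ from the published total by a tenth or two. A small Python sketch, with names chosen for illustration:

```python
from typing import List, NamedTuple, Tuple


class Metric(NamedTuple):
    opps: int
    runs: float


# Two rows copied from the Tigers table, in column order
# EqGAR, EqSBR, EqAAR, EqHAR, EqOAR.
TIGERS = {
    "Curtis Granderson": [
        Metric(43, 0.6), Metric(28, 4.1), Metric(60, -0.1),
        Metric(74, 2.2), Metric(426, -0.6),
    ],
    "Gary Sheffield": [
        Metric(16, -0.2), Metric(28, 0.7), Metric(41, 0.6),
        Metric(44, 2.3), Metric(249, 1.5),
    ],
}


def eq_brr(metrics: List[Metric]) -> Tuple[int, float]:
    """Return (total opportunities, total runs) across the five metrics.

    Assumption: EqBRR is the plain sum of the components."""
    total_opps = sum(m.opps for m in metrics)
    total_runs = round(sum(m.runs for m in metrics), 1)
    return total_opps, total_runs


for name, metrics in TIGERS.items():
    print(name, eq_brr(metrics))
```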

Statistical Profiling?

As always Alan Schwarz has an interesting piece in the New York Times, this time around on the topic of using statistics as a benchmark for increased testing for performance enhancing substances. The idea, floated by Representative Mark Souder, an Indiana Republican, is that by comparing actual statistical performance to the player's history and performance projected on the basis of an "average" or typical career path, major league baseball would flag certain players as more likely to be users of performance enhancing drugs. Those players would be tested more frequently or more closely I assume until they passed some criteria where their new performance level is accepted as legitimate.

Schwarz then goes through the litany of reasons why such "statistical profiling" would likely be futile ranging from the inherent variability in career path for any particular player, to the problem of what one would measure to try and catch such anomalies, to the fact that the current evidence stemming from the Mitchell report is inconclusive at best.

In the piece, however, he points to the similarities in the careers of Hank Aaron and Barry Bonds as evidence for why the first point above makes it unrealistic to use career paths as a measure.

Using my unsophisticated projection system for projecting Normalized OPS (a park adjusted and league adjusted OPS taking into account a three year weighted average regressed to the mean, age and league adjusted), here are the two careers mentioned above.
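To give a feel for what "a three year weighted average regressed to the mean" looks like in practice, here is a rough Python sketch. To be clear, the weights, the amount of regression, the single age multiplier, and the scale on which 100 is league average are all assumptions made for illustration; they are not the values the actual system uses.

```python
from typing import Sequence, Tuple


def project_nops(
    seasons: Sequence[Tuple[float, int]],
    weights: Sequence[float] = (3.0, 2.0, 1.0),  # assumed weights
    league_mean: float = 100.0,                  # assumed scale
    regression_pa: float = 600.0,                # assumed regression amount
    age_factor: float = 1.0,                     # assumed age adjustment
) -> float:
    """Project next season's Normalized OPS.

    seasons holds (normalized_ops, plate_appearances) pairs, most recent
    season first. Each season is weighted by recency and playing time,
    then regression_pa plate appearances of league-average performance
    are blended in, and an age multiplier is applied at the end.
    """
    if len(seasons) != len(weights):
        raise ValueError("need exactly one weight per season")
    weighted_pa = [w * pa for w, (_, pa) in zip(weights, seasons)]
    weighted_sum = sum(
        wpa * nops for wpa, (nops, _) in zip(weighted_pa, seasons)
    )
    regressed = (weighted_sum + regression_pa * league_mean) / (
        sum(weighted_pa) + regression_pa
    )
    return regressed * age_factor


# Hypothetical three-season history for a consistently above-average hitter.
projection = project_nops([(120.0, 600), (120.0, 600), (120.0, 600)])
print(round(projection, 1))
```

The point of the regression term is that the projection lands between the player's weighted average and the league mean, and is pulled harder toward the mean when the player has fewer plate appearances.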






To me, what's interesting about these two is that in the case of Aaron it certainly is true that he had his most productive season at age 37 and his fourth most productive at age 39 with a nice year thrown in at age 35. However, these were interspersed with seasons at ages 36, 38 and 40 that were pretty much what a projection system would indicate. Essentially Aaron had a very slight decline phase with some excellent seasons interspersed.

Bonds, however, had his four best seasons by a large margin at the consecutive ages of 36, 37, 38, and 39. Clearly this indicates that he established a new performance level that was around 25% higher than his established level from ages 29 through 34. I'm certainly not saying that I would agree with Souder that this kind of profiling should be the sole criterion used to trigger a more stringent testing regime for specific players. However, it certainly seems reasonable that statistics could be included in the set of criteria used to determine whether enhanced testing is warranted (assuming that the general concept of this second level is even accepted). In the case of Bonds, his associations and suspicions of club officials, physical appearance, and performance on the field should have combined to tip the balance in favor of increased scrutiny.
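The "new performance level" comparison above amounts to comparing the average over two age windows. A minimal Python sketch follows; the season values are hypothetical round numbers picked to produce a 25% shift and are not Bonds' actual Normalized OPS figures.

```python
from statistics import mean
from typing import Dict, Iterable


def level_change(
    by_age: Dict[int, float],
    baseline_ages: Iterable[int],
    later_ages: Iterable[int],
) -> float:
    """Fractional change between the mean of a baseline age window
    and the mean of a later age window."""
    baseline = mean(by_age[age] for age in baseline_ages)
    later = mean(by_age[age] for age in later_ages)
    return later / baseline - 1.0


# Hypothetical values only: a flat 100 from ages 29-34, then 125 from 36-39.
hypothetical = {age: 100.0 for age in range(29, 35)}
hypothetical.update({age: 125.0 for age in range(36, 40)})

shift = level_change(hypothetical, range(29, 35), range(36, 40))
print(f"{shift:.0%}")
```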

Of course it's also true as Schwarz indicates that just what statistics would be used would be problematic. Here we're looking at overall productivity but intertwined in OPS is both a measure of power (which is usually argued as the tell tale sign of steroid use but is more problematic when looking at other substances like human growth hormone) and patience. For Bonds, both components increased greatly as his power scared the daylights out of opposing teams to the extent that they would walk or pitch around him any time a runner was on base. There's no reason to believe that would necessarily be the case as a general rule.

For Aaron there were (ostensibly) no other circumstances that raised red flags and so on the strength of his career path alone that kind of scrutiny wouldn't be warranted. The reason, as Schwarz articulates, is that career paths do indeed vary significantly. For example, consider the case of Carlton Fisk.



Fisk showed a steady decline from his age 26 season through age 34 and then had a resurgent age 35 season in 1983 with the White Sox. After continuing the decline through age 39 he suddenly enjoyed three consecutive seasons at productivity levels he hadn't seen since his mid-20s albeit doing so in fewer plate appearances.

And then of course there are those players about whom there are whispers but no actual evidence coupled with a career path that could be interpreted in both ways. A case in point is Sammy Sosa.



Sosa's rise is a little earlier starting at age 29 and maxing out at age 32 and there is also other evidence including a changed approach at the plate under the tutelage of Jeff Pentland and certainly enhanced weight training (with the use of creatine); all of the above making it more than a little dicey to base enhanced testing on the statistical record alone.

With that said, the case is a little more convincing when looking at Mark McGwire.


Like Bonds, his established level of performance jumped at a rather late age (31) and was sustained through age 36 (at age 29 he had just 112 plate appearances and .333/.427/.726). If this kind of increase were coupled with allegations by former teammates and the use of the steroid precursor androstenedione (although legal at the time), then it just may rise to the level that Souder is talking about. It should be noted, though, that Jose Canseco did not (as far as I know) finger McGwire or anyone else while McGwire was still active although from the Mitchell report it is clear that both Tony LaRussa and Dave McKay (and possibly Sandy Alderson although he denies it) knew that Canseco was using steroids and did not report it. Had they done so, it should have cast suspicion on McGwire's 1996 and 1997 performances while still with the A's.

In the final analysis, while I believe that statistics could be one data point in a much more complex evaluation system, they should not be used blindly like Souder seems to be indicating. Baseball, like other human activities, is simply too dynamic and there are too many interacting variables in play to warrant that kind of simplistic system.

Thursday, February 07, 2008

Baseball's Toughest Division

Which division is baseball's toughest?

Well, if you listen to the media you'll no doubt respond that the AL Central is clearly the toughest division in baseball. Having heard that so often in the past couple months in the wake of the Tigers deal for Dontrelle Willis and Miguel Cabrera and the Johan Santana trade last week, I decided to take a look based on the actual performance of the divisions in intradivisional play as well as interleague results stretching back to 1997. The end result is discussed this week in my column at Baseball Prospectus.

In the second half of the column I take a look at the simple projection system I created and wrote about several months ago. This time around I have it project into the future and show the top 2008 projections in terms of Normalized and Park Adjusted OPS. From there I take a look at where the projections differ the most as well as the track record in graphical form of the projections for Magglio Ordonez, Alex Rodriguez, Andruw Jones, Torii Hunter, Gary Sheffield, and Ken Griffey Jr. Enjoy.

The Traffic Directors Addendum

By now SABR members should have received their copies of volume 36 of The Baseball Research Journal. For me receiving the journal in large part pays for the cost of the SABR membership as there are always more than a few interesting articles to peruse. Although I haven't read the entire journal yet the articles on strategy in Japanese baseball (titled "The Evolution of Japanese Baseball Strategy") by Robert K. Fitch and "A Manifesto for Defensive Baseball Statistics" by Dr. Jon Bruschke certainly caught my eye.

This year, however, I have another reason to be interested with the inclusion of an article by Neal Williams and me titled "The Traffic Directors". Some readers will be familiar with this topic since an online version of the article appeared in two parts last spring over on Rich Lederer's Baseball Analysts site. And so if you don't get the journal you can read about the methodology there.

In presenting the article to a group of SABRites here in Colorado a few weeks ago, I went back and took a look at the 2007 data and at the rest of the data set that Neal had collected in response to a few questions by those assembled.

First, let's take a look at the 2007 third base coaches ordered by ratio.


Coach Non-Coach
Year Team Coach Opps OA EqHAR ExHAR Rate Opps OA EqHAR ExHAR Rate Ratio
2007 BOS DeMarlo Hale 240 6 0.4 47.0 1.01 450 9 -10.8 45.0 0.76 1.33
2007 COL Mike Gallego 266 4 8.5 62.2 1.14 429 12 -3.4 44.3 0.92 1.23
2007 HOU Doug Mansolino 230 8 -5.0 50.9 0.90 349 4 -8.1 32.5 0.75 1.20
2007 DET Gene Lamont 299 6 5.5 53.4 1.10 412 7 -2.7 43.0 0.94 1.18
2007 ATL Brian Snitker 245 5 1.8 50.3 1.04 400 4 -4.1 37.3 0.89 1.16
2007 TBA Tom Foley 229 2 4.0 45.1 1.09 386 6 -0.5 39.7 0.99 1.10
2007 SEA Carlos Garcia 246 3 2.5 55.0 1.05 476 9 -2.4 49.9 0.95 1.10
2007 SFN Gene Glynn 221 3 1.3 40.1 1.03 365 10 -2.1 40.2 0.95 1.09
2007 SDN Tim Flannery 197 5 3.3 44.0 1.07 347 6 -0.2 34.5 1.00 1.08
2007 MIL Nick Leyva 182 4 0.0 40.5 1.00 301 5 -1.7 33.1 0.95 1.06
2007 NYN Sandy Alomar 217 3 -0.2 49.6 1.00 359 11 -1.7 36.6 0.95 1.04
2007 PHI Steve Smith 263 8 -1.3 52.7 0.98 359 8 -2.0 38.2 0.95 1.03
2007 CLE Joel Skinner 235 6 0.1 55.0 1.00 391 6 -0.7 40.8 0.98 1.02
2007 ANA Dino Ebel 291 6 4.8 49.5 1.10 397 11 3.9 40.5 1.10 1.00
2007 KCA Brian Poldberg 219 4 4.6 47.7 1.10 357 8 3.0 30.8 1.10 1.00
2007 OAK Rene Lachemann 242 4 -2.7 50.8 0.95 370 7 -1.9 39.5 0.95 0.99
2007 SLN Jose Oquendo 271 8 -1.0 48.5 0.98 387 7 -0.5 35.5 0.98 0.99
2007 NYA Larry Bowa 309 3 2.3 62.2 1.04 413 8 3.4 46.3 1.07 0.97
2007 LAN Rich Donnelly 258 5 3.4 49.9 1.07 419 5 4.7 42.0 1.11 0.96
2007 FLO Bo Porter 209 4 -4.2 39.4 0.89 361 7 -1.5 35.1 0.96 0.94
2007 PIT Jeff Cox 227 4 -1.7 45.5 0.96 375 6 1.8 44.5 1.04 0.92
2007 ARI Chip Hale 172 6 -2.9 42.5 0.93 336 8 1.7 31.7 1.05 0.89
2007 BAL Juan Samuel 231 4 -5.6 48.4 0.88 397 8 0.3 40.1 1.01 0.88
2007 CHA Razor Shines 194 5 -1.2 39.9 0.97 304 5 3.0 28.2 1.11 0.88
2007 TOR Brian Butterfield 208 7 -6.6 38.7 0.83 374 9 -1.7 38.0 0.96 0.87
2007 CHN Mike Quade 241 6 -1.6 49.5 0.97 386 8 5.3 41.6 1.13 0.86
2007 MIN Scott Ullger 236 9 0.2 50.7 1.00 388 6 5.9 33.8 1.17 0.86
2007 CIN Mark Berry 209 8 -8.9 49.3 0.82 363 8 -1.3 38.1 0.96 0.85
2007 WAS Tim Tolman 203 7 -4.7 41.4 0.89 347 4 2.2 35.6 1.06 0.84
2007 TEX Don Wakamatsu 216 4 5.0 44.9 1.11 362 3 12.2 34.0 1.36 0.82
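For those who want to see how the Rate and Ratio columns relate to the others, the numbers in the table are consistent with Rate being (ExHAR + EqHAR) / ExHAR and Ratio being the coach rate divided by the non-coach rate. That reading is inferred from the rows themselves rather than quoted from the article, so treat this Python sketch accordingly; the function names are illustrative.

```python
def har_rate(eq_har: float, ex_har: float) -> float:
    """Actual over expected hit-advancement runs, where EqHAR is taken
    to be runs above (or below) the expected ExHAR. Inferred formula."""
    if ex_har == 0:
        raise ValueError("expected runs must be non-zero")
    return (ex_har + eq_har) / ex_har


def coach_ratio(
    coach_eq: float, coach_ex: float, non_eq: float, non_ex: float
) -> float:
    """Coach-influenced rate divided by non-coach-influenced rate."""
    return har_rate(coach_eq, coach_ex) / har_rate(non_eq, non_ex)


# Rows copied from the 2007 table:
# (coach EqHAR, coach ExHAR, non-coach EqHAR, non-coach ExHAR).
ROWS = {
    "DeMarlo Hale": (0.4, 47.0, -10.8, 45.0),
    "Mike Gallego": (8.5, 62.2, -3.4, 44.3),
    "Don Wakamatsu": (5.0, 44.9, 12.2, 34.0),
}

for coach, row in ROWS.items():
    print(coach, round(coach_ratio(*row), 2))
```

Note that the ratio is computed from the unrounded rates; dividing the rounded Rate columns can be off by a hundredth.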


Both league champions bubble to the top of the list but those who've read the articles will know that this doesn't reflect much in terms of skill. This point was validated by looking at the careers of third base coaches and splitting the even and odd years in order to see if there was any persistence in ratio (defined as the ratio of the "coach-influenced" and "non-coach-influenced" opportunities). The plot below shows the career halves for the 35 coaches with the most experience from 1993 through 2007.



So again with a coefficient of determination of .03 there is no correlation for ratio indicating that there is precious little skill (or rather skill difference) between coaches at the major league level - assuming of course that the metric would capture the influence of a coach if it were present.
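The even/odd persistence check can be sketched in Python as follows. Two caveats: averaging season ratios within each half is my simplifying assumption about how the halves are built, and the sample seasons are hypothetical, not any coach's real numbers.

```python
from math import sqrt
from typing import Iterable, Sequence, Tuple


def split_even_odd(seasons: Iterable[Tuple[int, float]]) -> Tuple[float, float]:
    """Return (mean ratio in even years, mean ratio in odd years).

    seasons holds (year, ratio) pairs. Assumption: each half is the
    simple mean of its season ratios."""
    even = [ratio for year, ratio in seasons if year % 2 == 0]
    odd = [ratio for year, ratio in seasons if year % 2 == 1]
    if not even or not odd:
        raise ValueError("need at least one even and one odd season")
    return sum(even) / len(even), sum(odd) / len(odd)


def r_squared(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Coefficient of determination for a simple linear fit,
    computed as the square of the Pearson correlation."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples of size >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a sample has no variance")
    return (sxy / sqrt(sxx * syy)) ** 2


# Hypothetical coach with four seasons.
halves = split_even_odd([(2004, 1.1), (2005, 0.9), (2006, 1.3), (2007, 1.1)])
print(halves)
```

With one (even, odd) pair per coach, passing the even-year values and odd-year values to r_squared gives the figure discussed above. A value near zero means one half of a career tells you next to nothing about the other half.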

Finally then, here is the complete data set of the 125 coaches from 1993 through 2007 and how they did in terms of ratio.


Coach Non-Coach
Name Season Opps OA EqHAR Opps OA EqHAR Ratio
Greg Riddoch 1 77 3 -1.1 200 9 -7.4 1.51
Billy Hatcher 2 387 6 5.5 573 21 -11.4 1.33
Eddie Rodriguez 1 165 4 -1.7 340 10 -9.7 1.32
Bill Dancy 2 528 15 4.3 736 17 -13.1 1.26
Christopher Bando 3 570 15 -3.1 926 23 -19.0 1.24
Lance Parrish 1 189 5 0.9 243 8 -4.2 1.24
Michael Cubbage 2 494 12 3.6 706 15 -11.6 1.23
Gary Allenson 3 517 22 -15.2 786 27 -20.2 1.18
Terry Bevington 2 439 12 -3.5 544 11 -8.7 1.17
DeMarlo Hale 2 489 11 -6.8 874 17 -18.4 1.17
Brian Snitker 1 245 5 1.8 400 4 -4.1 1.16
Terry Francona 1 200 4 0.2 349 9 -5.2 1.16
Al Pedrique 1 223 2 5.7 308 4 -0.8 1.16
Bobby Floyd 1 173 5 -2.7 316 8 -5.9 1.15
Dave Myers 4 986 16 8.1 1463 35 -12.8 1.14
Tom Foley 6 1285 22 17.8 1995 49 -9.5 1.14
Ray Knight 2 341 5 7.0 562 15 -1.8 1.13
Rick Burleson 3 706 13 7.5 1081 33 -3.6 1.12
John Russell 3 672 19 -1.1 1096 24 -10.7 1.11
Jack Lind 3 556 12 3.3 1037 30 -8.6 1.11
Joe Amalfitano 6 1067 28 4.7 2097 56 -12.0 1.11
Eddie Rodriquez 2 475 11 -5.5 614 16 -7.8 1.10
David Oliver 2 507 6 4.2 786 14 -3.8 1.09
Nick Leyva 5 857 15 5.1 1524 30 -8.3 1.09
Jeff Datz 1 274 5 -0.4 400 7 -3.3 1.09
Willie Randolph 8 2001 40 18.4 2930 70 -6.1 1.09
Mike Gallego 3 753 12 10.8 1157 31 -2.6 1.09
Gene Lamont 7 1622 37 13.4 2496 61 -7.2 1.08
Gary Pettis 3 530 18 -3.4 741 17 -7.2 1.08
Ron Wotus 1 191 3 2.2 416 7 -0.9 1.07
Rene Lachemann 4 789 21 -12.7 1465 39 -14.9 1.06
Tom Trebelhorn 6 1323 32 4.4 2101 51 -4.8 1.06
Cookie Rojas 7 1392 38 -16.6 2374 56 -21.8 1.06
Jim Riggleman 1 270 7 -1.5 308 11 -2.8 1.06
Juan Samuel 4 857 15 2.2 1373 31 -3.9 1.06
Ron Oester 2 407 11 0.4 571 20 -2.4 1.05
Sam Perlozzo 7 1564 31 5.0 2249 47 -3.6 1.05
Joel Skinner 6 1322 33 16.3 2041 47 2.6 1.05
Gene Glynn 8 1815 43 -16.4 2563 50 -17.3 1.05
Manny Acta 5 1032 17 15.7 1495 37 6.0 1.05
Ozzie Guillen 2 345 10 1.2 632 19 -1.9 1.04
Don Zimmer 1 147 4 0.3 262 6 -0.9 1.04
Joey Cora 1 234 9 -7.5 404 9 -6.9 1.04
Duffy Dyer 5 1039 26 7.8 1830 42 1.8 1.04
Carlos Garcia 2 472 9 1.3 853 22 -2.5 1.04
Ron Washington 8 1973 51 5.0 2590 45 -2.0 1.04
Tony Beasley 1 241 6 1.2 312 9 -0.4 1.04
Dino Ebel 2 529 9 15.8 770 20 8.8 1.04
Joel Youngblood 1 189 2 7.5 294 8 4.4 1.03
Steve Boros 3 570 14 1.6 901 28 0.5 1.03
Chuck Cottier 1 237 7 1.2 362 7 -0.2 1.03
Dale Sveum 3 789 18 -20.7 1201 26 -19.1 1.03
Cletis Boyer 2 478 13 -6.2 776 18 -6.8 1.03
Dave Bristol 1 194 9 -6.7 408 9 -6.3 1.02
Ron Jackson 2 515 8 1.6 778 20 0.2 1.02
Ron Hassey 1 186 3 0.1 342 8 -0.7 1.02
John Vukovich 7 1505 40 -7.3 2235 62 -5.3 1.02
Tim Foli 3 618 26 -0.8 844 27 -1.1 1.02
Sonny Jackson 4 843 25 -15.7 1241 31 -13.0 1.02
Al Newman 4 889 24 1.9 1384 28 1.5 1.01
Bobby Dews 2 397 17 -2.1 745 22 -3.6 1.01
John McLaren 2 507 7 -6.2 740 18 -5.8 1.01
Carlos Tosca 3 713 13 0.9 968 17 -0.6 1.01
Doug Mansolino 8 1769 40 4.9 2580 38 4.8 1.01
Jerry Narron 8 1788 35 23.1 2960 60 15.4 1.01
Luis Silverio 2 449 9 6.1 787 19 4.4 1.01
Rich Hacker 1 234 5 3.3 391 7 2.5 1.00
Matt Galante 6 1179 33 -3.2 1929 54 0.2 1.00
Bryan Little 2 411 4 11.0 700 16 8.8 1.00
Ken Macha 2 382 12 3.2 638 17 2.4 1.00
Brian Butterfield 7 1463 35 -5.6 2365 56 0.5 1.00
Brian Poldberg 1 219 4 4.6 357 8 3.0 1.00
Ned Yost 3 590 21 -7.6 797 24 -0.7 1.00
Mike Cubbage 7 1397 45 8.7 2383 65 13.0 1.00
Rob Picciolo 3 704 11 3.9 1163 24 7.3 0.99
Sandy Alomar 3 704 14 11.4 1042 26 10.7 0.99
Harry Dunlop 1 192 9 -5.5 398 8 -5.5 0.99
Perry Hill 2 401 10 0.3 684 21 1.5 0.98
Wendell Kim 9 1864 58 -27.1 3095 65 -6.5 0.98
Jose Oquendo 8 1887 41 23.8 2654 56 23.8 0.98
Dave Huppert 1 240 4 -1.3 318 7 -0.1 0.98
Steve Smith 7 1493 34 -5.8 2449 51 1.1 0.98
Jerry Royster 1 200 5 -2.3 366 11 -1.0 0.98
Bruce Bochy 2 381 9 -13.2 589 14 -7.9 0.97
Rich Donnelly 13 2868 84 -3.9 4287 92 8.1 0.97
Tommie Reynolds 4 719 14 6.4 1255 26 9.0 0.97
Tim Flannery 7 1526 39 4.2 2181 49 14.4 0.97
Jimmy Williams 4 841 26 -10.7 1208 30 -2.0 0.97
Glenn Hoffman 7 1541 42 -12.9 2019 47 -3.2 0.97
Larry Bowa 9 1870 37 -6.7 2866 55 5.8 0.97
Tim Raines 1 204 9 2.9 335 7 3.3 0.97
Jeff Newman 6 1313 32 -6.7 2143 43 6.2 0.97
Jerry Manuel 4 752 25 -18.4 1307 39 -8.1 0.96
Marc Bombard 1 170 1 6.8 301 5 6.8 0.96
Ron Gardenhire 5 1155 31 -2.4 1482 36 6.3 0.95
Bucky Dent 1 245 7 -0.5 401 9 2.0 0.94
Fredi Gonzalez 6 1249 25 0.6 2005 32 16.2 0.94
Rafael Santana 2 408 8 0.4 717 12 5.9 0.94
Jeff Cox 6 1246 32 -8.9 2057 36 5.0 0.94
Ron Roenicke 6 1538 40 4.0 1977 34 17.3 0.94
Bo Porter 1 209 4 -4.2 361 7 -1.5 0.94
Tom Gamboa 1 206 4 0.4 354 9 3.1 0.93
Dan Radison 1 227 4 2.1 331 8 4.1 0.93
Dave Oliver 2 406 4 8.0 621 11 11.3 0.92
John Mizerock 2 478 10 -0.5 790 13 6.7 0.91
Pete MacKanin 3 521 17 -4.6 846 20 5.1 0.91
Scott Ullger 5 1212 29 9.1 2058 40 26.4 0.91
Graig Nettles 1 236 8 -4.2 346 10 0.2 0.91
Gene Glenn 3 655 20 -10.9 1245 31 2.7 0.91
Tony Muser 3 557 16 -0.6 903 22 9.7 0.91
Luis Sojo 2 558 16 -5.6 718 12 5.3 0.91
Tony Harrah 1 272 11 -7.1 431 7 -0.9 0.91
Rich Dauer 5 1085 29 -1.3 1665 29 15.6 0.90
Tom Spencer 1 220 5 0.2 355 7 5.0 0.90
Chip Hale 1 172 6 -2.9 336 8 1.7 0.89
Mark Berry 4 893 26 -19.8 1274 25 2.0 0.88
Razor Shines 1 194 5 -1.2 304 5 3.0 0.88
Trent Jewett 2 354 10 1.4 454 10 7.1 0.87
Mike Quade 1 241 6 -1.6 386 8 5.3 0.86
Bobby Meacham 1 199 4 2.4 359 5 8.4 0.84
Tim Tolman 1 203 7 -4.7 347 4 2.2 0.84
John Stearns 1 206 10 -8.2 253 10 -0.1 0.83
Don Wakamatsu 1 216 4 5.0 362 3 12.2 0.82
Richard Tracewski 3 609 12 -3.1 1052 14 24.8 0.79
Chris Speier 4 865 22 -6.9 1160 15 25.1 0.78


For those scoring at home, yes, "Waving" Wendell Kim did have the lowest overall EqHAR value at -27.1 runs, with 58 runners caught on the bases in 9 seasons. However, when you consider non-coach-influenced opportunities he comes in just below an even ratio at 0.98 and 79th on our list.