
Thursday, May 31, 2007

The Physics of Drag

My column today on Baseball Prospectus delves once again into the PITCHf/x data tracked by the new Gameday application. This time I take a look at the drag on a pitched ball and square the data with the description of the model discussed by Robert Adair in The Physics of Baseball.

To answer the most frequently asked question thus far: no, I haven't looked at Tim Wakefield in any depth. I did see, however, that his average pitch (and I have 346 to look at) lost exactly 10% of its velocity. Overall, though, that percentage decrease is in line with the chart below (a version of which is also in the original article), since his average pFX (a measure of the break of the pitch) was 8.6 and his average start speed was 68.5 miles per hour.



However, that percentage does not seem to differ by either the break length (a different measure of break introduced this year) or the pFX value. It's also interesting to note that all but one pitch came out of his hand at less than 79 miles per hour. I think it's likely that the Magnus force placed on a knuckler as it moves in various directions tends to slow it down more than one would otherwise think based on the slow speed and lack of spin.
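For anyone who wants to check the arithmetic, here is a minimal sketch of the percentage-loss calculation. The function names are my own for illustration, and applying the 10% average loss to the 68.5 mph average start speed only gives an implied plate speed, since an average of per-pitch percentages need not match the percentage computed from the averages.

```python
def percent_velocity_loss(start_mph: float, end_mph: float) -> float:
    """Share of release speed lost by the time the pitch reaches the plate, in percent."""
    if start_mph <= 0:
        raise ValueError("start speed must be positive")
    return 100.0 * (start_mph - end_mph) / start_mph


def plate_speed(start_mph: float, loss_pct: float) -> float:
    """Speed at the plate implied by a release speed and a percentage loss."""
    return start_mph * (1.0 - loss_pct / 100.0)


# Figures quoted in the post: 68.5 mph average start speed, 10% average loss.
wakefield_plate_mph = plate_speed(68.5, 10.0)
print(f"Implied average plate speed: {wakefield_plate_mph:.2f} mph")
```

In other words, a knuckler leaving the hand at 68.5 mph and losing a tenth of its speed arrives a little under 62 mph.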

Monday, May 28, 2007

The 100 RBI Men

A reader makes the following observation:

"Carlos Delgado is currently on pace to get 100 RBIs for the Mets this year, despite sporting an abysmal .234 BA, 306 OBA and 359 SL%. This helps to make the case that RBIs are as much a team stat as an individual one, as the fellas hitting in front of Delgado (Jose Reyes, Carlos Beltran, and David Wright) are pretty adept at getting on base, which allows Delgado to make outs 7 times out of ten and still accumulate a decent number of RBIs. Simply put, he is getting a boatload of opportunities. Delgado also is lucky enough to bat cleanup on a team that is doing very well in the standings, so the manager is unlikely to move him out of the cleanup slot anytime soon."

Well, assuming Delgado plays in 144 games this season, as he did in both 2005 and 2006, he'll wind up with 91 RBI and won't break the 100 mark. And as of tonight Delgado's OPS is 665, and taking 2006 league norms for league OPS and the park effect of Shea Stadium, that means Delgado is on pace to have a league normalized and park adjusted OPS of 92. In any case, that doesn't directly bear on the question which follows...

"So, anyways, my question: Who are the worst hitters in MLB history to get 100 RBIs in a season in MLB? And what are their stories? Were these players in similar situations to Delgado, or do they have some other tale to tell?"

To answer the first part and take a crack at the second, here are the "top" 50 players with 100 or more RBI in a single season with the lowest normalized and park adjusted OPS. There have been 1,543 instances of a player driving in 100 or more runs in a single season since 1901.


Name              Year   PA    G  RBI  OPS  NOPS/PF
Joe Carter        1997  668  157  102  683       89
Vinny Castilla    1999  674  158  102  809       92
Ruben Sierra      1993  692  158  101  678       93
Tony Armas        1983  613  145  107  707       94
Paul O'Neill      2000  628  142  100  760       94
Ray Pepper        1934  598  148  101  732       94
Marv Owen         1936  655  154  105  750       96
Joe Carter        1996  682  157  107  782       96
Glenn Wright      1927  626  143  105  716       96
Joe Carter        1990  697  162  115  681       97
Travis Fryman     1996  688  157  100  766       97
Joe Randa         2000  665  158  106  781       97
Jeff Francoeur    2006  686  162  103  742       98
Jeff Cirillo      2000  684  157  115  869       98
Tony Batista      2004  650  157  110  728       98
Torii Hunter      2003  642  154  102  762       99
Ray Jablonski     1953  640  157  112  735       99
Joe Pepitone      1964  647  160  100  698      100
Ernie Banks       1969  629  155  106  725      100
Carlos Beltran    1999  723  156  108  791      100
Bill Buckner      1986  681  153  102  733      100
Bill Brubaker     1936  620  145  102  736      101
George Bell       1992  670  155  112  712      101
Garret Anderson   2001  704  161  123  792      101
Travis Fryman     1997  657  154  102  766      101
Andres Galarraga  1995  604  143  106  842      101
Chili Davis       1993  645  153  112  767      101
Eddie Robinson    1953  685  156  102  735      101
Wally Pipp        1923  634  144  108  749      101
Willie McGee      1987  652  153  105  746      101
Bing Miller       1930  654  154  100  795      101
Pinky Higgins     1938  603  139  106  794      101
Butch Hobson      1977  637  159  112  789      101
Andruw Jones      2001  693  161  104  772      101
George Kelly      1929  632  147  103  760      101
Gee Walker        1939  645  149  111  773      101
Moose Solters     1936  676  152  134  802      101
Ed Sprague        1996  670  159  101  821      101
Al Simmons        1924  644  152  102  774      102
Ruben Sierra      1987  696  158  109  771      102
Billy Rogell      1934  679  154  100  766      102
Richie Sexson     1999  525  134  116  818      102
Vernon Wells      2002  648  159  100  762      102
Pinky Whitney     1930  662  149  117  849      102
Pinky Whitney     1928  636  151  103  768      102
Matt Williams     1997  636  151  105  795      102
Glenn Wright      1924  662  153  111  744      102
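The NOPS/PF column is easiest to read as an index on which 100 is league average. The exact formula isn't spelled out in this post, so what follows is only a sketch under the assumption that the index is the player's OPS divided by the league OPS and then by the park factor. The function name, the league OPS, and the park factor are all hypothetical placeholders, picked so the result lands near the 92 quoted for Delgado above; they are not the figures actually used.

```python
def normalized_ops(ops: float, league_ops: float, park_factor: float) -> float:
    """Index where 100 = league average after a park adjustment.

    Assumption: a simple ratio, 100 * OPS / (league OPS * park factor).
    Other definitions exist, so treat this as one plausible reading of
    the NOPS/PF column rather than its definition.
    """
    if league_ops <= 0 or park_factor <= 0:
        raise ValueError("league_ops and park_factor must be positive")
    return 100.0 * ops / (league_ops * park_factor)


# The .665 OPS comes from the post. The league OPS and park factor are
# hypothetical placeholder values used only to illustrate the calculation.
delgado_index = normalized_ops(ops=0.665, league_ops=0.768, park_factor=0.94)
print(round(delgado_index))
```

The design point is simply that a park factor below 1.0 (a pitcher's park) raises the index and a factor above 1.0 lowers it, which is why the Coors Field seasons land on this list despite raw OPS figures above 800.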


I know many of you had an inkling that Joe Carter would take the top spot. He also appears at numbers 8 and 10 (and number 53 for his 1987 season, 104 for his 1989 season, 110 for his 1993 season, and number 146 for his 1994 season...you get the idea). But given the poor light in which RBI have been cast in recent years, perhaps surprisingly only 17 times (a little over 1%) has a player ever driven in 100 runs while not being at least league average. So while getting to 100 RBI doesn't ensure that the hitter is an elite offensive performer, it is a pretty good proxy. Put another way, one needn't be a great hitter to accrue 100 RBI, but great hitters often get to 100 RBI. And so in the absence of better metrics it's not surprising that 100 RBI became shorthand for a great offensive performance. This is illustrated by the fact that the "average" 100 RBI man had a NOPS/PF of 125 and by the histogram below, which shows their distribution:



Contributing to the idea that RBI equals greatness is the ongoing debate over the significance and prevalence of clutch hitting. A player with a lot of RBI is often automatically assumed to be a clutch performer, as Joe Carter was.

That said, given that we now have much more granular means (with OPS actually being on the lower end) of estimating the run contribution of individual hitters, that usage should wane some although it may take generational turnover to bring about its demise. For a little deeper perspective on traditional and more modern methods of gauging a player's contribution see chapter 1 of Baseball Between the Numbers.

But in getting back to the question at hand, perusing the list you see a few factors that certainly play into reaching the century mark:

  • Performance - As mentioned above there is no doubt that in large part getting to 100 RBI requires a strong performance. From the graph above (the pink cumulative line that uses the y-axis on the right) you can see that fully 75% of those who have driven in 100 runs were 15% or more above league average and 60% were 25% or more above average.


  • Park - Vinny Castilla and Jeff Cirillo in the top 20 show that playing in a park where lots of runs are scored certainly helps, and of course by adjusting for park we don't give them any benefit.


  • Era - Twelve of the top 20 players either played in the 1930s or since 1993 which were the two highest scoring eras in modern baseball history. Just like playing in a park where runs are more plentiful allows lesser hitters to drive in more runs, playing in an expanding offensive environment devalues the 100-RBI mark.


  • Teammates - Certainly the reader makes a good point about teammates having to be on the bases. You could probably make a case that Paul O'Neill in 2000 with the Yankees, Marv Owen for the Tigers in 1936, Glenn Wright with the Pirates in 1927, and Bill Buckner with the 1986 Red Sox all fall into this category where the individual was part of a strong offensive team from top to bottom.


  • Lineup Position - It probably comes as no surprise that many of the players on this list, and those who accrue 100 RBI generally, are middle-of-the-order hitters. It probably comes as a bit more of a surprise that, as shown in the graph below, the number three position in the order actually hits with relatively fewer runners on base than does any other lineup position save the leadoff and second spots in the order.



  • Plate Appearances - More generally, the latter two contributing factors as well as this one fall into the category of opportunity. A player has to come to the plate often enough to reach the 100 RBI mark. Probably no season on this list better exemplifies this than Ruben Sierra's 1993 performance with the Oakland A's. In that season Rickey Henderson hit leadoff for half the year, and Sierra hit third in an AL lineup, which increases the opportunities for the third hitter, and racked up almost 700 plate appearances. On average, the 100 RBI men had 651 plate appearances. Yes, Rudy York did drive in 103 runs in just 417 plate appearances for the 1937 Tigers, but all told just 213 players (14%) have ever driven in 100 runs while not coming to the plate at least 600 times.


So will Carlos Delgado get to 100 RBI, and if so, what does it mean? Speaking only in generalities and knowing only that about his performance, we'd have to guess that he was a pretty good hitter. However, that doesn't completely rule out the possibility that his park, era, teammates, lineup position, and playing time all conspired to push him past the 100 RBI barrier.
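The shares and the "on pace" figure quoted in this post are simple ratios, sketched here. The counts (1,543, 17, and 213) and the 144-game assumption come from the post. The function names are my own, and the RBI and games-played inputs to the pace projection are hypothetical, picked only to illustrate the calculation.

```python
TOTAL_100_RBI_SEASONS = 1543  # seasons of 100 or more RBI since 1901 (from the post)


def share(count: int, total: int = TOTAL_100_RBI_SEASONS) -> float:
    """Percentage of all 100-RBI seasons represented by count."""
    return 100.0 * count / total


def project_rbi(rbi_so_far: int, games_played: int, season_games: int) -> float:
    """Straight-line pace: RBI per game so far, scaled to a full season of games."""
    if games_played <= 0:
        raise ValueError("games_played must be positive")
    return rbi_so_far / games_played * season_games


below_average = share(17)   # seasons with NOPS/PF under 100
under_600_pa = share(213)   # seasons with fewer than 600 plate appearances

# Hypothetical inputs: 31 RBI through 49 games, projected to 144 games.
pace = project_rbi(rbi_so_far=31, games_played=49, season_games=144)

print(f"{below_average:.1f}% below average, {under_600_pa:.1f}% under 600 PA")
print(f"Projected RBI: {pace:.0f}")
```

A straight-line pace like this ignores everything discussed above (lineup position, teammates, playing time), which is exactly why it is only a starting point.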

    Thursday, May 24, 2007

    Deep Data Dive

    As promised yesterday my column on Baseball Prospectus this morning dives deeper into the PITCHf/x data tracked by the 2007 version of Gameday (a new update was released on May 10th and is much more performant).

    In this article I take a look at the velocity and location data that includes over 40,000 pitches and discover that given a one-inch margin of error the system agrees with umpires to the tune of 90%. Not bad and very similar to the QuesTec results published by Robert Adair in an article titled "Cameras and Computers, or Umpires?" that was published in Volume 32 of SABR's The Baseball Research Journal.
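To make the "one-inch margin of error" idea concrete, here is a minimal sketch of how an agreement rate could be computed. It is not the method used in the column: the record layout, field names, and the rule for applying the margin (a called strike agrees if the pitch is inside the zone enlarged by the margin, a called ball agrees if it is outside the zone reduced by the margin) are all assumptions, and the sample pitches are invented. The width of the ball itself is ignored.

```python
from dataclasses import dataclass

PLATE_HALF_WIDTH_IN = 8.5  # home plate is 17 inches wide


@dataclass
class CalledPitch:
    """Hypothetical record for a taken pitch; field names are illustrative."""
    px_in: float           # horizontal location at the plate, inches from center
    pz_in: float           # height at the plate, inches
    zone_bottom_in: float  # bottom of this batter's strike zone, inches
    zone_top_in: float     # top of this batter's strike zone, inches
    called_strike: bool    # the umpire's call


def in_zone(p: CalledPitch, margin_in: float) -> bool:
    """True if the pitch is inside the zone grown (or shrunk, if negative) by margin_in."""
    inside_width = abs(p.px_in) <= PLATE_HALF_WIDTH_IN + margin_in
    inside_height = p.zone_bottom_in - margin_in <= p.pz_in <= p.zone_top_in + margin_in
    return inside_width and inside_height


def agrees(p: CalledPitch, margin_in: float = 1.0) -> bool:
    """A call agrees with the tracking data if the margin of error could explain it."""
    if p.called_strike:
        return in_zone(p, margin_in)    # strike: inside the enlarged zone
    return not in_zone(p, -margin_in)   # ball: outside the reduced zone


def agreement_rate(pitches: list[CalledPitch], margin_in: float = 1.0) -> float:
    """Fraction of called pitches on which umpire and system agree."""
    if not pitches:
        raise ValueError("need at least one called pitch")
    return sum(agrees(p, margin_in) for p in pitches) / len(pitches)


# Invented sample: a strike down the middle, a strike 0.7 inches off the edge,
# a strike 2.5 inches off the edge, and a ball well outside.
sample = [
    CalledPitch(0.0, 30.0, 18.0, 42.0, True),
    CalledPitch(9.2, 30.0, 18.0, 42.0, True),
    CalledPitch(11.0, 30.0, 18.0, 42.0, True),
    CalledPitch(12.0, 30.0, 18.0, 42.0, False),
]
print(agreement_rate(sample, margin_in=1.0))
print(agreement_rate(sample, margin_in=0.0))
```

With the margin set to zero the borderline strike counts as a miss, which shows how sensitive the headline agreement number is to the tolerance chosen.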

    Wednesday, May 23, 2007

    A Little Light Reading

    Just a few things to chew on...

  • Interesting article by Paul Swydan on the Rockies use of information technology. There is more in the Rockies magazine that they sell at the ballpark with some excellent photos of the custom software they use.


  • Interesting quotes from Tony LaRussa on scoring in multiple innings increasing the probability of winning. As Phil points out, however, simply scoring more runs also increases the probability of winning dramatically, and so the question is really which is the more fundamental thing to strive for: scoring in multiple innings or putting together an offense that can score more runs in the aggregate. It seems to me that the former is just an effect or outgrowth of the latter.


  • Here are seven reasons why sabermetrics helps build a better ballgame by Nate Silver.


  • An excellent and succinct analysis of this April's weather and its impact on run scoring. It's all about the temperature. I'll have more to say about this in my Baseball Prospectus column tomorrow with regard to pitch velocity.


  • In my last chat I said:
    A couple weeks ago I overheard one beat writer mention repeatedly that so-and-so was a "four-A" player indicating that regardless of what he's doing in AAA (this guy has an OPS of 1.25 thus far and a pretty good track record) he'll never make it in the majors. One of the things that performance analysis has revealed is that for the most part there is no magical line between the minors and majors as evidenced by a certain measure of predictability as players cross that threshold. To me, the way we as fans and the industry itself views the minors versus the majors in terms of attention, compensation, media exposure etc. is responsible for that view.

    And the so-and-so was...this guy.


  • I had never seen this before but the British Library has a wonderful way of viewing some of their material. They also have a new beta that uses Microsoft Silverlight.

    Thursday, May 17, 2007

    The Dismal Science?


    My column this morning on Baseball Prospectus is actually a review of the book The Baseball Economist: The Real Game Exposed by J.C. Bradbury. Like many others I've enjoyed Bradbury's commentary on his site Sabernomics.

    In short, I recommend the book and really appreciated both the wide range of sabermetric studies and, more so, how they're couched in economic concepts. The chapters on performance enhancing substances and baseball's monopoly power are especially strong. I did have a few quibbles with some of the studies but you can read more about that in the column...

    Wednesday, May 16, 2007

    Mad Dog

    Greg Maddux is off to a nice start in 2007 with a 3-2 record and 3.20 ERA in 50 2/3 innings having given up only 45 hits and 7 walks. It turns out that of his eight starts this season five have been recorded more or less completely by the Enhanced GameDay system. In all, that's 358 pitches, 164 to right-handers and 194 to left-handers. Just fiddling around with some of the data today here are some random observations.

  • The fastest pitch leaving his hand was 94 mph. He threw the pitch to Garrett Atkins on April 6th in the top of the third inning. The pitch was called a ball. That pitch was also the fastest when it reached the plate at 83 mph.


  • The slowest pitch he threw was 71.5 mph to Kazuo Matsui in the 6th inning of that April 6th start. That pitch crossed the plate at 62.6 mph before Matsui hit a ground ball to third for an infield single.


  • His average velocity out of the hand was 85 mph and crossing the plate was 75.2 mph. But what's most amazing is that the standard deviation of his muzzle velocity was just 2.86 mph. Of the pitchers who have thrown 100 or more pitches in 2007 with GameDay watching, his pitches have varied the least. Jason Schmidt was close at 2.95 mph, and on the other end Randy Wolf was at 8.85 mph.


  • The breakdown of the outcomes of his pitches was:
    Ball                        102   28.5%
    Called Strike                79   22.1%
    In play, out(s)              61   17.0%
    Foul                         50   14.0%
    Swinging Strike              24    6.7%
    In play, no out              19    5.3%
    In play, run(s)               6    1.7%
    Foul Bunt                     5    1.4%
    Pitchout                      4    1.1%
    Foul (Runner Going)           3    0.8%
    Ball In Dirt                  2    0.6%
    Swinging Strike (Blocked)     2    0.6%
    Missed Bunt                   1    0.3%
    Total                       358
    Not surprisingly, he doesn't get many swinging strikes.


  • The breakdown of his pitches against lefties and righties is shown in the two graphs below (pictured from the perspective of the pitcher).


    Against lefties he clearly stays on the outer half and besides how few pitches he leaves in the middle of the plate, it would appear he gets a pretty good number of calls on balls that are actually outside the strike zone. Cory Schwartz was kind enough to answer a few questions on the system over at The Book blog recently and said that through testing they're confident that the tracking is within 2 inches with regards to a pitcher's release point and within 1" as the ball crosses the plate. Even with a 1" margin of error and remembering that the data points I'm using are much smaller than an actual baseball, that's still a fair number of pitches that Maddux seems to get the benefit of the doubt on.



    Against righthanders he seems to catch more of the plate and interestingly doesn't seem to pitch as much down in the zone.
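The percentages in the outcome table above can be reproduced directly from the counts, as in this short sketch. The counts are the ones listed in the post. Lumping the two swinging-strike categories into a single swing-and-miss rate is my own grouping for illustration.

```python
# Pitch outcome counts for the 358 tracked Maddux pitches, as listed in the post.
outcomes = {
    "Ball": 102,
    "Called Strike": 79,
    "In play, out(s)": 61,
    "Foul": 50,
    "Swinging Strike": 24,
    "In play, no out": 19,
    "In play, run(s)": 6,
    "Foul Bunt": 5,
    "Pitchout": 4,
    "Foul (Runner Going)": 3,
    "Ball In Dirt": 2,
    "Swinging Strike (Blocked)": 2,
    "Missed Bunt": 1,
}

total = sum(outcomes.values())

# Share of all pitches for each outcome, rounded to one decimal as in the table.
shares = {name: round(100.0 * n / total, 1) for name, n in outcomes.items()}

# Assumed grouping: both swinging-strike categories count as a swing and miss.
swing_and_miss = outcomes["Swinging Strike"] + outcomes["Swinging Strike (Blocked)"]
whiff_pct = round(100.0 * swing_and_miss / total, 1)

for name, pct in shares.items():
    print(f"{name:<26}{outcomes[name]:>4}{pct:>7.1f}%")
print(f"Swing and miss, all pitches: {whiff_pct}%")
```

Keeping the counts in one place and deriving the percentages makes it easy to rerun the breakdown as more starts are tracked.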

    Tuesday, May 15, 2007

    The Gospel of Pronation

    At last year's SABR convention I had the opportunity to listen to Mike Marshall, former Cy Young winner and PhD, preach the gospel of injury avoidance through his rather unique pitching mechanics and instruction. At the time I noted how he's not taken seriously by the industry and now Jeff Passan over at Yahoo Sports has written an excellent article on Marshall (or "Doc" as his disciples call him) and his Pitching Research and Training Center in Florida north of Tampa. What's particularly interesting is the accompanying video of Marshall protege Jeff Sparks who once pitched for Tampa Bay and who still travels to Marshall's complex in hopes of catching on somewhere.

    In the video you'll see the extreme pronation of the wrist in both the fastball and curveball motions, as well as the screwball, which Marshall preaches will effectively eliminate elbow injuries requiring Tommy John surgery. The video is high speed and so shows the motion pretty clearly. You'll also want to check out Passan being interviewed by Will Carroll on BP Radio from last weekend.

    In the end it would be interesting if a team would send a few borderline pitchers Marshall's way in order to see if there's anything to his claims about injury avoidance and additional velocity. Those who attend his center are certainly true believers but the rest of us won't be until there are some documented successes.

    Monday, May 14, 2007

    Charting Sammy

    Along with the Mother's Day festivities I had a fun weekend collecting and analyzing Enhanced GameDay data. What that allows is analysis of individual pitchers and hitters. For example, below is a graphic (click to make larger) of all 340 pitches that have been tracked for Sammy Sosa this season. Since Texas is one of the parks in which the system is running, and since others where it has been running the longest include Chicago (AL), Oakland, Seattle, and Anaheim, the Rangers tend to have more of their games being tracked. New parks will be added as the season goes along, and all told there are now 13 parks in which the system has collected some data, with Detroit, St. Louis, Washington, and Colorado (yeah!) being added in the last few days.

    The strike zone that is drawn is an average of the strike zone for each plate appearance. Obviously these graphs can be broken down further by count, pitch type, and especially handedness of the pitcher, but even so you can see here that when Sammy chases pitches they are often down and away, and that he fouls off a fair number of pitches that are inside.

    In any case, look for more of this in the future here and on Baseball Prospectus.

    Saturday, May 12, 2007

    The Proper Goal

    A well-reasoned article by George Will on Barry Bonds and his upcoming "achievement". He covers all the relevant bases in discussing why it is that baseball seems to be held to a higher standard than other sports, a little of the history of performance enhancers in sports other than baseball, the difficulty in drawing the line, the confluence of other factors in baseball that makes it difficult to detect performance enhancers, and of course the kind of circumstantial evidence that I discussed in my 2005 article on the heels of Bonds' 704th home run.

    But my favorite aspect I think is the reminder that Will provides as to just why it is that performance enhancing substances are bad for the game - actually any game. I've often heard talk show hosts and read columnists that argue that players have a right to do whatever it takes to feed their family etc. etc. and who are we to judge and why should we or they care? As an antidote to that kind of thinking Will offers the following:

    Athletes who are chemically propelled to victory do not merely overvalue winning, they misunderstand why winning is properly valued. Professional athletes stand at an apex of achievement, but their achievements are admirable primarily because they are the products of a lonely submission to a sustained discipline of exertion. Such submission is a manifestation of good character. The athlete's proper goal is to perform unusually well, not unnaturally well. Drugs that make sport exotic, by radical intrusions into the body, drain sport of its exemplary power by making it a display of chemistry rather than character. In fact, it becomes a display of some chemists' virtuosity and some athletes' bad character.

    Thursday, May 10, 2007

    Pitch by Pitch

    In my column today on Baseball Prospectus I take a pitch by pitch look at the almost no-hitter of Yankees rookie Philip Hughes on May 1st using the enhanced GameDay data provided by MLBAM. Joe P. Sheehan over at Baseball Analysts has pioneered a lot of the analysis and so you'll want to check out his work as well.

    To me, the interesting aspect of this is in how it will certainly be combined with biomechanical analysis in order to provide quantitative support for what is today primarily observational. A synthesis of the two worlds will offer insights that neither could provide independently and that's exciting stuff.

    Update: My article actually was published after this one by John Beamer at THT where he looks at the consistency of the data between parks. Obviously, if I had seen this I would have mentioned it but interestingly, his research on Kevin Millwood, who has pitched at five of the eight parks in which the Enhanced GameDay system is installed, suggests that there may be some park bias in play. Particularly he sees this in speed and location (for example location of release point). Both of course become much more useful when you can compare them from game to game. If indeed the system has built-in biases, either adjustments will have to be made internal to the system (if possible since a variable like mound height may be a contributing factor as may atmospheric conditions although Beamer finds that unlikely) or those using the data will have to make the adjustments themselves. The second option is certainly not appealing.

    Saturday, May 05, 2007

    Lightning


    In the spring of 2004 my wife and eldest daughter had the chance to travel with me to England, where I was attending a conference. Since this was our first visit we planned to pack in as much as we could during the trip, and in addition to a quick tour of London that included the British Museum, Westminster Abbey, and Trafalgar Square, we headed up to the Yorkshire Dales where the real James Herriot (Alf Wight) practiced. Before making our way to the conference south of London we swung over to Oxford in order to get a look at where C.S. Lewis lived and worked. On a rainy Saturday, with the help of a local man who knows the pastor, we were able to visit the Anglican church near Lewis' home where he worshipped for all of his adult life and where he was buried in the small church cemetery. Inside the church there are several remembrances of Lewis including a very nice window depicting a scene from Narnia and a marker on the pew where Lewis and his brother Warnie regularly sat. That's my daughter sitting in the seat that Lewis typically occupied.



    All of this came to mind as I was reading Lewis' essay "On Church Music" published in Christian Reflections the other night. The essay attempts to navigate the controversy of "high" versus "low" church music with high meaning more serious music sung by a trained choir and low meaning hymns sung by the congregation. As Lewis often does he sees in both the opportunity for Christians to "humbly and charitably" sacrifice by either indulging the "lusty roar of the congregation" or remaining silent and respectful of that which one doesn't understand. In that way "Church Music will have been a means of grace; not the music they have liked but the music they have disliked." For his part Lewis was more skeptical that any music is very religiously relevant and even in this essay we find one of his famous quotes that "What I, like many other laymen, chiefly desire in church are fewer, better and shorter hymns; especially fewer."

    Given the admonition that at the very least music is a chance at sacrifice and a means of giving grace, I'm somewhat hesitant to proceed. And yet I'll share what I found to be my own interesting reaction to one of the praise songs we sometimes sing in our church. The song is called "Indescribable" by Chris Tomlin, the first two verses and the chorus of which go like so:

    From the highest of heights to the depths of the sea
    Creation's revealing Your majesty
    From the colors of fall to the fragrance of spring
    Every creature unique in the song that it sings
    All exclaiming

    Indescribable, uncontainable,
    You placed the stars in the sky and You know them by name.
    You are amazing God
    All powerful, untamable,
    Awestruck we fall to our knees as we humbly proclaim
    You are amazing God

    Who has told every lightning bolt where it should go
    Or seen heavenly storehouses laden with snow
    Who imagined the sun and gives source to its light
    Yet conceals it to bring us the coolness of night
    None can fathom


    Now, this song is clearly extolling the power and majesty of God and has a melody and cadence that heightens the emotions and from what I observe is clearly one of the favorites of the congregation. That said, each time the second verse beginning with "Who has told every lightning bolt where it should go..." begins I cringe just a little.

    To me, the idea in these lyrics that God has an interest in directing individual lightning bolts harkens back to the early 18th century, when lightning was viewed as an expression of God's displeasure and/or the work of demons which, along with good spirits, were thought to have filled the air. In those days, as a storm approached, church bells would be rung in order to ward off the bolts; in the words of St. Thomas Aquinas, "The tones of the consecrated metal repel the demon and avert storm and lightning". As you can imagine this wasn't an effective strategy, and as Walter Isaacson related in Benjamin Franklin: An American Life, "during one thirty-five year period in Germany alone during the mid-1700s alone, 386 churches were struck and over a hundred bell ringers killed." Of course Franklin's invention of the lightning rod in 1752 began to change this way of thinking, although some theologians resisted its use, fearing that it would be impious to resist the hand and judgment of God. In one particularly tragic event over 3,000 people were killed in 1767, some fifteen years after Franklin's invention, when the church of San Nazaro in Venice was struck, igniting gunpowder being stored in the church.

    What I find interesting in all of this is that in the praise song lightning is viewed as just another display of God's creativity and power along with the flowers and stars. And yet this is a power that has been tamed by the intervention of man and so in the song we can stand back and admire it without fear of consequences or judgement. While congregations 300 years ago may indeed have also looked at lightning as a display of God's power, they would additionally have looked at it as an instrument of God's judgement. The mention of lightning in a hymn would have conjured up far different notions to them than it does to us. One wonders whether including the other sentiments expressed in Tomlin's song would even have seemed appropriate. Beyond that it seems just silly to praise God in worship songs for directing lightning bolts when we do our darnedest to intercept and redirect them whenever possible. What if the word lightning in the song were replaced with "tornado"? Would we really sing "Who has told every tornado where it should go..."? I just don't think most modern Christians think God uses natural events to punish people and so I find it somewhat surprising that the concept is so blatant in a song that I've heard sung in more than one evangelical church in the last decade. Unless I'm wrong one would hope church leaders would do a better job of ensuring that what is sung and said in the service lines up with current Christian belief. To that end, I wonder what seekers attending services think when they see lyrics like this?

    The point was also driven home a few weeks ago when we had a guest worship leader who sang a song he had written that included the line "to the God of lightning." Before the song he relayed the context of its writing, which included sitting with his eight-year-old daughter on the back porch watching the thunderstorms roll over the eastern Colorado plains. His daughter was awed by the display and before heading to bed asked to linger and then prayed that God would send another blast of lightning and thunder. Again, 300 years ago that would have been unfathomable.

    To me, remembering the terror and destruction that lightning has caused along with how the church viewed it historically, I find it anachronistic and intellectually vacuous to sing praise songs in which we in effect blame God for a natural phenomenon.

    Friday, May 04, 2007

    The Song of the Dodo


    Two of the genres that are among my favorites are natural history and travelogues. So it's no surprise that I found The Song of the Dodo: Island Biogeography in an Age of Extinctions by David Quammen one of the best books I've read in a while. Originally published in 1996 it retains the status of a classic of sorts having won several literary awards and remains in print and on store shelves in a newer 2004 printing that I picked up a couple months ago.

    Basically, the book traces the intellectual pedigree of the field of island biogeography starting with Alfred Russel Wallace, a first rate naturalist in his own right, and more famously the "co-discoverer" of evolution (although Quammen has a little axe to grind on that score) with Charles Darwin. From there Quammen traces the development of the field through the 20th century, which really picks up momentum in the 1960s with the work of Robert MacArthur and Edward O. Wilson. He brings it up to the present day as the field morphs from being primarily descriptive (what organisms live on what islands) to a theoretical and quantitative one (why are those animals found in their distributions on those islands) to the field's application beyond islands to the problem of habitat fragmentation and its consequences for the future.

    While that may sound a little dry and challenging to some readers, Quammen makes the intellectual journey an enjoyable one as he describes the theories, papers, and studies in the context of his visits to the famous and not so famous islands and locales of habitat fragmentation. From traipsing through Brazil to get a look at a species of New World monkey (Brachyteles arachnoides) isolated in rainforest bordered on all sides by clear cut, where he helps collect droppings, to the Galapagos, Mauritius (once home to the Dodo, Raphus cucullatus, whose song if it had one is now lost, hence the title of the book), Guam, Hawaii, Madagascar, Tasmania (and his search for the extinct? Tasmanian wolf, Thylacinus cynocephalus), Aru in the Malay archipelago to view the birds of paradise (Cicinnurus regius and Paradisaea apoda), which was Wallace's last stop before heading home, to the island of Komodo, where he relates the sometimes dangerous coexistence of people and dragons (Varanus komodoensis), Quammen makes each site come alive while illustrating some theme in the larger intellectual story he's telling. Although they are not the main story he's telling, I've always been fascinated by the stories of human "first contact" with island species and how that contact typically decimated the species through what Quammen calls their "ecological naivete", evidenced by their shortage of the defensive adaptations that creatures on larger land masses must evolve in order to survive. In part I liked the book because he does a good job of documenting many of these cases and giving us a sense of what's been lost.

    But the main story line is also a fascinating one that at its core relates how the fundamental ideas of species distribution on islands were slowly formulated in the twentieth century. In particular Quammen points to the distillation of 100 years of observation in a 1967 book titled The Theory of Island Biogeography by MacArthur and Wilson that describes the area and distance effects. In epitome the area effect shows itself in that small islands tend to harbor fewer species than large islands with the underlying causes being fewer immigrations and more extinctions (due to what he refers to as stochastic factors or seemingly random occurrences that can decimate a smaller population such as drought, fire, or volcanic eruption) with the area of the island being determinative of the quantities. The distance effect shows itself in the fact that more remote islands are the home to fewer species than islands closer to the mainland since there are fewer immigrations. These ideas then lead to the so-called "species-area relationship" that attempts to quantify these effects through an equation (known as the species-area equation). Biogeography had matured into a full theoretical and quantitative science.
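For reference, the species-area equation is usually written as a power law. In the form below, S is the number of species, A is the area of the island, and c and z are constants fitted to the data for a given group of organisms and region; the notation is the conventional one rather than anything quoted from the book.

```latex
S = c\,A^{z}
\qquad\Longleftrightarrow\qquad
\log S = \log c + z \log A
```

The logarithmic form is the practical one: plotted on log-log axes the relationship becomes a straight line with slope z, so the exponent can be estimated from a simple linear fit of species counts against island areas.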

    The really interesting part, however, is when these ideas began to be applied not just to islands but to other ecosystems, as they were in a seminal paper by Jared Diamond in the mid 1970s. The implication of this application is simply that as habitats are fragmented they may end up supporting fewer species than larger contiguous blocks. In other words, although efforts to preserve species through the creation of parks and reserves are noble, those parks and reserves may end up supporting fewer species than we think, due to the inability of species to migrate and the "extinction" of local populations caused by random events that probabilistically have more impact than they would if the populations were larger. As a hypothetical example, assume you were to survey your lawn and count the number of species of organisms that inhabit an average square yard. If you then cut out one of those squares and isolated it from the rest of the yard, the idea is that if you surveyed it again later it would contain fewer species because of the area and distance effects. Quammen then details the raging scientific debate that ensued regarding not only whether the application of biogeography to the design of preserves was appropriate but, if so, just how large preserves should be to maintain what would become known as the "minimum viable population," or the number of animals required in order for a population to survive in the long term, however that's defined.

    Quammen interviews many of the participants, or should I say characters since several are quirky to say the least, in the debate on all sides but certainly seems to take the view that at least in the larger perspective, attempts to connect habitats and enlarge nature preserves are what is needed to avoid the problems inherent in fragmentation.

    The ideas and the debate are fascinating, but Quammen also brings them to life in a way that allows the reader not only to understand the concepts but to think deeply about their implications. For my money you can't ask more of a book than that.

    Keith Woolner Gets the Call

    Congratulations are in order to Keith Woolner, a BP colleague and the inventor of VORP, who has taken the position of Manager of Baseball Research and Analysis with the Cleveland Indians. The quote by Indians GM Mark Shapiro that I used in my Hope and Faith piece on the Indians in March is clearly borne out by this move and I'm sure that Keith will be a key contributor to their future success. Congrats!

    Thursday, May 03, 2007

    MLBAM and Silverlight

    Maury Brown, our guru of all things related to the business of baseball at Baseball Prospectus, has many times documented the rise of Major League Baseball Advanced Media (MLBAM) and how it has not only made baseball available in a very rich way but also added to the revenue stream of the owners. From its first attempts at streaming games in September of 2002, all the while aided by the relentless march of technology, MLB.com has grown to where it will host over three billion visitors and by the end of the year will crack the one million subscriber count for multimedia content.

    I don't want to steal Maury's thunder but I just had to pass along this demonstration of MLBAM's use of a new technology called Microsoft Silverlight. In a nutshell, Silverlight is a cross-browser, cross-platform plug-in for delivering media experiences and interactive applications for the web. For software architects and developers heavily invested in the Microsoft platform (like myself) this is exciting since it allows development across platforms using the .NET Framework that we're also using to build our business applications.

    In any case, MLBAM's President Bob Bowman and Justin Shaffer who is in charge of new media joined Microsoft at their MIX07 conference in Las Vegas earlier this week to demo how they'll be using Silverlight in the very near future. Essentially, Silverlight will be rolled into the MLB.TV applications later this summer and will enable a host of new features. Some of those demoed include:

  • More Bandwidth: 1.5 Mbps streaming, whereas today they stream at 400K and 700K

  • Composite Video: Overlays of the video stream that offer controls that allow you to manage the video or view enhanced information. For example, the data that you see today in Gameday will be made available via widgets that rest semi-transparently on top of the video in an unobtrusive form providing the wealth of additional information that MLBAM is collecting. For folks like me, this is a true value-add as it creates new ways of watching the game by integrating that information that we crave.

  • Chatting: In the same vein a chat widget will be available that allows for conversations with friends and fans of rival teams.

  • Player Tracker: The Player Tracker used in Mosaic will be available with widgets that overlay the video that provide real time updates on your players and can then make available highlights of your players.

  • Picture in Picture: They showed a nice feature where a "friend" can send a video clip which can then be viewed in "picture in a picture" mode while the main stream continues to play.

  • Resizing: One of the features that Silverlight enables is dynamic resizing to allow the user to check on other scores, stats, and content and resize the video window on the fly accordingly.

  • Mobility: And finally, they showed a forward looking demo that had Gameday running on a Windows mobile phone with the player tracker feature playing video clips.


    As an MLB.TV subscriber this is some exciting stuff and for those who aren't subscribers it provides some additional impetus to get signed up.

    Quick Workers and Human Rain Delays Followup

    My column this morning on Baseball Prospectus is a follow up to a post last week on fast and slow workers. Specifically, I was interested in the question of whether pitchers who work more quickly reduce the number of errors committed behind them as the common wisdom would indicate.

    Although it's difficult to identify quantitatively which pitchers are sloths (the slow ones like Steve Trachsel) and which are hummingbirds (faster workers like Bob Gibson), my attempt using anecdotal evidence couldn't find any statistically significant difference between two groups of 10 pitchers encompassing over 40,000 innings pitched since 1970. At first glance it was the sloths who seemed to suppress the number of errors. However, pitchers who worked faster in my sample were more likely to be ground ball pitchers, and so that fact had to be corrected for since ground balls are more likely to produce errors than fly balls or line drives. I also used a subset of the two groups whose performance was almost equivalent to remove the bias that good pitchers introduce by being able to suppress errors and unearned runs.

    Tuesday, May 01, 2007

    Stretching 2006

    Last week in my Schrodinger's Bat column on Baseball Prospectus I discussed the frequency with which runners had been thrown out stretching singles into doubles and doubles into triples. I've since posted an update on BP's Unfiltered blog that includes stretching triples into inside-the-park home runs, and today I offer the complete list of those runners who were thrown out at least once stretching in 2006. As a measure of opportunity I also include the number of singles, doubles, and triples the player hit in 2006 and compute a simple rate. Among players with 100 or more hits, Bengie Molina had the worst rate: he was thrown out 4 times, good for once every 26 hits. Yes, this is the same Molina whom I've rated as the third worst baserunner in the aggregate since 2000.


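    Key: S+D+T = singles + doubles + triples; BX2, BX3, BXH = times thrown out stretching to second, third, and home; Total OA = BX2 + BX3 + BXH; Rate = S+D+T per time thrown out.

    Name S+D+T BX2 BX3 BXH Total OA Rate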
    Tomas de la Rosa 5 1 0 0 1 5.0
    Ryan Sweeney 8 1 0 0 1 8.0
    Andy Pettitte 11 0 1 0 1 11.0
    Casey Kotchman 11 1 0 0 1 11.0
    Scott Thorman 25 2 0 0 2 12.5
    Orlando Palmeiro 30 1 1 0 2 15.0
    Chad Moeller 16 1 0 0 1 16.0
    Chris Iannetta 18 1 0 0 1 18.0
    Todd Linden 19 1 0 0 1 19.0
    Adam Lind 20 0 1 0 1 20.0
    Danny Ardoin 22 1 0 0 1 22.0
    Hector Luna 94 2 2 0 4 23.5
    Cody Ross 48 2 0 0 2 24.0
    Bengie Molina 104 4 0 0 4 26.0
    Jeff Cirillo 81 2 1 0 3 27.0
    Corey Koskie 55 2 0 0 2 27.5
    Vladimir Guerrero 167 5 1 0 6 27.8
    Jeromy Burnitz 56 2 0 0 2 28.0
    Doug Mientkiewicz 85 1 2 0 3 28.3
    Sandy Alomar 29 1 0 0 1 29.0
    Stephen Drew 61 2 0 0 2 30.5
    Willie Bloomquist 61 2 0 0 2 30.5
    Freddie Bynum 31 1 0 0 1 31.0
    Willy Aybar 64 1 1 0 2 32.0
    Geoff Blum 66 2 0 0 2 33.0
    Marlon Anderson 71 2 0 0 2 35.5
    Sal Fasano 36 1 0 0 1 36.0
    Eduardo Perez 38 1 0 0 1 38.0
    Ryan Freel 115 3 0 0 3 38.3
    Ramon Hernandez 115 2 1 0 3 38.3
    Gary Sheffield 39 1 0 0 1 39.0
    Orlando Cabrera 162 2 2 0 4 40.5
    Shin-Soo Choo 41 0 1 0 1 41.0
    Lance Berkman 124 3 0 0 3 41.3
    Omar Vizquel 167 3 1 0 4 41.8
    Alfredo Amezaga 84 2 0 0 2 42.0
    Jose Valentin 86 1 0 1 2 43.0
    Placido Polanco 132 2 1 0 3 44.0
    Esteban German 88 0 2 0 2 44.0
    Lance Niekro 44 1 0 0 1 44.0
    Marcus Giles 133 3 0 0 3 44.3
    Aramis Ramirez 135 2 1 0 3 45.0
    Pat Burrell 90 2 0 0 2 45.0
    Mike Napoli 45 1 0 0 1 45.0
    Emil Brown 136 2 1 0 3 45.3
    Mike Piazza 91 2 0 0 2 45.5
    Marco Scutaro 92 2 0 0 2 46.0
    Rickie Weeks 92 2 0 0 2 46.0
    Hank Blalock 141 3 0 0 3 47.0
    Ryan Garko 47 1 0 0 1 47.0
    Mike Sweeney 48 1 0 0 1 48.0
    Brian Giles 145 3 0 0 3 48.3
    Travis Hafner 98 2 0 0 2 49.0
    Ryan Shealy 49 1 0 0 1 49.0
    Reggie Abercrombie 49 1 0 0 1 49.0
    Yuniesky Betancourt 153 2 1 0 3 51.0
    Kevin Millar 102 2 0 0 2 51.0
    Justin Morneau 156 3 0 0 3 52.0
    Ben Broussard 104 1 1 0 2 52.0
    Andy Phillips 52 1 0 0 1 52.0
    David Ortiz 106 2 0 0 2 53.0
    Jose Lopez 160 3 0 0 3 53.3
    Andruw Jones 107 2 0 0 2 53.5
    Russell Martin 107 1 1 0 2 53.5
    Kevin Mench 107 2 0 0 2 53.5
    Julio Lugo 109 1 1 0 2 54.5
    Craig Monroe 110 2 0 0 2 55.0
    Hanley Ramirez 168 2 1 0 3 56.0
    Brad Wilkerson 56 1 0 0 1 56.0
    Johnny Estrada 114 2 0 0 2 57.0
    Craig Biggio 114 1 1 0 2 57.0
    Brandon Fahey 57 0 1 0 1 57.0
    Henry Blanco 58 1 0 0 1 58.0
    Tony Graffanino 118 1 1 0 2 59.0
    Carl Everett 59 1 0 0 1 59.0
    Adam Kennedy 119 2 0 0 2 59.5
    J.D. Drew 120 1 0 1 2 60.0
    Pablo Ozuna 60 0 1 0 1 60.0
    Melky Cabrera 122 2 0 0 2 61.0
    Richie Sexson 122 2 0 0 2 61.0
    Brian McCann 123 2 0 0 2 61.5
    Ryan Howard 124 1 1 0 2 62.0
    Torii Hunter 124 2 0 0 2 62.0
    Kazuo Matsui 62 1 0 0 1 62.0
    Dioner Navarro 62 1 0 0 1 62.0
    Eric Hinske 62 1 0 0 1 62.0
    Marcus Thames 63 1 0 0 1 63.0
    Jose Vidro 127 2 0 0 2 63.5
    Jason Bay 128 2 0 0 2 64.0
    David Dellucci 64 1 0 0 1 64.0
    Freddy Sanchez 194 3 0 0 3 64.7
    Bobby Kielty 65 1 0 0 1 65.0
    Mark Kotsay 131 2 0 0 2 65.5
    Brandon Phillips 131 2 0 0 2 65.5
    Alex Rodriguez 131 2 0 0 2 65.5
    Shawn Green 132 2 0 0 2 66.0
    Scott Rolen 132 1 1 0 2 66.0
    Abraham Nunez 66 0 1 0 1 66.0
    Travis Lee 66 1 0 0 1 66.0
    Jhonny Peralta 133 2 0 0 2 66.5
    Mike Cuddyer 134 2 0 0 2 67.0
    Gregg Zaun 67 1 0 0 1 67.0
    Reed Johnson 135 2 0 0 2 67.5
    David DeJesus 137 2 0 0 2 68.5
    Rob Mackowiak 69 1 0 0 1 69.0
    Wes Helms 69 1 0 0 1 69.0
    Jeff Francoeur 140 2 0 0 2 70.0
    Cliff Floyd 70 1 0 0 1 70.0
    Greg Norton 70 1 0 0 1 70.0
    Adrian Beltre 141 2 0 0 2 70.5
    Jim Edmonds 71 1 0 0 1 71.0
    Mark Teixeira 144 1 1 0 2 72.0
    Willy Taveras 146 0 2 0 2 73.0
    Matt Stairs 73 1 0 0 1 73.0
    Josh Bard 74 1 0 0 1 74.0
    Brian N. Anderson 74 1 0 0 1 74.0
    Todd Helton 150 2 0 0 2 75.0
    Ivan Rodriguez 151 2 0 0 2 75.5
    Chone Figgins 152 1 1 0 2 76.0
    Magglio Ordonez 153 2 0 0 2 76.5
    Carlos Guillen 155 1 1 0 2 77.5
    Chris Duffy 78 1 0 0 1 78.0
    Javy Lopez 78 1 0 0 1 78.0
    John Buck 80 1 0 0 1 80.0
    Wilson Betemit 80 1 0 0 1 80.0
    Jay Gibbons 82 1 0 0 1 82.0
    Victor Martinez 165 2 0 0 2 82.5
    Luis Castillo 170 2 0 0 2 85.0
    Khalil Greene 86 1 0 0 1 86.0
    Frank Thomas 87 1 0 0 1 87.0
    Gary Matthews 175 1 1 0 2 87.5
    Jose Reyes 175 1 1 0 2 87.5
    Mark Loretta 176 2 0 0 2 88.0
    Matt Diaz 90 1 0 0 1 90.0
    Joey Gathright 90 1 0 0 1 90.0
    Chris Burke 92 1 0 0 1 92.0
    Casey Blake 94 1 0 0 1 94.0
    Trot Nixon 94 1 0 0 1 94.0
    Miguel Tejada 190 2 0 0 2 95.0
    Miguel Olivo 97 1 0 0 1 97.0
    Ty Wigginton 98 1 0 0 1 98.0
    Clint Barmes 98 0 1 0 1 98.0
    Carlos Beltran 99 0 1 0 1 99.0
    Mark Ellis 99 1 0 0 1 99.0
    Aubrey Huff 100 0 1 0 1 100.0
    Cory Sullivan 101 1 0 0 1 101.0
    Brian Schneider 101 1 0 0 1 101.0
    Carlos Delgado 101 1 0 0 1 101.0
    Angel Berroa 102 1 0 0 1 102.0
    Mike Jacobs 103 1 0 0 1 103.0
    Endy Chavez 104 1 0 0 1 104.0
    Brady Clark 105 0 1 0 1 105.0
    Nick Swisher 106 1 0 0 1 106.0
    Ichiro Suzuki 215 2 0 0 2 107.5
    Adam LaRoche 108 1 0 0 1 108.0
    Manny Ramirez 109 1 0 0 1 109.0
    Andre Ethier 111 1 0 0 1 111.0
    Corey Patterson 112 1 0 0 1 112.0
    Josh Willingham 113 1 0 0 1 113.0
    Todd Walker 114 1 0 0 1 114.0
    Xavier Nady 114 1 0 0 1 114.0
    Geoff Jenkins 114 0 1 0 1 114.0
    Preston Wilson 115 1 0 0 1 115.0
    Royce Clayton 115 1 0 0 1 115.0
    Juan Rivera 116 1 0 0 1 116.0
    Jose Castillo 117 1 0 0 1 117.0
    Adam Everett 117 1 0 0 1 117.0
    Austin Kearns 118 1 0 0 1 118.0
    Alexis Rios 119 1 0 0 1 119.0
    Ray Durham 120 1 0 0 1 120.0
    Jeff Conine 121 1 0 0 1 121.0
    Nomar Garciaparra 122 1 0 0 1 122.0
    Eric Byrnes 124 0 1 0 1 124.0
    Frank Catalanotto 124 1 0 0 1 124.0
    Joe Crede 124 1 0 0 1 124.0
    Brad Hawpe 124 0 1 0 1 124.0
    Jacque Jones 125 1 0 0 1 125.0
    Jermaine Dye 126 1 0 0 1 126.0
    Conor Jackson 126 1 0 0 1 126.0
    Shea Hillenbrand 126 1 0 0 1 126.0
    Kenji Johjima 129 1 0 0 1 129.0
    Ronny Paulino 131 1 0 0 1 131.0
    Nick Punto 132 1 0 0 1 132.0
    A.J. Pierzynski 134 1 0 0 1 134.0
    Scott Podsednik 134 0 1 0 1 134.0
    Garret Anderson 135 1 0 0 1 135.0
    Ron Belliard 135 1 0 0 1 135.0
    Mark DeRosa 141 1 0 0 1 141.0
    Mike Lowell 143 0 1 0 1 143.0
    Dave Roberts 144 1 0 0 1 144.0
    Luis Gonzalez 144 1 0 0 1 144.0
    Dan Uggla 145 0 0 1 1 145.0
    Kevin Youkilis 146 1 0 0 1 146.0
    Bobby Abreu 148 1 0 0 1 148.0
    Adrian Gonzalez 149 1 0 0 1 149.0
    Brian Roberts 151 0 1 0 1 151.0
    Orlando Hudson 151 1 0 0 1 151.0
    Vernon Wells 153 1 0 0 1 153.0
    Aaron Hill 153 1 0 0 1 153.0
    David Wright 155 1 0 0 1 155.0
    Paul Lo Duca 158 0 1 0 1 158.0
    Lyle Overbay 159 1 0 0 1 159.0
    Grady Sizemore 162 0 1 0 1 162.0
    Matt Holliday 162 1 0 0 1 162.0
    Carl Crawford 165 0 0 1 1 165.0
    Jimmy Rollins 166 1 0 0 1 166.0
    Miguel Cabrera 169 1 0 0 1 169.0
    Chase Utley 171 1 0 0 1 171.0
    Rafael Furcal 181 0 1 0 1 181.0
    Derek Jeter 200 1 0 0 1 200.0

    It should also be noted that Michael Young through 2006 had 1,010 singles, doubles, and triples and had yet to be thrown out stretching a hit.
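
    For anyone who wants to reproduce the Rate column, here is a minimal sketch of the arithmetic: the rate is simply singles plus doubles plus triples divided by the total number of times thrown out stretching. The four rows are copied from the table above; the class and function names are my own, and the 100-hit cutoff mirrors the one used in the text.

```python
from typing import List, NamedTuple


class StretchLine(NamedTuple):
    name: str
    hits: int  # S+D+T: singles + doubles + triples
    bx2: int   # thrown out stretching a single into a double
    bx3: int   # thrown out stretching a double into a triple
    bxh: int   # thrown out stretching a triple into a home run

    @property
    def total_outs(self) -> int:
        return self.bx2 + self.bx3 + self.bxh

    @property
    def rate(self) -> float:
        # Hits per time thrown out; a lower number means caught more often.
        return self.hits / self.total_outs


# A few rows taken from the table above.
ROWS = [
    StretchLine("Hector Luna", 94, 2, 2, 0),
    StretchLine("Bengie Molina", 104, 4, 0, 0),
    StretchLine("Vladimir Guerrero", 167, 5, 1, 0),
    StretchLine("Derek Jeter", 200, 1, 0, 0),
]


def worst_rate(rows: List[StretchLine], min_hits: int = 100) -> StretchLine:
    """Return the qualifying player thrown out most often per hit."""
    qualified = [r for r in rows if r.hits >= min_hits and r.total_outs > 0]
    return min(qualified, key=lambda r: r.rate)


if __name__ == "__main__":
    for row in sorted(ROWS, key=lambda r: r.rate):
        print(f"{row.name:20s} {row.hits:4d} {row.total_outs:2d} {row.rate:6.1f}")
    print("Worst among 100+ hits:", worst_rate(ROWS).name)
```

    Note that a player who was never thrown out, such as Michael Young, has no defined rate under this formula, which is why the function skips anyone with zero outs.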