Wednesday, August 15, 2007

A Sabermetric Cambrian Explosion

Several folks have alerted me to this article by Nate DiMeo that talks a bit about PITCHf/x and its promise. What I like about it is that it does a nice job of showing the range of analysis that has already been done and links to some of those articles (I also like the quote he used from this column).

I've written 11 articles on the subject, listed below, with a couple more already in the works:

  • Schrodinger's Bat: Putting the Pedal to the Metal - August 16

  • Schrodinger's Bat: Calling the Balls and Strikes - July 26

  • Schrodinger's Bat: Searching for the Gyroball - July 5

  • Schrodinger's Bat: Playing Favorites - June 28

  • Schrodinger's Bat: Gameday Meets the Knuckleball - June 21

  • Schrodinger's Bat: The Science and Art of Building a Better Pitcher Profile - June 14

  • Schrodinger's Bat: Gameday Triple Play - June 7

  • Schrodinger's Bat: Physics on Display - May 31

  • Schrodinger's Bat: Batter Versus Pitcher, Gameday Style - May 4

  • Schrodinger's Bat: Phil Hughes, Pitch by Pitch - May 10

  • Schrodinger's Bat: The Information Revolution - October 26, 2006

    In addition, Dr. Nathan has created a wonderful page that not only has the most complete data dictionary for the PITCHf/x data but also includes a paper he wrote detailing his own analysis of pitch classification based on spin and axis of rotation, parameters he derives using a sophisticated model.

    Although it may be difficult to detect, one of my goals in researching and writing about the PITCHf/x data this season has been to explore as many avenues of analysis as possible in this early stage, when the system is still being tweaked and the data is incomplete. By doing so we can begin to see which of those ideas for analysis are useful and should be developed further, and help spur new ideas by other researchers. This is analogous to one of my favorite intellectual ideas, the inverted cone of diversity, of which Stephen Jay Gould was also fond and which I used in an earlier column to help illuminate the evolving way in which players have been used throughout baseball history. From that earlier column, the idea is briefly this:

    In 1909 Charles Doolittle Walcott discovered a treasure trove of wonderfully unique fossils preserved in a layer of shale near the town of Field in British Columbia, specimens that would become known simply as the Burgess Shale. While Walcott placed his specimens in familiar phyla that were known to exist during the period (Middle Cambrian, 505 million years ago), it was a reinvestigation by Harry Blackmore Whittington, Derek Briggs, and Simon Conway Morris of the University of Cambridge in the 1980s that upended that traditional interpretation of the fossils' place in the evolution of life. By inverting the familiar iconography of the cone of increasing diversity in life forms, Whittington, Briggs, and Conway Morris reinterpreted the Burgess Shale as replete with creatures in phyla that are now extinct. In other words, rather than life becoming increasingly diverse in terms of its basic body plans over successive geologic periods, the Burgess Shale records an initial flowering of experimentation in structures just after the dawn of multicellular animal life, before a later decimation or winnowing into the few surviving phyla we see today. Stephen Jay Gould devoted his 1989 book Wonderful Life: The Burgess Shale and the Nature of History to this theme, as an illustration akin to his theory of punctuated equilibrium.

    So with the introduction of PITCHf/x we're in our own kind of sabermetric Cambrian explosion where ideas are flowering and we're looking for those that survive the selection pressures that prevail.

    Where the analogy breaks down, however, is that unlike body plans, which are almost fully constrained by what went before, ideas never are. So while many of the paths that we'll subsequently travel will emerge in the near future, there will always be some, albeit a decreasing number, that are novel and could therefore fundamentally change the way we look at this data.

    Updated 8/16/2007: Added new article on pitch speeds with runners on base.

