Saturday, September 11, 2004

The Quiet Revolution?

With the publication of Moneyball and The Numbers Game, sabermetric knowledge (see this post for some of the core conclusions) and research methods have begun to spread to a broader audience than the relatively small number of baseball analysts and stat geeks who have long championed them. Nate Silver, in his column on Baseball Prospectus, discusses the present and future of sabermetrics and notes its increasing visibility in front offices and the media. However, Silver also provides the following return to reality:

“Does that mean that the battle has been won? Hardly; it is likely to take a while, perhaps a whole generation, before analysis crosses the chasm between the early adopters and the mainstream. Baseball executives, with some notable exceptions, are older men with backgrounds in scouting and player development. Beat writers, by and large, are a worn lot and neither particularly well-versed in analysis nor particularly interested in learning about it. While the Internet and other forms of new media provide fans with greater discretion in just how they take their baseball, there is also an increasing tendency within the media (this means you, ESPN) to take a Paparazzi-like approach toward their sports coverage…It will be a quiet revolution, and those tend to be slower than the bloody ones.”

Silver goes on to predict that sabermetrics will emerge into an “analytical post-modernism” characterized by the acknowledgment that some questions are out of reach and by the realization of the limits of analysis. He hopes that a second trend in post-modernism, what he calls “navel-gazing,” or the tendency for individuals to be more concerned with their own position and reputation in the movement, can be avoided. He also warns that:

“Movements, whether social, political or intellectual, tend to be unified when they have a lot of work to do and when there is a lot to accomplish, and fractionalized when there is not. If sabermetric thinkers come to believe prematurely that their mission has been accomplished, the infighting is likely to increase, and the movement could set its progress back.”

To the question of what remains, Silver points readers to Keith Woolner’s Baseball Prospectus 2000 article “Baseball’s Hilbert Problems,” in which Woolner, like the mathematician David Hilbert in 1900, lays out 23 major areas of study for the coming generations. Even so, Silver concludes that:

“It is also probably true that the pace of discovery within sabermetric circles will slow as more and more data is analyzed and more and more conclusions have been proclaimed. Baseball, while a wonderfully complex game, is nevertheless a closed system, and the returns on further research efforts are likely to diminish.”

Great stuff.

I agree with Silver that sabermetric knowledge is making its way into front offices and media outlets not as a result of the success of risk-taking early adopters such as Sandy Alderson (who discovered the importance of on-base percentage through the work of Eric Walker in The Sinister First Baseman), but primarily as a result of the old guard moving on and a younger generation (Epstein, DePodesta, et al.), raised on The Baseball Abstract and The Hidden Game of Baseball, getting an opportunity to apply its theories. This is akin to how scientific revolutions finally take hold according to Thomas Kuhn's The Structure of Scientific Revolutions, in which Kuhn quotes Max Planck's observation that:

"a new scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die, and a new generation grows up that is familiar with it."

In the long run, this will kill the movement in its current incarnation as a collection of outsiders, making the very term "sabermetrics" passé as it becomes the new orthodoxy.

As for Silver’s view that sabermetrics in the near term (at least I think he meant the near term) may be emerging into an analytical post-modernism, I’m not sure I agree with that characterization. I would hesitate to call it post-modernism because, to me, PM’s main tenet is relativism, something sabermetrics certainly eschews with its belief that there are definitive answers to questions given the correct data. I do, however, agree that as sabermetrics matures it will (whether it expects to or not) collide with areas that may be difficult to analyze using a reductionist approach. These are areas that, as Stephen Jay Gould said of human attributes such as religion and “moral sense,” are the product of emergence (the whole is not the sum of its parts).

Silver’s final point above echoes a sentiment expressed on another analyst’s blog I recently read, whose author lamented that the pace of sabermetric breakthroughs seems to have slowed to a crawl in recent years. One commenter on the post suggested that sabermetrics, because it was finally moving beyond its small circle into a wider world, was entering a period of consolidation, explication, and refinement of its main research conclusions from the past quarter century. I think both Silver’s and the commenter’s points are correct, to an extent.

Baseball largely is what it is (its inherent structure - 27 outs, 9 fielders, 90' and 60' 6" - will remain the same), and so analysis of the game can only proceed along a finite number of lines, forcing diminishing returns as research progresses. Much of the low-hanging fruit has probably been picked. For example, the conclusion that avoiding outs, and therefore OBP, is important while batting average is relatively unimportant is not likely to be reversed.
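To make the OBP point concrete, here is a minimal Python sketch using the standard definitions of batting average (H / AB) and on-base percentage ((H + BB + HBP) / (AB + BB + HBP + SF)). The two stat lines are invented for illustration, not real players, and the function names are hypothetical.

```python
def batting_average(hits, at_bats):
    """Batting average: hits divided by at-bats."""
    return hits / at_bats if at_bats else 0.0


def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """On-base percentage: times on base divided by the plate
    appearances that count toward OBP (AB + BB + HBP + SF)."""
    opportunities = at_bats + walks + hit_by_pitch + sac_flies
    if opportunities == 0:
        return 0.0
    return (hits + walks + hit_by_pitch) / opportunities


# Two made-up season lines (assumptions for illustration only).
contact_hitter = dict(hits=150, walks=30, hit_by_pitch=2, at_bats=500, sac_flies=4)
patient_hitter = dict(hits=135, walks=90, hit_by_pitch=5, at_bats=500, sac_flies=5)

for name, line in [("contact", contact_hitter), ("patient", patient_hitter)]:
    avg = batting_average(line["hits"], line["at_bats"])
    obp = on_base_percentage(**line)
    print(f"{name}: AVG {avg:.3f}  OBP {obp:.3f}")
```

The hitter with the lower batting average makes outs less often once walks and hit-by-pitches are counted, which is the kind of gap that batting average alone hides.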

Coupled with sabermetrics’ increased visibility, this means that job one for the community is now to preach the message to the uninitiated and to solidify core conclusions, helping sabermetrics attain the status of the new orthodoxy. To illustrate, consider just one example of the lack of awareness in the larger baseball community: the basic conclusions drawn from play-by-play data, preached in The Hidden Game of Baseball and encapsulated in a tool like my MLB Pocket Manager calculator. The basic probabilities and the magnitude of the trade-offs involved are currently beyond the view of the average professional manager or ballplayer, let alone the average fan.

At the same time, baseball, like all things, evolves. In that sense I don’t think baseball is a totally closed system, and so there will always be new questions to answer and new angles to research (more refined and perhaps smaller in scope, but there nonetheless). Those new angles will present themselves because styles of play and strategies (and, to a lesser extent, rules) change over time. And they will encompass not only new ways of looking at existing data, such as DIPS and play-by-play, but genuinely new data. One of the most exciting parts of The Numbers Game is its look ahead at how new categories of information related to both offense and defense will be captured. And you can bet that the sabermetric community will be there to analyze it.

What this discussion leads me to conclude is that the sabermetric community needs to do two things going forward:

1) Solidify its core conclusions while making them presentable to a wider audience
2) Strive to address issues such as those raised by Keith Woolner in addition to those raised by new data collection techniques

To do both of these, I think the current community needs a clearinghouse of sorts where studies, research, techniques, and discussions can be shared in an open environment. Being a software developer, I immediately think of a community web site. So my idea is to create a site analogous to the 123ASPX site created by a friend of mine for collecting resources on Microsoft ASP.NET development. To that end, I’m currently working on an initial design for such a site that includes:

  • Links to sabermetric studies on the web
  • Links to research sites where researchers can find statistics
  • Links to sabermetrically related sites
  • Links to sabermetrics in the news
  • Original articles
  • A glossary of statistical formulas
  • Book reviews
  • Links to sabermetrically related blogs, possibly including their RSS feeds
  • Discussion boards
  • A schedule of events where sabermetrics is discussed

The site would be open to contributions from registered users and would be moderated. I’d love to hear what you think, including a name for the site.

