
Monday, July 16, 2007

Tip of the Iceberg

A few more links related to research with the new Gameday data.

  • It’s a Pitch-by-Pitch Scouting Report, Minus the Scout. This article by Dan Rosenheck appeared in the Keeping Score column in the New York Times over the weekend. He references a few of the columns I've written at BP in the following comments:

    Most studies have focused on classifying the characteristics of various pitches — Félix Hernández’s four-seam fastball is usually thrown between 94 and 97 miles an hour and breaks around 8 inches toward a right-handed batter — and using them to generate profiles of pitchers (he only throws his changeup 3 percent of the time versus right-handed hitters).

    Some work has also been done on identifying batters’ tendencies: Iván Rodríguez swings at nearly 60 percent of pitches thrown to him out of the strike zone, and Juan Pierre makes contact with 92 percent of the balls out of the zone he swings at, for example.

    And in talking with Dan as he prepared the piece we discussed the fact that this data provides quantification to concepts that are already well understood in terms of advanced scouting. As Dan says:

    “Will chase curveballs low and away” will become “swung and missed at 73 percent of pitches thrown under 83 m.p.h. with a vertical break of at least 12 inches on two-strike counts on the outer third of the plate.”

    “Slider lacks bite” could be replaced by “slider begins to break 30 feet from home plate.”

    However, it should be noted that pitches aside from the knuckleball do not have early or late break as implied by his comments on sliders and instead break in a uniform way as they travel from the pitcher's hand to home plate.

    Two of the aspects we discussed that I find particularly interesting he described this way:

    The data could be used to evaluate prospects, by answering questions like, “Will he ever learn to lay off a breaking ball?” or to better understand park effects, by revealing just how much movement a particular pitcher could expect to lose from his slider at Coors Field.

    By quantifying the characteristics of pitches and building up a historical record we'll be able to ask questions related to age and development across pitch profiles (velocity, trajectory, location, and spin). So for example, it may turn out that certain types of hitters have trouble with certain pitch profiles but that they tend to learn to recognize and lay off the pitch or put it into play with greater success as they age or gain experience. There may be other types of hitters for which this is not true and having the data will at least allow us to ask the question. Of course with historical data the mirror questions can be asked of pitchers as well.

    In addition, I think we're learning that there are discernible differences in how pitches behave under the different conditions in various parks. PETCO Park, for example, with its heavier sea air, both causes pitches to decelerate more and allows for greater break on spinning pitches. Understanding just what those effects are may allow us to create "pitch profile park effects" that more accurately enable us to predict how a pitcher might fare in a different environment. I've written a bit on this subject already and have been working with Alan Nathan, a physicist at the University of Illinois and head of SABR's Science of Baseball committee, on this very question, and I should have some things to share in the near future.

    Finally, Dan goes on to say:

    But the recent findings represent a tiny fraction of the research that the data will ultimately make possible. Eventually, a large portion of the tasks now done by major league scouts — visually evaluating strengths, weaknesses and trends — will be measured numerically.

    While I agree that at the present time we're touching the tip of the proverbial iceberg, I would simply caution that the ability of researchers to ask these questions hinges on two very important conditions. First, as Dan says, the data needs to continue to be made available in some form, be it subscription based or free. And second, researchers need to understand the limitations of the system, not only in terms of accuracy but also in terms of variance between ballparks and how the system is being tweaked to provide more accurate data. For example, the initial point at which pitches are tracked was changed in early June from 55 feet and then experimented with for the rest of the month, settled at 50 feet in early July, and is now fluctuating once again in an effort to increase accuracy.

    And while I also agree that there are many aspects here that will be quantified and overlap with traditional scouting, it will always be the case that these tools complement and do not in any sense replace what scouts do. Not only will systems like this not be available in the amateur and minor league circuits for quite some time (not to mention bullpens, as Dan mentions), they will be used to augment understanding already gained from traditional methods. For example, in terms of its relationship with biomechanics analysis like that done by Will Carroll, this system starts after the release point and therefore after everything from tempo to leg kick to balance to arm slot has already taken place.


  • Under Pressure. Joe P. Sheehan at Baseball Analysts looks at the relation of pitch types to Leverage - something that had not occurred to me. While it's certainly interesting and he shows, for example, that Jake Peavy relies more on his slider than his fastball in pressure situations, I think you'd also have to normalize the data for the base/out situation and the handedness of the batter. It could be that Peavy relies more on his slider in pressure situations simply because he relies more on it with runners on base, and plate appearances with runners on also happen to carry higher Leverage indexes.
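
    The normalization suggested for the leverage analysis can be sketched roughly as follows. This is only an illustration under stated assumptions: the pitch records are invented (they are not Gameday data or Peavy's actual pitches), and the function names and the two-level "high leverage" flag are hypothetical simplifications. The idea is to compare slider share by leverage within each base state and batter handedness group rather than across all pitches at once.

```python
from collections import defaultdict

# Hypothetical strata: (base state, batter handedness). Invented for illustration.
ON = ("runners on", "RHB")
EMPTY = ("bases empty", "RHB")

# Invented records: (pitch_type, high_leverage, stratum). Not real Gameday data.
pitches = (
    [("SL", True, ON)] * 3 + [("FF", True, ON)] * 3
    + [("SL", False, ON)] * 1 + [("FF", False, ON)] * 1
    + [("SL", True, EMPTY)] * 1 + [("FF", True, EMPTY)] * 3
    + [("SL", False, EMPTY)] * 2 + [("FF", False, EMPTY)] * 6
)

def slider_share(rows):
    """Fraction of pitches in rows that are sliders, or None if rows is empty."""
    if not rows:
        return None
    return sum(1 for r in rows if r[0] == "SL") / len(rows)

def share_by_leverage(rows):
    """Slider share keyed by the high-leverage flag (True / False)."""
    return {lev: slider_share([r for r in rows if r[1] == lev])
            for lev in (True, False)}

def share_by_stratum(rows):
    """Slider share by leverage, computed separately within each stratum."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[2]].append(r)
    return {stratum: share_by_leverage(g) for stratum, g in groups.items()}

raw = share_by_leverage(pitches)         # pooled over all situations
stratified = share_by_stratum(pitches)   # within each base state / handedness
```

    In this made-up sample the pooled slider share is higher in high-leverage spots (0.4 versus 0.3), yet within each stratum the share is identical at both leverage levels. The apparent leverage effect comes entirely from high-leverage pitches being thrown more often with runners on, which is the kind of confounding described above.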


  • Strike Zone: Fact vs. Fiction. John Walsh totally steals my thunder by examining the actual dimensions of the strike zone as it is called by major league umpires. What I find interesting is that he notes that right-handed hitters end up having to defend a strike zone that is slightly larger while I've found that left-handers are getting 10% more strikes called against them on pitches out of the strike zone. In looking at John's data I think the reason for this is that left-handers have to defend more territory on the outside part of the plate and pitchers concentrate on this area throwing a disproportionate number of their pitches in that region.


  • Another look at the sinker. Louis Chao at THT looks at contact rates by pitch types and finds, a little surprisingly, that sinkers have higher contact rates than fastballs. My take is that sinkers drop more in accordance with what the hitter is expecting and so they're able to put the bat on the ball albeit typically driving it into the ground. Four-seam fastballs, on the other hand, do not drop as much as would be expected and so batters swing under them. This is supported by the fact that a four-seamer typically drops 10-15 inches less than the theoretical reference pitch while a sinker drops only 2 to 7 inches less.
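
    To put the drop figures above in rough perspective, here is a back-of-the-envelope sketch. It rests on several assumptions that are not from the article: the "reference pitch" is taken to be one that falls under gravity alone, drag is ignored, the speed is held constant at 90 mph, and the flight distance is 50 feet. The function names are hypothetical, and real tracked values will differ.

```python
G_FT_PER_S2 = 32.174  # standard gravity in ft/s^2

def gravity_only_drop_inches(distance_ft, speed_mph):
    """Drop of a pitch that falls under gravity alone (no drag, constant speed)."""
    speed_ft_per_s = speed_mph * 5280 / 3600
    flight_time_s = distance_ft / speed_ft_per_s
    return 0.5 * G_FT_PER_S2 * flight_time_s ** 2 * 12  # feet to inches

def actual_drop_range(reference_drop_in, less_min_in, less_max_in):
    """(low, high) drop in inches for a pitch that drops less than the reference."""
    return (reference_drop_in - less_max_in, reference_drop_in - less_min_in)

# Assumed scenario: 90 mph over 50 feet of flight.
reference = gravity_only_drop_inches(50, 90)

# Ranges quoted in the text: a four-seamer drops 10-15 inches less than the
# reference pitch, a sinker 2 to 7 inches less.
four_seam = actual_drop_range(reference, 10, 15)
sinker = actual_drop_range(reference, 2, 7)

print(f"reference drop: {reference:.1f} in")
print(f"four-seam drop: {four_seam[0]:.1f} to {four_seam[1]:.1f} in")
print(f"sinker drop:    {sinker[0]:.1f} to {sinker[1]:.1f} in")
```

    Under these simplified assumptions the reference pitch drops a little under 28 inches, so the sinker ends up close to the gravity-only path while the four-seamer finishes well above it, which is consistent with the idea that batters swing under four-seamers.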