Showing posts with label MLBAM. Show all posts

Friday, June 13, 2008

Testing an Old Adage...Again

Mike Fast has a great piece up over on The Hardball Times researching the correlation between working quickly and effectiveness. While a study I did on Baseball Prospectus last May used average game time and was more historical in that it went back to 1970, Mike uses the time stamps that MLBAM is providing in its Pitchf/x data for 2008.

What Mike found largely corresponds to what I concluded, namely that there doesn't appear to be any relationship between defensive support as measured by defensive efficiency (DER) and BABIP and time between pitches for team or individual pitchers measured relative to their teams (although I was using unearned runs instead of DER as I should have).

He does find, however, that when looking at BABIP in terms of the number of seconds that elapsed since the previous pitch, the BABIP is lower for pitches thrown within 10 seconds and higher for pitches thrown in excess of 50 seconds since the previous pitch (he does throw out pitches that came in a minute or more after the previous pitch). As Mike notes, there are other factors to control for, not the least of which are hit type (line drive, fly ball, ground ball, popup) and pitcher quality and hitter quality. Still, it's pretty interesting stuff and just one of the many applications of Pitchf/x data.
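As a rough illustration of the kind of bucketing behind that comparison, here is a minimal Python sketch. The function name, the input layout, and the 10-second bucket edges are assumptions made for illustration; they are not taken from Mike's article.

```python
from collections import defaultdict

def babip_by_gap(balls_in_play, edges=(10, 20, 30, 40, 50, 60)):
    """Group balls in play by seconds elapsed since the previous pitch.

    balls_in_play: iterable of (gap_seconds, is_hit) pairs, one per ball
    put in play (home runs already excluded, as BABIP requires).
    Gaps of 60 seconds or more fall into no bucket and are dropped.
    Returns {bucket_upper_edge: BABIP} for every non-empty bucket.
    """
    hits = defaultdict(int)
    total = defaultdict(int)
    for gap, is_hit in balls_in_play:
        for edge in edges:
            if gap < edge:
                total[edge] += 1
                hits[edge] += int(is_hit)
                break
    return {edge: hits[edge] / total[edge] for edge in edges if total[edge]}
```

A real study would also need the controls Mike mentions (hit type, pitcher quality, hitter quality), which this sketch leaves out.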

Wednesday, April 09, 2008

Santana and the Changeup

After a little exchange, Ken Davidoff at Newsday wrote a little about Johan Santana and his reliance on his changeup in his column on April 6th. The relevant section reads like this:

According to Fox, MLB.com charted 11 of Santana's outings last year, including his relief effort in the All-Star Game. Of 1,033 pitches, Santana threw 61 percent fastballs, 27 percent changeups and 12 percent sliders, which comes close to Bill James' full-season tally (58-29-11). Not surprisingly, Santana used the changeup far more against righty hitters (he threw it 33 percent of the time) than lefties (7 percent).

Just as Morris suggested, Santana does love using the pitch for strikeouts. Of 269 situations that Fox charted in which Santana had a hitter at 0-and-2, 1-and-2 or 2-and-2, he threw the changeup 123 times. Of the 86 strikeouts Fox witnessed, the changeup produced 53 of them.

Interestingly, of the 11 home runs Santana surrendered on non-full, two-strike counts on Fox's watch, just two came on the changeup, with eight from fastballs and one off a slider.

It turns out that Mike Fast did a nice analysis of Santana back in January and, as you would imagine, found essentially the same thing, albeit in much more detail. On a start-by-start basis, the mix of Santana's pitches in 10 of his starts and his All-Star appearance last season can be seen below.



From this it is not apparent that he increasingly used his changeup as the season wore on and in fact it shows a trend where he used his fastball a bit more as the season progressed.

Tuesday, December 11, 2007

PITCHf/x Musings

Many of the columns I wrote discussing the PITCHf/x data made available through MLB.com's Gameday system are now available sans subscription on Baseball Prospectus. Those articles are:

  • October 25, 2007.
    Schrodinger's Bat: Free Stuff and the Men in Blue.
    Postseason umpiring and an early holiday present for our readers.


  • October 11, 2007.
    Schrodinger's Bat: On Atmosphere, Probability, and Prediction.
    Ranging across a couple of old and new themes, explaining that there's something about the weather, and Pythagoras can rock steady.


  • August 23, 2007.
    Schrodinger's Bat: Visualizing Pitches.
    After digging through this data, you'll no longer wonder why they say hitting is the hardest thing to do in sports.


  • August 16, 2007.
    Schrodinger's Bat: Putting the Pedal to the Metal.
    What happens when pitching in a pinch? Do pitchers have something extra that they can put on the ball when they're in a jam?


  • July 26, 2007.
    Schrodinger's Bat: Calling the Balls and Strikes.
    A look at umpire tendencies to see how much human error plays a role in calling pitches.


  • July 5, 2007.
    Schrodinger's Bat: Searching for the Gyroball.
    Is it there, or isn't it? Dan dives into Dice-K's data to find out.


  • June 28, 2007.
    Schrodinger's Bat: Playing Favorites.
    Parsing the data can help us address questions of bias among umpires in calling balls and strikes.


  • June 21, 2007.
    Schrodinger's Bat: Gameday Meets the Knuckleball.
    Dan continues his series using pitch data by examining the case of Tim Wakefield.


  • June 14, 2007.
    Schrodinger's Bat: The Science and Art of Building a Better Pitcher Profile.
    Popping the hood on King Felix as a demonstration of what's possible with PITCHf/x data


  • June 7, 2007.
    Schrodinger's Bat: Gameday Triple Play.
    How different ballparks affect velocity, whether pitchers use the fastball more early in games, and the challenge of quantifying plate discipline.


  • May 31, 2007.
    Schrodinger's Bat: Physics on Display.
    Further adventures in pitch-by-pitch data.


  • May 24, 2007.
    Schrodinger's Bat: Batter Versus Pitcher, Gameday Style.
    Evaluating the strike zone, the umpires, and some large-scale issues with a tremendous new tool.


  • May 10, 2007.
    Schrodinger's Bat: Phil Hughes, Pitch by Pitch.
    Dan uses MLBAM data to reconstruct the no-hitter that wasn't.
    Wednesday, October 03, 2007

    A Little More PITCHf/x

    Here are a couple more PITCHf/x articles:

  • Joe P. Sheehan uses a similar format to the one I introduced in order to look at Jake Peavy.


  • MLB.com discusses the uses and technology behind the system.
    Thursday, September 27, 2007

    A Pie and Fish

    Today in my column on Baseball Prospectus I answered a couple reader questions in relation to topics from previous weeks. The first question revolved around the repeatability of the baserunning metrics I've developed while the second looked at the Fish, Eye, Square, and Badball metrics using PITCHf/x data from the pitcher's rather than the hitter's perspective. Enjoy.

    Wednesday, September 12, 2007

    Gameday Video


    There is a very nice video that explains a bit about the PITCHf/x system over on the Gameday blog. In other PITCHf/x news Joe P. Sheehan has another nice article on sinkers over at Baseball Analysts and tomorrow my column will take another look at plate discipline. And of course anyone interested in this topic should be keeping up with Mike Fast and the work he's doing over on his blog. In particular he beats me to the punch and uses the approximations given by Dr. Nathan to start calculating spin direction and spin rate. Very cool.

    Saturday, August 25, 2007

    Jimenez Looking Good

    Tonight Rockies rookie Ubaldo Jimenez turned in a good start for the third consecutive outing beating the Nationals here at Coors Field. I chronicled his arsenal over on the Rocky Mountain SABR site using PITCHf/x data.

    Update: Mike Fast and Sky Kalkman point out that the data used to plot the fastball was incorrect. I inadvertently used a positive rather than a negative vertical acceleration, which caused the pitch to appear to level out. I've since corrected the graphs in the article at RMSABR. My apologies.

    Thursday, August 23, 2007

    Visualization

    My column today on Baseball Prospectus deals with using PITCHf/x data to visualize the trajectory of pitches in much the same way as the actual Gameday application shows each pitch during the game. After discussing how this can be done and plotting a few individual pitches I then aggregate pitch types for a few individuals including Rich Hill, Barry Zito, Roy Halladay, and Derek Lowe to form a "visual pitch profile" that can be used for comparison. Finally, I look at the complete repertoire of Daisuke Matsuzaka.
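For readers who want to try this kind of plot, here is a minimal Python sketch of one way to reconstruct a trajectory. It assumes the constant-acceleration fit that PITCHf/x reports (initial position x0, y0, z0 in feet, initial velocity vx0, vy0, vz0, and accelerations ax, ay, az), with y measured from home plate toward the pitcher. The function names are made up for illustration, and this is not necessarily how the plots in the column were produced.

```python
import math

def pitch_position(t, x0, y0, z0, vx0, vy0, vz0, ax, ay, az):
    """Position (feet) t seconds after the initial tracked point,
    assuming constant acceleration on each axis."""
    x = x0 + vx0 * t + 0.5 * ax * t * t
    y = y0 + vy0 * t + 0.5 * ay * t * t
    z = z0 + vz0 * t + 0.5 * az * t * t
    return x, y, z

def time_to_plate(y0, vy0, ay, y_plate=17.0 / 12.0):
    """Smallest positive t (seconds) solving
    y0 + vy0*t + 0.5*ay*t**2 = y_plate (the front edge of home plate)."""
    a, b, c = 0.5 * ay, vy0, y0 - y_plate
    if a == 0:
        return -c / b
    root = math.sqrt(b * b - 4.0 * a * c)
    candidates = [(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)]
    return min(t for t in candidates if t > 0)
```

Sampling `pitch_position` at evenly spaced times between 0 and `time_to_plate(...)` gives the points to plot. Note that az already includes gravity and so should be negative; feeding in a positive value makes the pitch appear to level out or rise.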

    After the article was submitted for publication I learned that SABR member Mat Kovach has also been doing this kind of thing.

    Update: Just saw that Joe P. Sheehan had done something very similar last week at Baseball Analysts. You know what they say about great minds... :)

    Friday, August 17, 2007

    Umpires and QuesTec

    Several readers have been asking about the recent study that was reported to show umpire bias by race known as the Hamermesh study. Phil Birnbaum and Mitchel Lichtman have been doing great work in that regard already so I have little to add other than providing a few links for those interested:

  • The original study


  • The Time Magazine piece


  • Phil's first take - he questions the author's findings of statistical significance by examining the core table (table 2) from the original study


  • Phil's follow-up - where he concludes that perhaps, and at most, 1 in 700 pitches is biased


  • And even more by Phil - here he uses several different tests of significance and it appears there is no racial bias


  • MGL's own study - here he uses a much simpler approach and comes to the tentative conclusion that there are not racial differences that are statistically significant. Update on 8/19: MGL posted some updates to his study here and here and comes to the opposite conclusion. He also notes there is a good discussion of the study at The Sports Economist.


    One of the side topics that have arisen here is the effect of QuesTec on called strikes. The authors of the Hamermesh study found that for both white and minority pitchers, in non-QuesTec parks pitchers received a higher percentage of strikes when the race of the pitcher and umpire matched than they did in QuesTec parks. White pitchers did not experience this difference when the umpire was non-white although minority pitchers still did.

    This provides an opportunity to look at the PITCHf/x data from this season in QuesTec and non-QuesTec parks to get a more granular feel for what the overall difference might be. While we have data for only 9 of the 11 parks where QuesTec is installed, we still end up with almost 35,000 pitches in QuesTec parks and 63,000 in non-QuesTec parks to analyze. When we do so by comparing the location of the pitch to the strike zone (defined by the PITCHf/x operator for each plate appearance) and give the umpires a 1 inch buffer zone to correspond with the limits of the system, we find the following:


    Park         Pitches    CS%    CB%  Agree%
    QuesTec        34427  .8252  .9433   .8790
    Non-QuesTec    62862  .8052  .9488   .8772


    By way of explanation CS% is the called strike percentage defined as the percentage of actual pitches in the strike zone that were actually called strikes. CB% is the called ball percentage defined as the percentage of pitches that were actually out of the strike zone that were called balls and Agree% is the overall percentage of pitches on which PITCHf/x (given the buffer zone) and the umpire agreed.
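To make those definitions concrete, here is a minimal Python sketch of how the three rates could be computed from called pitches. The field names px, pz, sz_top, and sz_bot follow the PITCHf/x data (feet, with px measured from the center of the plate); the `call` key and the function name are hypothetical. How the buffer is applied is also an assumption: a pitch within the buffer of the zone edge is counted on whichever side the umpire called it, and the radius of the ball is ignored.

```python
PLATE_HALF_WIDTH_FT = 17.0 / 12.0 / 2.0  # home plate is 17 inches wide

def call_rates(pitches, buffer_ft=1.0 / 12.0):
    """pitches: iterable of dicts with px, pz, sz_top, sz_bot (feet) and
    call ('S' for called strike, 'B' for called ball).
    Returns CS%, CB% and Agree% as fractions (None if undefined)."""
    in_zone = in_zone_strikes = out_zone = out_zone_balls = 0
    for p in pitches:
        called_strike = p["call"] == "S"
        # Signed distance outside the zone; negative means inside it.
        off_side = abs(p["px"]) - PLATE_HALF_WIDTH_FT
        off_height = max(p["sz_bot"] - p["pz"], p["pz"] - p["sz_top"])
        miss = max(off_side, off_height)
        # Assumed buffer rule: borderline pitches side with the umpire.
        if called_strike:
            inside = miss <= buffer_ft
        else:
            inside = miss < -buffer_ft
        if inside:
            in_zone += 1
            in_zone_strikes += called_strike
        else:
            out_zone += 1
            out_zone_balls += not called_strike
    total = in_zone + out_zone
    return {
        "cs": in_zone_strikes / in_zone if in_zone else None,
        "cb": out_zone_balls / out_zone if out_zone else None,
        "agree": (in_zone_strikes + out_zone_balls) / total if total else None,
    }
```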

    By simply examining the confidence intervals it appears that umpires do indeed call more pitches in the zone strikes at QuesTec parks than at non-QuesTec parks. The difference is statistically significant at .05 and amounts to 1 pitch in 50. However, at QuesTec parks umpires don't do as well at identifying balls and end up calling more of them strikes to the tune of 1 in 180 pitches. This result too is statistically significant at .05, indicating that perhaps the biggest effect of QuesTec is simply to call more strikes.

    Because the factors are working in opposite directions when we add them up the Agree% fails to meet the .05 test. Overall then, if we attribute the entire difference to whether the umpire is in a QuesTec park or not we're talking about a difference of 1 pitch in 550. Of course there may be other factors at work here including the calibration of the system at particular parks that may play a role which I haven't examined.
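The "1 pitch in N" figures follow directly from the differences in the table, and the Agree% comparison can be checked with a standard two-proportion z-test. The Python sketch below is only an illustration: the analysis above compared confidence intervals rather than running this test, and the function names are made up. Only Agree% can be tested this way from the table, since the counts of in-zone and out-of-zone pitches behind CS% and CB% are not listed.

```python
import math

def one_in_n(rate_a, rate_b):
    """Express the gap between two rates as '1 pitch in N'."""
    return 1.0 / abs(rate_a - rate_b)

def two_proportion_z(p1, n1, p2, n2):
    """Pooled two-proportion z statistic for rates p1 (of n1) and p2 (of n2)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    return (p1 - p2) / se

# Values copied from the QuesTec / non-QuesTec table.
cs_gap = one_in_n(0.8252, 0.8052)
cb_gap = one_in_n(0.9433, 0.9488)
agree_gap = one_in_n(0.8790, 0.8772)
agree_z = two_proportion_z(0.8790, 34427, 0.8772, 62862)
```

The gaps come out to roughly 1 in 50, 1 in 182, and 1 in 556, in line with the rounded figures in the text, and the z statistic for Agree% is about 0.8, well short of the 1.96 needed at the .05 level.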

    Wednesday, August 15, 2007

    A Sabermetric Cambrian Explosion

    Several folks have alerted me to this article by Nate DiMeo on Slate.com that talks a bit about PITCHf/x and its promise. What I like about it is that it does a nice job of showing the range of analysis that has already been done (and I like the quote he used as well from this column) and linking to some of those articles.

    I've written 10 articles on the subject with a couple more already in the works which include:

  • Schrodinger's Bat: Putting the Pedal to the Metal - August 16


  • Schrodinger's Bat: Calling the Balls and Strikes - July 26


  • Schrodinger's Bat: Searching for the Gyroball - July 5


  • Schrodinger's Bat: Playing Favorites - June 28


  • Schrodinger's Bat: Gameday Meets the Knuckleball - June 21


  • Schrodinger's Bat: The Science and Art of Building a Better Pitcher Profile - June 14


  • Schrodinger's Bat: Gameday Triple Play - June 7


  • Schrodinger's Bat: Physics on Display - May 31


  • Schrodinger's Bat: Batter Versus Pitcher, Gameday Style - May 24


  • Schrodinger's Bat: Phil Hughes, Pitch by Pitch - May 10


  • Schrodinger's Bat: The Information Revolution - October 26, 2006


  • In addition Dr. Nathan has created a wonderful page that not only has the most complete data dictionary for the PITCHf/x data but also includes a paper he wrote detailing his own analysis of pitch classification using derived parameters of axis of rotation and spin using a sophisticated model.

    Although it may be difficult to detect, one of my goals in researching and writing about the PITCHf/x data this season has been to explore as many avenues of analysis as possible in this early stage when the system is still being tweaked and the data is incomplete. By doing so we can begin to see which of those ideas for analysis are useful and should be developed further as well as to help spur new ideas by other researchers. This is analogous to one of my favorite intellectual ideas, that of the inverted cone of diversity that I also used to help illuminate the evolving way in which players have been used throughout baseball history, and that Stephen Jay Gould was also fond of. From that earlier column the idea is briefly this:

    In 1909 Charles Doolittle Walcott discovered a treasure trove of wonderfully unique fossils preserved in a layer of shale near the town of Field in British Columbia, specimens that would become known simply as the Burgess Shale. While Walcott placed his specimens in familiar phyla that were known to exist during the period (Middle Cambrian, 505 million years ago), it was a reinvestigation by Harry Blackmore Whittington, Derek Briggs, and Simon Conway Morris of the University of Cambridge in the 1980s that upended that traditional interpretation of the fossils' place in the evolution of life. By inverting the familiar iconography of the cone of increasing diversity in life forms, Whittington, Briggs, and Morris reinterpreted the Burgess Shale as replete with creatures in phyla that are now extinct. In other words, rather than life becoming increasingly more diverse in terms of its basic body plans over successive geologic periods, the Burgess Shale records an initial flowering of experimentation in structures just after the dawn of life before a later decimation or winnowing into the few surviving phyla we see today. Stephen Jay Gould devoted an entire book to this theme as an illustration akin to his theory of punctuated equilibrium in his 1989 book Wonderful Life: The Burgess Shale and the Nature of History.

    So with the introduction of PITCHf/x we're in our own kind of sabermetric Cambrian explosion where ideas are flowering and we're looking for those that survive the selection pressures that prevail.

    Where the analogy breaks down, however, is that unlike body plans that are almost fully constrained by what went before, ideas never are, and so while many of the paths that we'll subsequently travel will come out in the near future, there will always be a decreasing number that are novel and could therefore fundamentally change the way we look at this data.


    Updated 8/16/2007: Added new article on pitch speeds with runners on base.

    Monday, July 30, 2007

    Playing Catch Up

    After a week communing with family and nature up in Estes Park, Colorado, we're back and struggling to catch up. Here are a few new things...

  • Baserunning Metrics - as promised the leaders and trailers in baserunning are now up on Baseball Prospectus.


  • Stringing - The Rocky Mountain News wrote a little article on Gameday and a little about being a stringer with a few quotes from yours truly. The printed version had a nice picture of Mike Hageman who is the veteran among our crew at Coors Field.


  • Enhanced Gameday Links - Nice summary on a Reds blog of some of the analysis done using the PITCHf/x data.


  • Umpires - Speaking of which, my column last week was on umpire accuracy and named names. It's interesting that in that analysis I identified Doug Eddings as the second most pitcher-friendly umpire (albeit based only on 603 called pitches) behind Jeff Nelson as he gave the pitcher 38 more calls than hitters accounting for 6.3% of called balls and strikes. On Saturday night I took eleven family members to the Rockies/Dodgers game where Eddings was behind the plate and it certainly appeared that Eddings was calling a very low strike zone much to the pitcher's advantage.
    Monday, July 16, 2007

    Tip of the Iceberg

    A few more links related to research with the new Gameday data.

  • It’s a Pitch-by-Pitch Scouting Report, Minus the Scout. This article by Dan Rosenheck appeared in the Keeping Score column in the New York Times over the weekend. He references a few of the columns I've written at BP in the following comments:

    Most studies have focused on classifying the characteristics of various pitches — Félix Hernández’s four-seam fastball is usually thrown between 94 and 97 miles an hour and breaks around 8 inches toward a right-handed batter — and using them to generate profiles of pitchers (he only throws his changeup 3 percent of the time versus right-handed hitters).

    Some work has also been done on identifying batters’ tendencies: Iván Rodríguez swings at nearly 60 percent of pitches thrown to him out of the strike zone, and Juan Pierre makes contact with 92 percent of the balls out of the zone he swings at, for example.

    And in talking with Dan as he prepared the piece we discussed the fact that this data provides quantification to concepts that are already well understood in terms of advanced scouting. As Dan says:

    “Will chase curveballs low and away” will become “swung and missed at 73 percent of pitches thrown under 83 m.p.h. with a vertical break of at least 12 inches on two-strike counts on the outer third of the plate.”

    “Slider lacks bite” could be replaced by “slider begins to break 30 feet from home plate.”

    However, it should be noted that pitches aside from the knuckleball do not have early or late break as implied by his comments on sliders and instead break in a uniform way as they travel from the pitcher's hand to home plate.

    Two of the aspects that we discussed that I think are particularly interesting he described this way.

    The data could be used to evaluate prospects, by answering questions like, “Will he ever learn to lay off a breaking ball?” or to better understand park effects, by revealing just how much movement a particular pitcher could expect to lose from his slider at Coors Field.

    By quantifying the characteristics of pitches and building up a historical record we'll be able to ask questions related to age and development across pitch profiles (velocity, trajectory, location, and spin). So for example, it may turn out that certain types of hitters have trouble with certain pitch profiles but that they tend to learn to recognize and lay off the pitch or put it into play with greater success as they age or gain experience. There may be other types of hitters for which this is not true and having the data will at least allow us to ask the question. Of course with historical data the mirror questions can be asked of pitchers as well.

    In addition I think we're learning that there are discernible differences in how pitches behave under the different conditions in various parks. PETCO Park for example with its heavier sea air both causes pitches to decelerate more and allows for greater break on spinning pitches. Understanding just what those effects are may allow us to create "pitch profile park effects" that more accurately enable us to predict how a pitcher might fare in a different environment. I've written a bit on this subject already and have been working some with Alan Nathan, a physicist and head of SABR's Science of Baseball committee from the University of Illinois, on this very question and should have some things to share in the near future.

    Finally, Dan goes on to say:

    But the recent findings represent a tiny fraction of the research that the data will ultimately make possible. Eventually, a large portion of the tasks now done by major league scouts — visually evaluating strengths, weaknesses and trends — will be measured numerically.

    While I agree that at the present time we're touching the tip of the proverbial iceberg, I would simply caution that the ability of researchers to ask these questions hinges on two very important conditions. First, as Dan says the data needs to continue to be made available in some form be it subscription based or free. And second, researchers need to understand the limitations of the system not only in terms of accuracy but also variance between ballparks and how the system is being tweaked to provide more accurate data. For example, the initial point at which pitches are tracked was changed in early June from 55 feet and then experimented with for the rest of the month, settled at 50 feet in early July, and is now fluctuating once again in an effort to increase accuracy.

    And while I also agree that there are many aspects here that will be quantified and overlap with traditional scouting, it will always be the case that these tools complement and do not in any sense replace what scouts do. Not only will systems like this not be available in the amateur and minor league circuits for quite some time (not to mention bullpens as Dan mentions), they will be used to augment understanding already gained from traditional methods. For example, in terms of its relationship with biomechanical analysis like that done by Will Carroll, this system starts after the release point and therefore after everything from tempo to leg kick to balance to arm slot has already taken place.


  • Under Pressure. Joe P. Sheehan at Baseball Analysts looks at the relation of pitch types to Leverage - something that had not occurred to me. While it's certainly interesting and he shows, for example, that Jake Peavy relies more on his slider than his fastball in pressure situations, I think you'd also have to normalize the data for the base/out and handedness of the batter. It could be that Peavy relies more on his slider in pressure situations because he relies more on it with runners on base which also happen to have higher Leverage indexes.


  • Strike Zone: Fact vs. Fiction. John Walsh totally steals my thunder by examining the actual dimensions of the strike zone as it is called by major league umpires. What I find interesting is that he notes that right-handed hitters end up having to defend a strike zone that is slightly larger while I've found that left-handers are getting 10% more strikes called against them on pitches out of the strike zone. In looking at John's data I think the reason for this is that left-handers have to defend more territory on the outside part of the plate and pitchers concentrate on this area throwing a disproportionate number of their pitches in that region.


  • Another look at the sinker. Louis Chao at THT looks at contact rates by pitch types and finds, a little surprisingly, that sinkers have higher contact rates than fastballs. My take is that sinkers drop more in accordance with what the hitter is expecting and so they're able to put the bat on the ball albeit typically driving it into the ground. Four-seam fastballs, on the other hand, do not drop as much as would be expected and so batters swing under them. This is supported by the fact that a four-seamer typically drops 10-15 inches less than the theoretical reference pitch while a sinker drops only 2 to 7 inches less.
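To put the sinker and four-seamer numbers above in rough perspective, here is a small Python sketch that converts a vertical movement figure into an approximate actual drop. It is deliberately crude: the spinless reference pitch is approximated with gravity alone at constant forward speed, so drag is ignored and the real reference pitch would drop somewhat more. The 55-foot flight distance and the function names are assumptions for illustration.

```python
G_FT_S2 = 32.174  # gravitational acceleration in ft/s^2

def spinless_drop_inches(speed_mph, distance_ft=55.0):
    """Approximate drop (inches) of a spinless pitch over distance_ft,
    using gravity only and a constant forward speed (drag ignored)."""
    speed_ft_s = speed_mph * 5280.0 / 3600.0
    flight_time = distance_ft / speed_ft_s
    return 0.5 * G_FT_S2 * flight_time ** 2 * 12.0

def actual_drop_inches(speed_mph, vertical_movement_in, distance_ft=55.0):
    """Drop of a real pitch: the reference drop minus the reported
    vertical movement (inches the pitch 'holds up' versus no spin)."""
    return spinless_drop_inches(speed_mph, distance_ft) - vertical_movement_in
```

Under these assumptions a 90 mph pitch with no spin drops a little under three feet, so a four-seamer that drops 12 inches less than that still falls well over a foot and a half, while a sinker that drops only 4 inches less falls roughly two and a half feet.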
    Saturday, July 07, 2007

    Rain Delay Musings

    I'm scoring the Rockies/Phillies game at Coors Field tonight and in the very first inning the rains came causing the game to be delayed. So more to entertain myself than you here are a few random thoughts on the passing scene (to borrow a phrase).

  • Angering the Gods? - Before the rain delay Jimmy Rollins homered to right field on the 3rd pitch of the game from Rodrigo Lopez. After a Chase Utley double off the right field wall with one out Ryan Howard lofted a fly ball down the left field line and it just made its way over the wall for a two-run homer. Before the homer lightning had been evident out beyond left field and I think when Howard hit that ball there was an accompanying roar of thunder. Very strange and more than a little distracting to say the least. Rollins' home run was the 100th of his career, and on 6/27 Howard hit his 100th, in the process becoming the fastest player to do so (325 games).


  • More Pitchf/x - A couple more articles on using the Gameday data for analysis were written by Joe P. Sheehan and John Beamer. In Joe's article I love the idea of looking at this data from a consistency standpoint and am not surprised that Joe doesn't see much of a difference between starts. While it's probably the case that there are some pitchers whose success hinges on good stuff on a particular night, those are likely the more marginal pitchers. And my assumption is that location and pitch selection in specific situations would vary more than a measurement of "stuff". Better pitchers like Halladay simply have good stuff and differences between good and bad starts also have a lot to do with simply luck in which balls are hit at fielders and which aren't. This would be a topic to revisit after we have a few thousand pitches for starters in good and bad starts. John, on the other hand does a profile of Tim Hudson much like I've done for King Felix, Tim Wakefield, and Daisuke Matsuzaka. What he adds, and what I'm just getting around to, is sorting out pitch types using a clustering algorithm. I've recently gotten ahold of an implementation of one such algorithm in my programming language of choice so I'm excited to see how it can be modified or customized for this type of data.


  • And a Clarification - Related to PITCHf/x one of the questions I get most frequently and haven't explained very well is just what the measurements of vertical and horizontal movement are relative to. In short, they are measured relative to a theoretical "reference pitch" defined as a pitch thrown at the minimu with the same release velocity and release point but with no spin. Therefore a pitch that has a vertical movement of 11 inches like many four-seam fastballs doesn't actually rise 11 inches but rather drops 11 inches less than the reference pitch. As a result, to get a more intuitive measurement of the movement of a pitch, especially vertically, one needs to have a feel for the trajectory of the reference pitch. While this can be inferred from looking at the data I don't possess the parameters for calculating exactly the trajectory of the reference pitch.


  • MLB2K7 Gets a Thumbs Up - I bought this game for XBox 360 a month or two ago and have great fun playing it. Compared to MLB2K6 this is head and shoulders a better game. They've adjusted the way fielding works by allowing you to have your fielders sprint and their reaction times to your movements are much snappier. The graphics are better and there seem to be more signature stances and windups for various players. I've had no issues with the game seeming to go crazy in terms of offense or pitching when playing in franchise mode like I did last year and the various game levels seem to incrementally make the game harder in an appropriate way. I'm playing the Cubs in franchise mode and am hovering at .500 by playing some of the games and managing others. Unfortunately the manager interface has not changed and so it still lacks all of the strategic options of the real-time game play and most annoyingly does not let you warm up pitchers when at bat. One cool thing is that through XBox Live I was able to download Olympic Stadium and they say there will be other retro parks added in the future including the Polo Grounds and Forbes Field among others.


  • Ted and Ted - I didn't realize that Cubs pitcher Ted Lilly was named after Teddy Roosevelt who once employed Lilly's grandfather. Very cool.


  • Rox Surge? - The Rockies are now 25-16 since May 22nd (second only to Seattle at 28-15) after last night's walk-off win in extras after a Brad Hawpe two-out homer to dead center in the bottom of the ninth tied the game. The Rox are now 25-18 at home and 18-25 on the road following that miserable 1-9 road trip where they lost in walk-off fashion four times and that saw Brian Fuentes blow four saves. The offense seems to be firing on all cylinders and it's clear that the addition of Ryan Spilborghs, after the John Mabry/Steve Finley experiments failed miserably, has helped. Spilborghs has 24 RBI with his 26 hits and is 4 for 5 in his most recent pinch-hitting chances and 7 for 17 overall. One of the things that has most impressed me this season in watching the Rockies is how Willy Taveras bunts. He now has 27 bunt hits on the season with the next closest player having 8. He also has 40 infield hits which leads the majors. I haven't looked it up but my guess is that he's successful around 80% of the time when attempting to bunt for a hit which seems pretty remarkable.


  • The tarp is coming off the field and so we'll be resuming here at some point. Right now, however, the grounds crew is wrestling with the tarp as it catches 20 mph gusts of wind and drags them to and fro.

    Thursday, June 28, 2007

    The Umpires Strike Back


    My column today on Baseball Prospectus takes an initial look at the oft-said belief that hitters with better plate discipline and pitchers with better command end up getting the benefit of the doubt from the man behind the plate. I recall first hearing this idea in the late Ron Luciano's book The Umpire Strikes Back that I read back in 1983 or so. There he says the following regarding pitchers.

    During a game an umpire gets into a groove with a pitcher. People like Catfish Hunter and Ron Guidry are always going to be around the plate, so an umpire gets into the habit of calling strikes. Even when they miss the plate, it's usually a situation pitch intended to setup the batter for the next pitch or entice him to swing at a pitch outside the strike zone that he can't hit solidly. The umpire becomes so used to calling strikes that it's difficult to call a ball. Strike one, strike two, foul ball, it's close to the plate, strike three.

    Then there are pitchers like Ed Figueroa. He was all over the place. One pitch would be high, the next pitch would be in the dirt, the third pitch would be in the concession stand. He would throw three pitches outside the strike zone, then nip the corner of the plate by a quarter inch and expect the umpire to be ready to call a strike.

    Within certain limits we can use the PITCHf/x data to try and get a read on this by measuring the number of called strikes and called balls for pitchers and hitters and how many of each went in favor of and against the player. By adding these up we can then calculate a percentage of pitches for each player. Overall, what we find is that umpires, within the limits of the system, seem to get the calls correct 9 out of 10 times with pitchers getting the small upper hand. It's also the case that left-handed hitters incur a 10% penalty on called strikes over their right-handed brethren.

    You'll have to read the article to see all of the conclusions but suffice it to say that Luciano, if he was speaking for all umpires, may have overstated his case.

    Tuesday, June 26, 2007

    Sinkers

    As many of you know I've been writing about the PITCHf/x data captured by the new Gameday system the last several weeks in my Schrodinger's Bat column over on Baseball Prospectus. In answering a question for a colleague I ran a query to take a look at which pitchers have the most sink on their sinking fastball and so I'll share the results here.

    There is certainly some difficulty in separating sinking fastballs from four-seamers (in some research on Chad Gaudin I found I couldn't reasonably classify some 5% of his fastballs) since the data is continuous and doesn't come nicely labeled. So as a first approximation I thought I'd take a look at all pitches that were thrown between 87 and 93 miles per hour and had the appropriate horizontal break for a fastball, in order to weed out any sliders. This is similar to what John Walsh did in an excellent article at THT and builds on the work that Joe P. Sheehan did over at Baseball Analysts. The result is the following table of the top 30 pitchers (pitchers who throw from the side excepted since their vertical movement is actually negative in many cases, as John discussed).


    Name Throws Pitches AvgVel Vert Horiz MaxVel
    Felix Hernandez R 69 89.7 2.2 -3.5 92.9
    Kameron Loe R 529 89.6 3.7 -7.7 93.0
    Derek Lowe R 575 90.2 3.8 -10.7 93.0
    Roy Halladay R 481 90.7 3.8 -7.5 93.0
    Brandon Webb R 111 89.7 3.9 -9.4 92.9
    Julian Tavarez R 296 90.6 3.9 -10.2 92.9
    Aaron Cook R 82 91.0 4.2 -7.2 93.0
    Tim Hudson R 465 90.8 4.5 -6.8 93.0
    Jamey Wright R 72 89.5 4.7 -8.0 93.0
    Jeff Weaver R 202 89.1 5.5 -10.8 92.8
    Scott Downs L 128 89.3 5.6 11.0 92.2
    Jose Contreras R 321 90.2 6.0 -7.7 93.0
    Sergio Mitre R 107 90.0 6.0 -9.2 92.6
    Chad Paronto R 142 90.0 6.1 -5.8 92.8
    Jimmy Speigner R 61 89.5 6.3 -6.0 92.6
    Brad Thompson R 56 90.0 6.5 -10.2 92.0
    Miguel Batista R 319 91.1 6.5 -6.7 93.0
    Paul Maholm L 50 88.5 6.6 6.5 90.6
    Zach Duke L 55 88.9 6.7 10.0 91.4
    Gil Meche R 60 91.2 6.8 -4.9 93.0
    J.J. Putz R 53 89.7 7.0 -6.2 93.0
    Oscar Villarreal R 93 90.0 7.0 -6.8 92.9
    Chad Gaudin R 437 90.6 7.1 -6.8 93.0
    Carlos Zambrano R 113 90.6 7.1 -5.8 93.0
    Sean White R 175 91.1 7.2 -8.2 92.9
    Eric O'Flaherty L 120 90.2 7.2 6.3 92.8
    Jesse Litsch R 62 89.1 7.2 -5.1 92.8
    Kip Wells R 82 90.6 7.3 -7.1 93.0
    Vicente Padilla R 397 90.9 7.4 -6.9 93.0
    Robert Janssen R 102 90.8 7.4 -3.3 93.0


    You'll notice that the vertical movement column is still positive for all these pitchers. That's the case because the value is calculated relative to the movement of a theoretical reference pitch that is spinless but thrown in the same way as the pitch in question.

    So then to get a feel for what these vertical measurements mean, we can compare them to some pitchers who do not throw a sinking fastball but who do throw their fastballs in the same velocity range. For example, Brad Penny has thrown 230 pitches in this velocity range with an average vertical movement of 12.1 inches. Brandon McCarthy has thrown 264 with a value of 12.1, Randy Wolf has thrown 456 at 11.1, and Jon Garland has 585 at 10.7. What this indicates is that a four-seamer thrown in the same range drops 10 to 12 inches less than the theoretical reference pitch, and so our sinkerballers throw pitches that sink roughly 6 to 9 inches more than a typical four-seamer. This seems realistic and of course the pitchers near the top of the list (Hernandez, Lowe, Halladay, Webb, Cook) are all the usual suspects.
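    For anyone who wants to try something similar, the query behind the table above might look roughly like the following Python sketch. The field names, the 50-pitch minimum (inferred from the smallest count in the table) and the use of the sign of the horizontal break alone as the "appropriate" fastball test are all assumptions on my part.

```python
from collections import defaultdict
from statistics import mean


def sinker_leaders(pitches, min_pitches=50, lo_mph=87.0, hi_mph=93.0):
    """Rank pitchers by average vertical movement, least rise first.

    pitches: iterable of dicts with keys
             name, throws ("R"/"L"), speed (mph), vert and horiz (inches).
    """
    by_pitcher = defaultdict(list)
    for p in pitches:
        if not lo_mph <= p["speed"] <= hi_mph:
            continue
        # Assumed fastball test: the pitch tails to the arm side, which is
        # negative horiz for a right-hander and positive for a left-hander.
        arm_side = p["horiz"] < 0 if p["throws"] == "R" else p["horiz"] > 0
        if arm_side:
            by_pitcher[(p["name"], p["throws"])].append(p)

    rows = []
    for (name, throws), ps in by_pitcher.items():
        if len(ps) < min_pitches:
            continue
        rows.append({
            "name": name,
            "throws": throws,
            "pitches": len(ps),
            "avg_vel": mean(p["speed"] for p in ps),
            "vert": mean(p["vert"] for p in ps),
            "horiz": mean(p["horiz"] for p in ps),
            "max_vel": max(p["speed"] for p in ps),
        })
    rows.sort(key=lambda r: r["vert"])  # most sink at the top
    return rows


demo = [
    {"name": "A", "throws": "R", "speed": 90.0, "vert": 3.0, "horiz": -8.0},
    {"name": "A", "throws": "R", "speed": 91.0, "vert": 5.0, "horiz": -9.0},
    {"name": "A", "throws": "R", "speed": 85.0, "vert": 1.0, "horiz": -9.0},
    {"name": "B", "throws": "L", "speed": 89.0, "vert": 2.0, "horiz": 10.0},
    {"name": "B", "throws": "L", "speed": 89.0, "vert": 3.0, "horiz": -2.0},
    {"name": "B", "throws": "L", "speed": 90.0, "vert": 2.5, "horiz": 9.0},
]
for row in sinker_leaders(demo, min_pitches=2):
    print(row["name"], row["pitches"], round(row["vert"], 1))
```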

    It's also interesting to note which pitchers have more tail on their sinkers (a negative horizontal movement indicates tailing into a right-handed hitter). Derek Lowe, with his combination of sink and movement, makes it very difficult on opposing hitters.

    Thursday, June 14, 2007

    Long Live the King


    My column today focuses on creating pitcher profiles using the Gameday data with a case study on Felix Hernandez. It turns out that 415 of Hernandez's approximately 800 pitches in 2007 have been captured by PITCHf/x and so it's interesting to explore generating pitch profiles and various tables and graphs breaking down his pitches every which way from Sunday. For me, it was mostly an exercise in seeing how easy it would be to manipulate the data in various ways but it did quantify Felix's loss of movement and velocity following injury, his reluctance to throw the changeup against right-handed hitters, as well as his tendency to focus on the fastball in the first inning. Of course these are things that have been observed but it's nice to see the supporting evidence as well.

    What I find most interesting (and piggy-backing off of the work of others) is that it shows the pitches can be identified and classified using the data (my algorithm was able to hit 95% agreement on his June 10th start based on David Cameron's charting). What I did was exceedingly simple, however, and requires customization for each pitcher. It'll be interesting to see if a system can be devised that classifies the pitches more broadly across a set of pitchers. Still, it seems human interaction will be required but hopefully it can be minimized.
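    To give a flavor of how simple such a per-pitcher classifier can be, here is a hedged Python sketch. The cut points and the decision order are invented for illustration and are not the values from the column; the only convention taken from the data is that negative horizontal movement means tailing in to a right-handed hitter.

```python
# Hypothetical cut points for one right-handed pitcher; they would have to
# be tuned by hand for each pitcher.
FASTBALL_MIN_MPH = 90.0
CURVE_MAX_VERT_IN = -2.0


def classify(speed, vert, horiz):
    """Label one pitch from its speed (mph) and movement (inches)."""
    if speed >= FASTBALL_MIN_MPH:
        return "fastball"
    if vert <= CURVE_MAX_VERT_IN:
        return "curveball"   # drops more than the spinless reference pitch
    # Remaining off-speed pitches: arm-side tail vs. glove-side break.
    return "changeup" if horiz < 0 else "slider"


def agreement(predicted, charted):
    """Fraction of pitches where the rule-based label matches the charter."""
    if len(predicted) != len(charted) or not charted:
        raise ValueError("need two non-empty label lists of equal length")
    matches = sum(a == b for a, b in zip(predicted, charted))
    return matches / len(charted)


tracked = [(95.0, 8.0, -5.0), (82.0, -6.0, 4.0), (87.0, 3.0, -7.0), (86.0, 1.0, 3.0)]
charted = ["fastball", "curveball", "changeup", "changeup"]
predicted = [classify(*pitch) for pitch in tracked]
print(predicted, agreement(predicted, charted))
```

    A rule set like this is easy to write but, as noted, has to be re-tuned pitcher by pitcher.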

    Don't forget about the chat tomorrow!

    Sunday, June 10, 2007

    Squaring It Up

    In the comments on my post about plate discipline the question of "holes" was brought up. Specifically the idea is whether or not there are hitters who more consistently swing and miss on pitches that are in the strike zone. Well, with the data now becoming available we can add that to our list of things to look at.

    The metric is called Square and is defined as the percentage of swings at pitches in the strike zone on which the hitter made contact. So if Brian Giles swung at 105 pitches in the strike zone and made contact with 101 of them, his Square would be .962. Now perhaps "Square" isn't the best term since I've also included foul balls here, but you get the idea. In any case, here are the percentages for all hitters (and keep in mind we don't have all hitters but only those who've batted in the nine parks in which the PITCHf/x system is running).


    Name Stand Pitches SinZ Square
    Derek Jeter R 188 52 1.000
    Chone Figgins L 144 42 1.000
    Placido Polanco R 117 31 1.000
    Esteban German R 126 27 1.000
    David Dejesus L 209 55 0.982
    David Ortiz L 224 50 0.980
    Dan Johnson L 411 75 0.973
    Jay Payton R 105 37 0.973
    Maicer Izturis L 179 32 0.969
    Coco Crisp L 168 31 0.968
    Rafael Furcal L 406 90 0.967
    Juan Pierre L 462 118 0.966
    Reggie Willits R 135 28 0.964
    Ichiro Suzuki L 552 109 0.963
    Trot Nixon L 111 27 0.963
    Brian Giles L 381 105 0.962
    Kevin Millar R 108 25 0.960
    Mark Grudzielanek R 186 48 0.958
    Barry Bonds L 122 24 0.958
    Ryan Sweeney L 126 22 0.955
    Casey Kotchman L 466 106 0.953
    Matt Murton R 101 21 0.952
    Kenny Lofton L 569 144 0.951
    Mike Lowell R 200 60 0.950
    Albert Pujols R 108 20 0.950
    John McDonald R 170 39 0.949
    Mark Ellis R 472 114 0.947
    Darin Erstad L 372 74 0.946
    Frank Thomas R 517 108 0.944
    Pablo Ozuna R 129 36 0.944
    Miguel Tejada R 148 36 0.944
    Ben Broussard L 126 36 0.944
    Ramon Martinez R 124 35 0.943
    David Dellucci L 111 34 0.941
    Johnny Damon L 190 49 0.939
    Lyle Overbay L 381 96 0.938
    Frank Catalanotto L 330 78 0.936
    Ramon Hernandez R 115 31 0.935
    Yuniesky Betancourt R 376 106 0.934
    Geoff Blum L 148 29 0.931
    Andre Ethier L 413 128 0.930
    Shannon Stewart R 464 126 0.929
    Jason Kendall R 423 112 0.929
    Ryan Zimmerman R 114 28 0.929
    Mike Sweeney R 131 41 0.927
    Alexis Rios R 467 108 0.926
    Brian Mccann L 280 53 0.925
    Hideki Matsui L 186 53 0.925
    Nick Markakis L 168 52 0.923
    Luis Gonzalez L 462 116 0.922
    Erick Aybar L 235 64 0.922
    Kevin Youkilis R 219 51 0.922
    Raul Ibanez L 514 126 0.921
    Melvin Mora R 189 50 0.920
    Orlando Cabrera R 518 137 0.920
    Mike Cuddyer R 155 37 0.919
    Brian Roberts L 134 37 0.919
    Jerry Hairston R 124 37 0.919
    Robinson Cano L 171 36 0.917
    Brendan Harris R 165 48 0.917
    Nick Swisher R 149 24 0.917
    Reggie Willits L 280 59 0.915
    Ramon Vazquez L 184 47 0.915
    Mark Derosa R 113 35 0.914
    Ian Kinsler R 630 163 0.914
    J.D. Drew L 184 46 0.913
    Jamie Burke R 107 23 0.913
    Jose Lopez R 382 102 0.912
    Dustin Pedroia R 158 45 0.911
    Ken Griffey Jr. L 109 22 0.909
    Jose Vidro L 349 87 0.908
    Julio Lugo R 226 65 0.908
    Gary Sheffield R 168 32 0.906
    Kenji Jojima R 284 95 0.905
    Sean Casey L 101 21 0.905
    Victor Martinez L 100 21 0.905
    A.J. Pierzynski L 376 115 0.904
    Marco Scutaro R 192 52 0.904
    Grady Sizemore L 143 31 0.903
    Bengie Molina R 106 31 0.903
    Jorge Posada L 136 41 0.902
    Gary Matthews Jr. L 449 102 0.902
    Jeff Kent R 453 142 0.901
    Garret Anderson L 262 50 0.900
    Luis Castillo L 121 30 0.900
    Tadahito Iguchi R 490 146 0.897
    Michael Young R 628 193 0.896
    Kelly Johnson L 427 85 0.894
    Jermaine Dye R 467 119 0.891
    Josh Bard L 308 91 0.890
    Adrian Beltre R 441 126 0.889
    Howie Kendrick R 165 45 0.889
    Curtis Granderson L 147 35 0.886
    Aaron Hill R 461 129 0.884
    Magglio Ordonez R 150 43 0.884
    Doug Mientkiewicz L 113 17 0.882
    Paul Konerko R 489 116 0.879
    Shea Hillenbrand R 330 91 0.879
    Justin Morneau L 137 33 0.879
    Edgar Renteria R 379 82 0.878
    Willie Harris L 208 49 0.878
    Nomar Garciaparra R 470 152 0.875
    Carlos Guillen L 112 24 0.875
    Scott Thorman L 230 47 0.872
    Hank Blalock L 332 78 0.872
    Brady Clark R 160 39 0.872
    Ivan Rodriguez R 118 31 0.871
    Matt Diaz R 191 53 0.868
    Alfonso Soriano R 127 30 0.867
    Jason Giambi L 130 30 0.867
    Marlon Byrd R 174 44 0.864
    Gerald Laird R 504 138 0.862
    Andy Laroche R 127 29 0.862
    Mark Teixeira R 169 43 0.860
    Derrek Lee R 146 43 0.860
    Chris Woodward R 154 49 0.857
    Joe Crede R 376 104 0.856
    Vladimir Guerrero R 408 97 0.856
    Adam Lind L 374 110 0.855
    Khalil Greene R 419 136 0.853
    Alex Cintron L 119 34 0.853
    Adrian Gonzalez L 641 163 0.853
    Juan Uribe R 353 101 0.851
    Jose Guillen R 423 121 0.851
    Joshua Barfield R 115 40 0.850
    Willie Bloomquist R 156 46 0.848
    Royce Clayton R 264 78 0.846
    Alex Gordon L 163 39 0.846
    Aubrey Huff L 123 39 0.846
    Robb Quinlan R 135 45 0.844
    Bobby Crosby R 489 127 0.843
    Chris Stewart R 103 38 0.842
    Jose Cruz Jr. L 303 82 0.841
    Russell Martin R 543 145 0.841
    Travis Buck L 393 88 0.841
    Marcus Giles R 504 156 0.840
    Vernon Wells R 461 124 0.839
    Wilson Betemit L 224 49 0.837
    Manny Ramirez R 227 61 0.836
    Brad Wilkerson L 263 60 0.833
    Bobby Abreu L 225 60 0.833
    Emil Brown R 153 36 0.833
    Elijah Dukes R 208 47 0.830
    Richie Sexson R 485 123 0.829
    Kevin Kouzmanoff R 393 117 0.829
    Ty Wigginton R 221 64 0.828
    Mike Cameron R 589 157 0.828
    Eric Chavez L 542 136 0.824
    Tony Pena R 130 34 0.824
    Jose Molina R 157 45 0.822
    Casey Blake R 167 42 0.810
    Delmon Young R 232 68 0.809
    Jason Phillips R 236 67 0.806
    Mark Teixeira L 517 102 0.804
    Jeffrey Francoeur R 381 98 0.796
    Jim Thome L 359 78 0.795
    Gary Matthews Jr. R 118 24 0.792
    Jose Cruz Jr. R 108 28 0.786
    Milton Bradley L 135 46 0.783
    Nick Swisher L 410 86 0.779
    Michael Napoli R 349 85 0.776
    Rob Mackowiak L 313 76 0.776
    Troy Glaus R 353 76 0.776
    Mark Teahen L 213 49 0.776
    Victor Diaz R 159 49 0.776
    Mike Piazza R 186 40 0.775
    Carl Crawford L 242 62 0.774
    Brandon Inge R 143 35 0.771
    Matt Stairs L 202 51 0.765
    Alex Rodriguez R 205 55 0.764
    Hiram Bocachica R 118 25 0.760
    Terrmel Sledge L 301 88 0.750
    Rocco Baldelli R 135 32 0.750
    Craig Monroe R 139 30 0.733
    Craig Wilson R 105 26 0.731
    Travis Hafner L 138 22 0.727
    Sammy Sosa R 588 164 0.726
    Andruw Jones R 433 112 0.723
    B.J. Upton R 263 75 0.720
    Chipper Jones L 194 39 0.718
    Olmedo Saenz R 149 39 0.718
    Carlos Pena L 184 46 0.717
    Nelson Cruz R 374 100 0.700
    Jack Cust L 318 66 0.697
    Jason Smith L 103 28 0.679
    Rob Bowen L 114 26 0.654
    Russ Branyan L 164 49 0.612


    Generally speaking, the top of the list is populated with contact hitters while the bottom has more power hitters and free swingers, as you might expect. A couple of surprises, to me anyway, are David Ortiz and Dan Johnson so high on the list and Chipper Jones (batting left-handed) so low. The average is 87.2%, which validates the feeling I always get, especially when attending a game in person, that when a pitch is in the strike zone a major league hitter usually takes advantage.
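    For reference, the Square calculation defined above is a simple ratio, sketched here in Python. The function names and the per-pitch record keys are my own for the example; contact includes foul balls, as in the table.

```python
def square(swings_in_zone, contacts_in_zone):
    """Share of swings at pitches in the strike zone that made contact."""
    if swings_in_zone <= 0:
        raise ValueError("hitter has no swings in the zone")
    return contacts_in_zone / swings_in_zone


def square_from_pitches(pitches):
    """Same metric from per-pitch records with in_zone, swung, contact flags."""
    swings = [p for p in pitches if p["in_zone"] and p["swung"]]
    contacts = sum(1 for p in swings if p["contact"])
    return square(len(swings), contacts)


# The Brian Giles example: 101 contacts on 105 swings in the zone.
print(f"{square(105, 101):.3f}")
```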

    Thursday, June 07, 2007

    Quantifying Plate Discipline

    In my column this morning on Baseball Prospectus (subscription required but well worth it) among other things I take another crack at the PITCHf/x Gameday data. In part, inspired by this fascinating article, I created a couple of metrics to quantify plate discipline. They are:

  • Swing (S) defined as the percentage of pitches the batter swung at and also available in many other places. Obviously high values here are indicative of aggressive hitters or hitters who see a greater percentage of pitches out of the strike zone.


  • Fish (F) defined as the percentage of pitches out of the strike zone that the hitter swung at. A higher percentage here indicates that the hitter may have trouble recognizing pitches since he is offering at pitches that would likely otherwise be called balls. It should be noted that the strike zone as defined for this analysis is 17 inches wide (the standard) and uses the actual height customized for the player. No buffer room was added as was done in the previous articles since here we're not concerned with giving the umpire the benefit of the doubt.


  • Bad Ball (BB) defined as the percentage of swings at pitches out of the strike zone on which the hitter made contact (including foul balls, although there is an argument to be made that a foul ball is not the intended outcome and so should be discounted in some way). A higher value in this category indicates that the hitter, when swinging at bad pitches, is at least able to get the bat on the ball.


  • Eye (E) defined as the percentage of pitches in the strike zone, on counts other than three and zero, that were taken for strikes. A smaller value in this metric indicates a player who recognizes strikes and aggressively offers at them. Three-and-zero counts were excluded since a hitter is obviously much more likely to let a strike go by in that situation and we don't want to penalize him for that behavior. Some readers will see, however, where this idea could be extended to each of the twelve possible counts and a system devised where a smaller penalty is assessed to hitters who take at 3-1 than to those who do so at 0-2.


    Without further ado, here's the list of all players who have 200 or more pitches tracked this season, sorted by "Fish" (with the big caveat that only 24% of all pitches in 2007 have been tracked in the system and there is a heavy bias toward the AL West because of the parks in which the system is installed).


    Name Stand Pitches Swing Fish BadBall Eye
    A.J. Pierzynski L 357 0.602 0.469 0.813 0.117
    Garret Anderson L 260 0.515 0.467 0.750 0.350
    Delmon Young R 200 0.550 0.454 0.630 0.228
    Rob Mackowiak L 297 0.525 0.438 0.738 0.240
    Hank Blalock L 332 0.521 0.430 0.632 0.183
    Erick Aybar L 235 0.532 0.430 0.869 0.217
    Carl Crawford L 216 0.519 0.427 0.738 0.171
    Jeffrey Francoeur R 337 0.546 0.425 0.775 0.094
    Vladimir Guerrero R 403 0.524 0.421 0.771 0.142
    Adrian Beltre R 430 0.512 0.403 0.694 0.234
    Michael Young R 585 0.523 0.403 0.766 0.284
    Eric Chavez L 519 0.503 0.394 0.736 0.225
    Kenji Jojima R 270 0.522 0.389 0.878 0.217
    Nomar Garciaparra R 436 0.557 0.388 0.800 0.132
    Juan Pierre L 431 0.480 0.385 0.916 0.302
    Yuniesky Betancourt R 345 0.496 0.383 0.806 0.279
    Joe Crede R 376 0.503 0.381 0.729 0.238
    Jason Phillips R 203 0.493 0.381 0.721 0.303
    Vernon Wells R 417 0.501 0.380 0.713 0.216
    Jose Lopez R 347 0.478 0.375 0.853 0.288
    Ichiro Suzuki L 515 0.435 0.369 0.862 0.326
    Khalil Greene R 407 0.514 0.367 0.595 0.247
    Orlando Cabrera R 517 0.468 0.362 0.838 0.279
    Brian Mccann L 267 0.434 0.358 0.783 0.282
    Kevin Kouzmanoff R 363 0.499 0.350 0.724 0.167
    Lyle Overbay L 381 0.465 0.349 0.704 0.230
    Sammy Sosa R 532 0.492 0.349 0.625 0.205
    Royce Clayton R 227 0.515 0.348 0.457 0.194
    Adam Lind L 336 0.482 0.348 0.719 0.253
    Andruw Jones R 389 0.478 0.344 0.686 0.232
    Jose Guillen R 394 0.475 0.344 0.636 0.232
    Alexis Rios R 413 0.426 0.343 0.735 0.351
    Marcus Giles R 452 0.511 0.341 0.674 0.211
    Bobby Crosby R 474 0.451 0.338 0.652 0.327
    Casey Kotchman L 463 0.434 0.338 0.814 0.285
    Gary Matthews Jr. L 449 0.441 0.334 0.781 0.223
    David Dejesus L 209 0.445 0.333 0.816 0.340
    Nelson Cruz R 374 0.457 0.330 0.690 0.268
    Andre Ethier L 376 0.500 0.329 0.803 0.200
    Jose Vidro L 336 0.461 0.329 0.870 0.200
    Juan Uribe R 334 0.476 0.328 0.651 0.232
    Raul Ibanez L 474 0.456 0.328 0.760 0.229
    Gerald Laird R 484 0.459 0.326 0.698 0.290
    Jason Kendall R 405 0.432 0.325 0.870 0.373
    Jermaine Dye R 452 0.442 0.324 0.753 0.267
    Mark Teahen L 213 0.437 0.324 0.636 0.307
    Adrian Gonzalez L 590 0.453 0.322 0.767 0.225
    Darin Erstad L 372 0.398 0.318 0.797 0.413
    Shea Hillenbrand R 330 0.445 0.308 0.821 0.297
    Brad Wilkerson L 263 0.414 0.306 0.755 0.303
    Paul Konerko R 457 0.420 0.303 0.765 0.290
    Richie Sexson R 448 0.446 0.303 0.732 0.211
    Mark Ellis R 440 0.409 0.302 0.851 0.380
    Mark Teixeira L 501 0.399 0.302 0.657 0.276
    Kenny Lofton L 514 0.434 0.300 0.818 0.239
    Travis Buck L 377 0.416 0.300 0.662 0.255
    Michael Napoli R 346 0.419 0.299 0.683 0.322
    Kelly Johnson L 384 0.393 0.292 0.787 0.315
    Josh Bard L 283 0.470 0.292 0.878 0.177
    Edgar Renteria R 346 0.405 0.290 0.688 0.275
    Jeff Kent R 420 0.476 0.285 0.611 0.176
    Ian Kinsler R 591 0.423 0.280 0.784 0.268
    Terrmel Sledge L 285 0.446 0.272 0.609 0.217
    Rafael Furcal L 371 0.391 0.270 0.857 0.353
    Shannon Stewart R 437 0.410 0.269 0.810 0.327
    B.J. Upton R 225 0.431 0.269 0.528 0.227
    Frank Thomas R 478 0.377 0.267 0.838 0.335
    Troy Glaus R 306 0.376 0.264 0.706 0.330
    Jose Cruz Jr. L 285 0.428 0.262 0.591 0.216
    Aaron Hill R 417 0.415 0.253 0.729 0.302
    Willie Harris L 205 0.400 0.248 0.765 0.239
    Russell Martin R 506 0.401 0.241 0.757 0.290
    Luis Gonzalez L 428 0.395 0.236 0.889 0.256
    Nick Swisher L 384 0.365 0.232 0.559 0.307
    Tadahito Iguchi R 462 0.422 0.230 0.678 0.243
    Frank Catalanotto L 309 0.372 0.228 0.821 0.319
    Mike Cameron R 539 0.399 0.224 0.569 0.237
    Reggie Willits L 274 0.328 0.219 0.818 0.455
    Brian Giles L 381 0.396 0.218 0.804 0.269
    Jim Thome L 330 0.358 0.203 0.545 0.239
    Bobby Abreu L 203 0.374 0.185 0.545 0.274
    Jack Cust L 294 0.310 0.175 0.563 0.342
    Wilson Betemit L 208 0.327 0.155 0.700 0.282
    Dan Johnson L 387 0.282 0.148 0.737 0.346
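
    The four metrics in the table above can be computed from per-pitch records along the following lines. This is only a sketch in Python: the record keys (`in_zone`, `swung`, `contact`, `balls`, `strikes`) are names I made up for the example, and `in_zone` is assumed to have been worked out already from the 17-inch-wide, player-specific zone with no buffer.

```python
def ratio(numerator, denominator):
    """Safe division: None when a hitter has no qualifying pitches."""
    return numerator / denominator if denominator else None


def plate_discipline(pitches):
    """Compute Swing, Fish, BadBall and Eye for one hitter."""
    out_zone = [p for p in pitches if not p["in_zone"]]
    chases = [p for p in out_zone if p["swung"]]
    # Eye ignores pitches thrown on a 3-0 count.
    in_zone = [p for p in pitches
               if p["in_zone"] and (p["balls"], p["strikes"]) != (3, 0)]
    return {
        "Swing": ratio(sum(p["swung"] for p in pitches), len(pitches)),
        "Fish": ratio(len(chases), len(out_zone)),
        "BadBall": ratio(sum(p["contact"] for p in chases), len(chases)),
        "Eye": ratio(sum(not p["swung"] for p in in_zone), len(in_zone)),
    }


def pitch(in_zone, swung, contact=False, balls=0, strikes=0):
    """Small helper to build one pitch record for the demo."""
    return {"in_zone": in_zone, "swung": swung, "contact": contact,
            "balls": balls, "strikes": strikes}


demo = [
    pitch(True, True, contact=True),       # strike, swung and hit
    pitch(True, False, strikes=1),         # strike taken
    pitch(True, False, balls=3),           # strike taken on 3-0, ignored by Eye
    pitch(False, True, contact=True),      # chased and made contact
    pitch(False, True),                    # chased and missed
    pitch(False, False),
    pitch(False, False),
    pitch(False, False),
]
print(plate_discipline(demo))
```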


    Of course the interesting thing is that you can then plot the Fish and Eye metrics on a graph and split the graph into four quadrants. Each quadrant creates a little profile that can be used to characterize a hitter's plate discipline. Note that the bottom left (low Fish and low Eye, meaning a hitter who lays off balls and swings at strikes) is the sweet spot.
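
    A quick Python sketch of that quadrant split follows. The cut points (0.32 for Fish, 0.26 for Eye) and the quadrant labels are placeholders I chose for illustration; in practice you might use the median or mean of each metric.

```python
def quadrant(fish, eye, fish_cut=0.32, eye_cut=0.26):
    """Place a hitter in one of four plate-discipline profiles."""
    low_fish = fish < fish_cut   # rarely chases pitches out of the zone
    low_eye = eye < eye_cut      # rarely takes strikes
    if low_fish and low_eye:
        return "sweet spot"      # bottom left of the graph
    if low_fish:
        return "patient"         # lays off balls but also takes strikes
    if low_eye:
        return "free swinger"    # swings at strikes and balls alike
    return "guesser"             # chases balls and takes strikes


# Fish and Eye values taken from the table above.
hitters = {
    "A.J. Pierzynski": (0.469, 0.117),
    "Garret Anderson": (0.467, 0.350),
    "Brian Giles": (0.218, 0.269),
    "Jim Thome": (0.203, 0.239),
}
for name, (fish, eye) in hitters.items():
    print(name, quadrant(fish, eye))
```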

    Thursday, May 31, 2007

    The Physics of Drag

    My column today on Baseball Prospectus delves once again into the PITCHf/x data tracked by the new Gameday application. This time I take a look at the drag on a pitched ball and square the data with the description of the model discussed by Robert Adair in The Physics of Baseball.

    To answer the most frequently asked question thus far: no, I haven't looked at Tim Wakefield in any depth. I did see, however, that his average pitch (and I have 346 to look at) lost exactly 10% of its velocity. Overall, though, that percentage decrease is in line with the chart below (a version of which is also in the original article), since his average pFX (a measure of the break of the pitch) was 8.6 and his average start speed was 68.5 miles per hour.



    However, that percentage does not seem to differ by the break length (a different measure of break introduced this year) or by the pFX value. It's also interesting to note that all but one pitch came out of his hand at less than 79 miles per hour. I think it's likely that the Magnus force placed on a knuckler as it moves in various directions tends to slow it down more than one would otherwise think based on the slow speed and lack of spin.
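
    The velocity-loss percentage itself is simple arithmetic, shown here as a small Python sketch. The function names are mine, and the end speed in the example is just what a 10% loss from Wakefield's 68.5 mph average start speed implies, not a measured value.

```python
def velocity_loss_pct(start_mph, end_mph):
    """Percentage of release speed lost by the time the pitch reaches the plate."""
    if start_mph <= 0:
        raise ValueError("start speed must be positive")
    return (start_mph - end_mph) / start_mph * 100


def average_loss_pct(pitches):
    """Mean per-pitch loss for a list of (start_mph, end_mph) pairs."""
    losses = [velocity_loss_pct(start, end) for start, end in pitches]
    return sum(losses) / len(losses)


start = 68.5
implied_end = start * 0.90   # a 10% loss leaves 90% of the start speed
print(round(implied_end, 2), round(velocity_loss_pct(start, implied_end), 1))
```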

    Thursday, May 24, 2007

    Deep Data Dive

    As promised yesterday my column on Baseball Prospectus this morning dives deeper into the PITCHf/x data tracked by the 2007 version of Gameday (a new update was released on May 10th and is much more performant).

    In this article I take a look at the velocity and location data, which includes over 40,000 pitches, and discover that given a one-inch margin of error the system agrees with umpires to the tune of 90%. Not bad and very similar to the QuesTec results reported by Robert Adair in an article titled "Cameras and Computers, or Umpires?" published in Volume 32 of SABR's The Baseball Research Journal.
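
    One way to read "a one-inch margin of error" is that any called pitch within an inch of the zone's edge is counted as an agreement whichever way it was called. The Python sketch below implements that reading; both the interpretation and the argument names are my assumptions, and locations are taken to be in feet.

```python
PLATE_HALF_WIDTH_FT = 17 / 12 / 2   # half of the 17-inch plate, in feet
MARGIN_FT = 1 / 12                  # one inch


def agrees(px, pz, sz_bot, sz_top, call, margin=MARGIN_FT):
    """True if the umpire's call matches the tracked location, give or take the margin."""
    half = PLATE_HALF_WIDTH_FT
    clearly_in = (abs(px) <= half - margin
                  and sz_bot + margin <= pz <= sz_top - margin)
    clearly_out = (abs(px) > half + margin
                   or pz < sz_bot - margin
                   or pz > sz_top + margin)
    if clearly_in:
        return call == "strike"
    if clearly_out:
        return call == "ball"
    return True   # inside the one-inch band: either call is accepted


def agreement_rate(called_pitches):
    """called_pitches: list of (px, pz, sz_bot, sz_top, call) tuples."""
    hits = sum(agrees(*p) for p in called_pitches)
    return hits / len(called_pitches)


demo = [
    (0.00, 2.5, 1.5, 3.5, "strike"),  # middle of the zone
    (0.00, 2.5, 1.5, 3.5, "ball"),    # missed call
    (0.75, 2.5, 1.5, 3.5, "strike"),  # within an inch of the edge
    (1.00, 2.5, 1.5, 3.5, "ball"),    # well outside
]
print(agreement_rate(demo))
```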