Mike Fast has a great piece up over on The Hardball Times researching the correlation between working quickly and effectiveness. While a study I did on Baseball Prospectus last May used average game time and was more historical in that it went back to 1970, Mike uses the time stamps that MLBAM is providing in its Pitchf/x data for 2008.
What Mike found largely corresponds to what I concluded, namely that there doesn't appear to be any relationship between defensive support, as measured by defensive efficiency (DER) and BABIP, and time between pitches, whether for teams or for individual pitchers measured relative to their teams (although I used unearned runs rather than DER, as I should have).
He does find, however, that when looking at BABIP in terms of the number of seconds that elapsed since the previous pitch, the BABIP is lower for pitches thrown within 10 seconds and higher for pitches thrown in excess of 50 seconds since the previous pitch (he does throw out pitches that came in a minute or more after the previous pitch). As Mike notes, there are other factors to control for, not the least of which are hit type (line drive, fly ball, ground ball, popup) and pitcher quality and hitter quality. Still, it's pretty interesting stuff and just one of the many applications of Pitchf/x data.
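For anyone who wants to try this sort of breakdown, here is a minimal Python sketch of grouping balls in play by the time elapsed since the previous pitch. It is not Mike's code; the function name, the 10-second bin width, and the input layout (pairs of elapsed seconds and a hit flag) are assumptions made for illustration, and the sample data is made up.

```python
from collections import defaultdict

def babip_by_gap(balls_in_play, bin_width=10, max_gap=60):
    """Group balls in play by seconds since the previous pitch.

    balls_in_play: iterable of (gap_seconds, was_hit) pairs, one per
    non-home-run ball put in play (hypothetical layout).
    Returns {bin_start_seconds: BABIP for that bin}.
    """
    tallies = defaultdict(lambda: [0, 0])  # bin start -> [hits, balls in play]
    for gap, was_hit in balls_in_play:
        if gap >= max_gap:  # throw out pitches a minute or more after the last one
            continue
        start = int(gap // bin_width) * bin_width
        tallies[start][0] += 1 if was_hit else 0
        tallies[start][1] += 1
    return {start: hits / total for start, (hits, total) in sorted(tallies.items())}

# Made-up sample, for illustration only
sample = [(5, False), (8, True), (9, False), (52, False), (55, True), (75, True)]
rates = babip_by_gap(sample)
```

A real study would also need to control for hit type and for pitcher and hitter quality, as noted above; this sketch only does the bucketing.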
Friday, June 13, 2008
Testing an Old Adage...Again
Posted by
Dan Agonistes
at
3:23 PM
7
comments
Wednesday, April 09, 2008
Santana and the Changeup
After a brief exchange, Ken Davidoff of Newsday wrote a bit about Johan Santana and his reliance on his changeup in his April 6th column. The relevant section reads like this:
According to Fox, MLB.com charted 11 of Santana's outings last year, including his relief effort in the All-Star Game. Of 1,033 pitches, Santana threw 61 percent fastballs, 27 percent changeups and 12 percent sliders, which comes close to Bill James' full-season tally (58-29-11). Not surprisingly, Santana used the changeup far more against righty hitters (he threw it 33 percent of the time) than lefties (7 percent).
Just as Morris suggested, Santana does love using the pitch for strikeouts. Of 269 situations that Fox charted in which Santana had a hitter at 0-and-2, 1-and-2 or 2-and-2, he threw the changeup 123 times. Of the 86 strikeouts Fox witnessed, the changeup produced 53 of them.
Interestingly, of the 11 home runs Santana surrendered on non-full, two-strike counts on Fox's watch, just two came on the changeup, with eight from fastballs and one off a slider.
It turns out that Mike Fast did a nice analysis of Santana back in January and, as you would imagine, found essentially the same thing, albeit in much more detail. On a start-by-start basis, the mix of Santana's pitches in 10 of his starts and his All-Star appearance last season can be seen below.
From this it is not apparent that he increasingly used his changeup as the season wore on and in fact it shows a trend where he used his fastball a bit more as the season progressed.
Posted by
Dan Agonistes
at
8:33 AM
7
comments
Tuesday, December 11, 2007
PITCHf/x Musings
Many of the columns I wrote discussing the PITCHf/x data made available through MLB.com's Gameday system are now available sans subscription on Baseball Prospectus. Those articles are:
Schrodinger's Bat: Free Stuff and the Men in Blue.
Postseason umpiring and an early holiday present for our readers.
Schrodinger's Bat: On Atmosphere, Probability, and Prediction.
Ranging across a couple of old and new themes, explaining that there's something about the weather, and Pythagoras can rock steady.
Schrodinger's Bat: Visualizing Pitches.
After digging through this data, you'll no longer wonder why they say hitting is the hardest thing to do in sports.
Schrodinger's Bat: Putting the Pedal to the Metal.
What happens when pitching in a pinch? Do pitchers have something extra that they can put on the ball when they're in a jam?
Schrodinger's Bat: Calling the Balls and Strikes.
A look at umpire tendencies to see how much human error plays a role in calling pitches.
Schrodinger's Bat: Searching for the Gyroball.
Is it there, or isn't it? Dan dives into Dice-K's data to find out.
Schrodinger's Bat: Playing Favorites.
Parsing the data can help us address questions of bias among umpires in calling balls and strikes.
Schrodinger's Bat: Gameday Meets the Knuckleball.
Dan continues his series using pitch data by examining the case of Tim Wakefield.
Schrodinger's Bat: The Science and Art of Building a Better Pitcher Profile.
Popping the hood on King Felix as a demonstration of what's possible with PITCHf/x data
Schrodinger's Bat: Gameday Triple Play.
How different ballparks affect velocity, whether pitchers use the fastball more early in games, and the challenge of quantifying plate discipline.
Schrodinger's Bat: Physics on Display.
Further adventures in pitch-by-pitch data.
Schrodinger's Bat: Batter Versus Pitcher, Gameday Style.
Evaluating the strike zone, the umpires, and some large-scale issues with a tremendous new tool.
Schrodinger's Bat: Phil Hughes, Pitch by Pitch.
Dan uses MLBAM data to reconstruct the no-hitter that wasn't.
Posted by
Dan Agonistes
at
12:50 PM
0
comments
Labels: MLBAM, Pitching, Technology
Wednesday, October 03, 2007
A Little More PITCHf/x
Here are a couple more PITCHf/x articles:
Posted by
Dan Agonistes
at
1:14 PM
0
comments
Labels: MLBAM
Thursday, September 27, 2007
A Pie and Fish
Today in my column on Baseball Prospectus I answered a couple reader questions in relation to topics from previous weeks. The first question revolved around the repeatability of the baserunning metrics I've developed while the second looked at the Fish, Eye, Square, and Badball metrics using PITCHf/x data from the pitcher's rather than the hitter's perspective. Enjoy.
Posted by
Dan Agonistes
at
10:26 PM
0
comments
Labels: Baserunning, MLBAM
Wednesday, September 12, 2007
Gameday Video
There is a very nice video that explains a bit about the PITCHf/x system over on the Gameday blog. In other PITCHf/x news Joe P. Sheehan has another nice article on sinkers over at Baseball Analysts and tomorrow my column will take another look at plate discipline. And of course anyone interested in this topic should be keeping up with Mike Fast and the work he's doing over on his blog. In particular he beats me to the punch and uses the approximations given by Dr. Nathan to start calculating spin direction and spin rate. Very cool.
Posted by
Dan Agonistes
at
3:19 PM
1 comments
Saturday, August 25, 2007
Jimenez Looking Good
Tonight Rockies rookie Ubaldo Jimenez turned in a good start for the third consecutive outing beating the Nationals here at Coors Field. I chronicled his arsenal over on the Rocky Mountain SABR site using PITCHf/x data.
Update: Mike Fast and Sky Kalkman point out that the data used to plot the fastball was incorrect. I inadvertently used a positive rather than a negative vertical acceleration, which caused the pitch to appear to level out. I've since corrected the graphs in the article at RMSABR. My apologies.
Posted by
Dan Agonistes
at
9:53 PM
0
comments
Thursday, August 23, 2007
Visualization
My column today on Baseball Prospectus deals with using PITCHf/x data to visualize the trajectory of pitches in much the same way as the actual Gameday application shows each pitch during the game. After discussing how this can be done and plotting a few individual pitches I then aggregate pitch types for a few individuals including Rich Hill, Barry Zito, Roy Halladay, and Derek Lowe to form a "visual pitch profile" that can be used for comparison. Finally, I look at the complete repertoire of Daisuke Matsuzaka.
After the article was submitted for publication I learned that SABR member Mat Kovach has also been doing this kind of thing.
Update: Just saw that Joe P. Sheehan had done something very similar last week at Baseball Analysts. You know what they say about great minds... :)
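As a rough illustration of how a pitch path can be rebuilt from this kind of data, here is a minimal Python sketch that assumes each pitch is described by an initial position, an initial velocity, and a constant acceleration along three axes, with y running from home plate toward the mound. The function names, the front-of-plate distance of 17 inches, and the sample numbers are all assumptions for illustration, not values taken from the column.

```python
from math import sqrt

def position(t, p0, v0, a):
    """Location (x, y, z) in feet after t seconds under constant acceleration."""
    return tuple(p + v * t + 0.5 * acc * t * t for p, v, acc in zip(p0, v0, a))

def time_to_plate(y0, vy0, ay, y_plate=17 / 12):
    """Earliest time at which y(t) reaches y_plate (assumed front of the plate)."""
    if ay == 0:
        return (y_plate - y0) / vy0
    # Solve 0.5*ay*t^2 + vy0*t + (y0 - y_plate) = 0 for the smaller root;
    # assumes the pitch actually reaches the plate (non-negative discriminant).
    disc = vy0 * vy0 - 2 * ay * (y0 - y_plate)
    return (-vy0 - sqrt(disc)) / ay

def trajectory(p0, v0, a, steps=20):
    """Evenly spaced points along the path from release to the plate."""
    t_end = time_to_plate(p0[1], v0[1], a[1])
    return [position(t_end * i / steps, p0, v0, a) for i in range(steps + 1)]

# Made-up fastball-like values (feet, feet/second, feet/second^2)
p0 = (-2.0, 50.0, 6.0)
v0 = (5.0, -130.0, -5.0)
a = (-8.0, 25.0, -20.0)
path = trajectory(p0, v0, a)

# Flipping the sign of the vertical acceleration makes the pitch finish higher,
# the kind of sign slip that can make a plotted pitch appear to level out.
flipped = trajectory(p0, v0, (a[0], a[1], -a[2]))
```

The points in `path` can be fed to any plotting tool to draw side or overhead views of the pitch.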
Posted by
Dan Agonistes
at
11:54 AM
0
comments
Labels: MLBAM
Friday, August 17, 2007
Umpires and QuesTec
Several readers have been asking about the recent study that was reported to show umpire bias by race known as the Hamermesh study. Phil Birnbaum and Mitchel Lichtman have been doing great work in that regard already so I have little to add other than providing a few links for those interested:
One of the side topics that has arisen here is the effect of QuesTec on called strikes. The authors of the Hamermesh study found that for both white and minority pitchers, in non-QuesTec parks pitchers received a higher percentage of strikes when the race of the pitcher and umpire matched than they did in QuesTec parks. White pitchers did not experience this difference when the umpire was non-white although minority pitchers still did.
This provides an opportunity to look at the PITCHf/x data from this season in QuesTec and non-QuesTec parks to get a more granular feel for what the overall difference might be. While we have data for only 9 of the 11 parks where QuesTec is installed, we still end up with almost 35,000 pitches in QuesTec parks and 63,000 in non-QuesTec parks to analyze. When we do so by comparing the location of the pitch to the strike zone (defined by the PITCHf/x operator for each plate appearance) and give the umpires a 1 inch buffer zone to correspond with the limits of the system, we find the following:
Park Pitches CS% CB% Agree%
QuesTec 34427 .8252 .9433 .8790
Non-QuesTec 62862 .8052 .9488 .8772
By way of explanation CS% is the called strike percentage defined as the percentage of actual pitches in the strike zone that were actually called strikes. CB% is the called ball percentage defined as the percentage of pitches that were actually out of the strike zone that were called balls and Agree% is the overall percentage of pitches on which PITCHf/x (given the buffer zone) and the umpire agreed.
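To make those three definitions concrete, here is a minimal Python sketch of how they might be tallied. The field names (`px`, `pz`, `sz_top`, `sz_bot`), the rule that a pitch within the 1 inch buffer of the zone edge is scored in the umpire's favor, and the decision to ignore the width of the ball are all assumptions for illustration and may differ from how the numbers in the table were produced.

```python
from dataclasses import dataclass

PLATE_HALF_WIDTH_FT = 17 / 12 / 2  # home plate is 17 inches wide
BUFFER_FT = 1 / 12                 # the 1 inch buffer described in the text

@dataclass
class CalledPitch:
    px: float            # horizontal location at the plate, feet from center
    pz: float            # height at the plate, feet
    sz_top: float        # top of the operator-defined strike zone, feet
    sz_bot: float        # bottom of the operator-defined strike zone, feet
    called_strike: bool  # True for a called strike, False for a called ball

def zone_status(p, buffer=BUFFER_FT):
    """Return 'in', 'out', or 'edge' (within the buffer of the zone boundary)."""
    side = PLATE_HALF_WIDTH_FT - abs(p.px)           # positive means inside
    height = min(p.sz_top - p.pz, p.pz - p.sz_bot)   # positive means inside
    margin = min(side, height)
    if margin > buffer:
        return "in"
    if margin < -buffer:
        return "out"
    return "edge"

def umpire_rates(pitches):
    """Compute CS%, CB%, and Agree% for a list of called pitches."""
    in_zone = in_strike = out_zone = out_ball = 0
    for p in pitches:
        status = zone_status(p)
        if status == "edge":  # assumption: borderline pitches go the umpire's way
            status = "in" if p.called_strike else "out"
        if status == "in":
            in_zone += 1
            in_strike += p.called_strike
        else:
            out_zone += 1
            out_ball += not p.called_strike
    total = in_zone + out_zone
    return {
        "CS%": in_strike / in_zone if in_zone else None,
        "CB%": out_ball / out_zone if out_zone else None,
        "Agree%": (in_strike + out_ball) / total if total else None,
    }
```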
By simply examining the confidence intervals it appears that umpires do indeed call more pitches in the zone strikes at QuesTec parks than at non-QuesTec parks. The difference is statistically significant at .05 and amounts to 1 pitch in 50. However, at QuesTec parks umpires don't do as well at identifying balls and end up calling more of them strikes to the tune of 1 in 180 pitches. This result too is statistically significant at .05, indicating that perhaps the biggest effect of QuesTec is simply to call more strikes.
Because the factors are working in opposite directions when we add them up the Agree% fails to meet the .05 test. Overall then, if we attribute the entire difference to whether the umpire is in a QuesTec park or not we're talking about a difference of 1 pitch in 550. Of course there may be other factors at work here including the calibration of the system at particular parks that may play a role which I haven't examined.
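The Agree% comparison can be checked from the table alone, since both pitch totals are given. The sketch below uses a pooled two-proportion z-test, which is a stand-in for the confidence-interval comparison described above rather than the exact procedure used, and it also converts each percentage gap into the "1 pitch in N" form. The function names are made up for illustration. The same test cannot be run here for CS% and CB% because the number of pitches inside and outside the zone is not listed.

```python
from math import sqrt

def two_proportion_z(p1, n1, p2, n2):
    """Pooled two-proportion z statistic for rates p1 (of n1) and p2 (of n2)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / std_err

def one_in(diff):
    """Express a rate difference as roughly '1 pitch in N'."""
    return round(1 / abs(diff))

# Agree% and pitch counts from the table: QuesTec vs. non-QuesTec
z_agree = two_proportion_z(0.8790, 34427, 0.8772, 62862)
significant_at_05 = abs(z_agree) > 1.96  # two-sided test at the .05 level

cs_gap = one_in(0.8252 - 0.8052)     # called strike percentage
cb_gap = one_in(0.9488 - 0.9433)     # called ball percentage
agree_gap = one_in(0.8790 - 0.8772)  # overall agreement
```

Under these assumptions the Agree% statistic falls well short of 1.96, in line with the conclusion that the overall difference does not meet the .05 test, and the three gaps land close to the rounded 1 in 50, 1 in 180, and 1 in 550 figures quoted above.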
Posted by
Dan Agonistes
at
7:54 AM
9
comments
Wednesday, August 15, 2007
A Sabermetric Cambrian Explosion
Several folks have alerted me to this article by Nate DiMeo on Slate.com that talks a bit about PITCHf/x and its promise. What I like about it is that it does a nice job of showing the range of analysis that has already been done (and I like the quote he used as well from this column) and linking to some of those articles.
I've written 10 articles on the subject with a couple more already in the works which include:
In addition Dr. Nathan has created a wonderful page that not only has the most complete data dictionary for the PITCHf/x data but also includes a paper he wrote detailing his own analysis of pitch classification using derived parameters of axis of rotation and spin using a sophisticated model.
Although it may be difficult to detect, one of my goals in researching and writing about the PITCHf/x data this season has been to explore as many avenues of analysis as possible in this early stage when the system is still being tweaked and the data is incomplete. By doing so we can begin to see which of those ideas for analysis are useful and should be developed further as well as to help spur new ideas by other researchers. This is analogous to one of my favorite intellectual ideas, that of the inverted cone of diversity that I also used to help illuminate the evolving way in which players have been used throughout baseball history, and that Stephen Jay Gould was also fond of. From that earlier column the idea is briefly this:
In 1909 Charles Doolittle Walcott discovered a treasure trove of wonderfully unique fossils preserved in a layer of shale near the town of Field in British Columbia, specimens that would become known simply as the Burgess Shale. While Walcott placed his specimens in familiar phyla that were known to exist during the period (Middle Cambrian, 505 million years ago), it was a reinvestigation by Harry Blackmore Whittington, Derek Briggs, and Simon Conway Morris of the University of Cambridge in the 1980s that upended that traditional interpretation of the fossils' place in the evolution of life. By inverting the familiar iconography of the cone of increasing diversity in life forms, Whittington, Briggs, and Morris reinterpreted the Burgess Shale as replete with creatures in phyla that are now extinct. In other words, rather than life becoming increasingly more diverse in terms of its basic body plans over successive geologic periods, the Burgess Shale records an initial flowering of experimentation in structures just after the dawn of life before a later decimation or winnowing into the few surviving phyla we see today. Stephen Jay Gould devoted an entire book to this theme as an illustration akin to his theory of punctuated equilibrium in his 1989 book Wonderful Life: The Burgess Shale and the Nature of History.
So with the introduction of PITCHf/x we're in our own kind of sabermetric Cambrian explosion where ideas are flowering and we're looking for those that survive the selection pressures that prevail.
Where the analogy breaks down however, is that unlike body plans that are almost fully constrained by what went before, ideas never are and so while many of the paths that we'll subsequently travel will come out in the near future, there will always be a decreasing number that are novel and could therefore fundamentally change the way we look at this data.
Updated 8/16/2007: Added new article on pitch speeds with runners on base.
Posted by
Dan Agonistes
at
11:41 AM
0
comments
Labels: MLBAM, Paleontology
Monday, July 30, 2007
Playing Catch Up
After a week communing with family and nature up in Estes Park, Colorado, we're back and struggling to catch up. Here are a few new things...
Posted by
Dan Agonistes
at
3:54 PM
1 comments
Labels: Baserunning, MLBAM, Umpires
Monday, July 16, 2007
Tip of the Iceberg
A few more links related to research with the new Gameday data.
Most studies have focused on classifying the characteristics of various pitches — Félix Hernández’s four-seam fastball is usually thrown between 94 and 97 miles an hour and breaks around 8 inches toward a right-handed batter — and using them to generate profiles of pitchers (he only throws his changeup 3 percent of the time versus right-handed hitters).
Some work has also been done on identifying batters’ tendencies: Iván Rodríguez swings at nearly 60 percent of pitches thrown to him out of the strike zone, and Juan Pierre makes contact with 92 percent of the balls out of the zone he swings at, for example.
And in talking with Dan as he prepared the piece we discussed the fact that this data provides quantification to concepts that are already well understood in terms of advanced scouting. As Dan says:
“Will chase curveballs low and away” will become “swung and missed at 73 percent of pitches thrown under 83 m.p.h. with a vertical break of at least 12 inches on two-strike counts on the outer third of the plate.”
“Slider lacks bite” could be replaced by “slider begins to break 30 feet from home plate.”
However, it should be noted that pitches aside from the knuckleball do not have early or late break as implied by his comments on sliders and instead break in a uniform way as they travel from the pitcher's hand to home plate.
Two of the aspects that we discussed that I think are particularly interesting he described this way.
The data could be used to evaluate prospects, by answering questions like, “Will he ever learn to lay off a breaking ball?” or to better understand park effects, by revealing just how much movement a particular pitcher could expect to lose from his slider at Coors Field.
By quantifying the characteristics of pitches and building up a historical record we'll be able to ask questions related to age and development across pitch profiles (velocity, trajectory, location, and spin). So for example, it may turn out that certain types of hitters have trouble with certain pitch profiles but that they tend to learn to recognize and lay off the pitch or put it into play with greater success as they age or gain experience. There may be other types of hitters for which this is not true and having the data will at least allow us to ask the question. Of course with historical data the mirror questions can be asked of pitchers as well.
In addition I think we're learning that there are discernible differences in how pitches behave under the different conditions in various parks. PETCO Park for example with its heavier sea air both causes pitches to decelerate more and allows for greater break on spinning pitches. Understanding just what those effects are may allow us to create "pitch profile park effects" that more accurately enable us to predict how a pitcher might fare in a different environment. I've written a bit on this subject already and have been working some with Alan Nathan, a physicist and head of SABR's Science of Baseball committee from the University of Illinois, on this very question and should have some things to share in the near future.
Finally, Dan goes on to say:
But the recent findings represent a tiny fraction of the research that the data will ultimately make possible. Eventually, a large portion of the tasks now done by major league scouts — visually evaluating strengths, weaknesses and trends — will be measured numerically.
While I agree that at the present time we're touching the tip of the proverbial iceberg, I would simply caution that the ability of researchers to ask these questions hinges on two very important conditions. First, as Dan says, the data needs to continue to be made available in some form, be it subscription based or free. And second, researchers need to understand the limitations of the system not only in terms of accuracy but also variance between ballparks and how the system is being tweaked to provide more accurate data. For example, the initial point at which pitches are tracked was changed in early June from 55 feet and then experimented with for the rest of the month, settled at 50 feet in early July, and is now fluctuating once again in an effort to increase accuracy.
And while I also agree that there are many aspects here that will be quantified and overlap with traditional scouting, it will always be the case that these tools complement and do not in any sense replace what scouts do. Not only will systems like this not be available in the amateur and minor league circuits for quite some time (not to mention bullpens as Dan mentions), they will be used to augment understanding already gained from traditional methods. For example, in terms of its relationship with biomechanical analysis like that done by Will Carroll, this system starts after the release point and therefore after everything from tempo to leg kick to balance to arm slot has already taken place.
Posted by
Dan Agonistes
at
10:46 AM
0
comments
Saturday, July 07, 2007
Rain Delay Musings
I'm scoring the Rockies/Phillies game at Coors Field tonight and in the very first inning the rains came causing the game to be delayed. So more to entertain myself than you here are a few random thoughts on the passing scene (to borrow a phrase).
The tarp is coming off the field and so we'll be resuming here at some point. Right now, however, the grounds crew is wrestling with the tarp as it catches 20 mph gusts of wind and drags them to and fro.
Posted by
Dan Agonistes
at
6:39 PM
0
comments
Labels: MLBAM, Pitching, Rockies, Simulation
Thursday, June 28, 2007
The Umpires Strike Back
My column today on Baseball Prospectus takes an initial look at the oft-said belief that hitters with better plate discipline and pitchers with better command end up getting the benefit of the doubt from the man behind the plate. I recall first hearing this idea in the late Ron Luciano's book The Umpire Strikes Back that I read back in 1983 or so. There he says the following regarding pitchers.
During a game an umpire gets into a groove with a pitcher. People like Catfish Hunter [pictured above] and Ron Guidry are always going to be around the plate, so an umpire gets into the habit of calling strikes. Even when they miss the plate, it's usually a situation pitch intended to setup the batter for the next pitch or entice him to swing at a pitch outside the strike zone that he can't hit solidly. The umpire becomes so used to calling strikes that it's difficult to call a ball. Strike one, strike two, foul ball, it's close to the plate, strike three.
Then there are pitchers like Ed Figueroa. He was all over the place. One pitch would be high, the next pitch would be in the dirt, the third pitch would be in the concession stand. He would throw three pitches outside the strike zone, then nip the corner of the plate by a quarter inch and expect the umpire to be ready to call a strike.
Within certain limits we can use the PITCHf/x data to try and get a read on this by measuring the number of called strikes and called balls for pitchers and hitters and how many of each went in favor of or against the player. By adding these up we can then calculate a percentage of pitches for each player. Overall, what we find is that umpires, within the limits of the system, seem to get the calls correct 9 out of 10 times with pitchers getting the small upper hand. It's also the case that left-handed hitters incur a 10% penalty on called strikes over their right-handed brethren.
You'll have to read the article to see all of the conclusions but suffice it to say that Luciano, if he was speaking for all umpires, may have overstated his case.
Posted by
Dan Agonistes
at
8:44 AM
0
comments
Tuesday, June 26, 2007
Sinkers
As many of you know I've been writing about the PITCHf/x data captured by the new Gameday system the last several weeks in my Schrodinger's Bat column over on Baseball Prospectus. In answering a question for a colleague I ran a query to take a look at which pitchers have the most sink on their sinking fastball and so I'll share the results here.
There is certainly some difficulty in separating sinking fastballs from four-seamers (in some research on Chad Gaudin I found I couldn't reasonably classify some 5% of his fastballs) since the data is continuous and doesn't come nicely labeled. So as a first approximation I thought I'd take a look at all pitches thrown between 87 and 93 miles per hour and that had the appropriate horizontal break for a fastball in order to weed out any sliders. This is similar to what John Walsh did in an excellent article at THT and builds on the work that Joe P. Sheehan did over at Baseball Analysts. The result is the following table of the top 30 pitchers (pitchers who throw from the side excepted since their vertical movement is actually negative in many cases as John discussed).
Name Throws Pitches AvgVel Vert Horiz MaxVel
Felix Hernandez R 69 89.7 2.2 -3.5 92.9
Kameron Loe R 529 89.6 3.7 -7.7 93.0
Derek Lowe R 575 90.2 3.8 -10.7 93.0
Roy Halladay R 481 90.7 3.8 -7.5 93.0
Brandon Webb R 111 89.7 3.9 -9.4 92.9
Julian Tavarez R 296 90.6 3.9 -10.2 92.9
Aaron Cook R 82 91.0 4.2 -7.2 93.0
Tim Hudson R 465 90.8 4.5 -6.8 93.0
Jamey Wright R 72 89.5 4.7 -8.0 93.0
Jeff Weaver R 202 89.1 5.5 -10.8 92.8
Scott Downs L 128 89.3 5.6 11.0 92.2
Jose Contreras R 321 90.2 6.0 -7.7 93.0
Sergio Mitre R 107 90.0 6.0 -9.2 92.6
Chad Paronto R 142 90.0 6.1 -5.8 92.8
Jimmy Speigner R 61 89.5 6.3 -6.0 92.6
Brad Thompson R 56 90.0 6.5 -10.2 92.0
Miguel Batista R 319 91.1 6.5 -6.7 93.0
Paul Maholm L 50 88.5 6.6 6.5 90.6
Zach Duke L 55 88.9 6.7 10.0 91.4
Gil Meche R 60 91.2 6.8 -4.9 93.0
J.J. Putz R 53 89.7 7.0 -6.2 93.0
Oscar Villarreal R 93 90.0 7.0 -6.8 92.9
Chad Gaudin R 437 90.6 7.1 -6.8 93.0
Carlos Zambrano R 113 90.6 7.1 -5.8 93.0
Sean White R 175 91.1 7.2 -8.2 92.9
Eric O'Flaherty L 120 90.2 7.2 6.3 92.8
Jesse Litsch R 62 89.1 7.2 -5.1 92.8
Kip Wells R 82 90.6 7.3 -7.1 93.0
Vicente Padilla R 397 90.9 7.4 -6.9 93.0
Robert Janssen R 102 90.8 7.4 -3.3 93.0
You'll notice that the vertical movement column is still positive for all these pitchers. That's the case because the value is calculated relative to the movement of a theoretical reference pitch that is spinless but thrown in the same way as the pitch in question.
So then to get a feel for what these vertical measurements mean, we can compare them to some pitchers who do not throw a sinking fastball but who do throw their fastballs in the same velocity range. For example, Brad Penny has thrown 230 pitches in this velocity range with an average vertical movement of 12.1 inches. Brandon McCarthy has thrown 264 with a value of 12.1, Randy Wolf has thrown 456 at 11.1, and Jon Garland has 585 at 10.7. What this indicates is that a four-seamer thrown in the same range drops 10 to 12 inches less than the theoretical reference pitch and so our sinkerballers throw pitches that sink 6 to 9 inches more than that. This seems realistic and of course the pitchers near the top of the list (Hernandez, Lowe, Halladay, Webb, Cook) are all the usual suspects.
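For those who would like to run a similar first approximation, here is a minimal Python sketch of the filter-and-average step. It is not the original query. The dictionary keys, the 50-pitch minimum, and the reading of "appropriate horizontal break" as arm-side movement (negative for right-handers, positive for left-handers) are assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean

def sinker_table(pitches, min_mph=87.0, max_mph=93.0, min_pitches=50):
    """Average movement per pitcher for fastball-like pitches, most sink first.

    pitches: iterable of dicts with hypothetical keys 'name', 'throws'
    ('R' or 'L'), 'speed' (mph), 'vert' and 'horiz' (inches of movement).
    """
    groups = defaultdict(list)
    for p in pitches:
        if not (min_mph <= p["speed"] <= max_mph):
            continue
        # Assumed fastball test: the ball tails toward the pitcher's arm side
        arm_side = p["horiz"] < 0 if p["throws"] == "R" else p["horiz"] > 0
        if arm_side:
            groups[(p["name"], p["throws"])].append(p)
    rows = []
    for (name, throws), kept in groups.items():
        if len(kept) < min_pitches:
            continue
        rows.append({
            "name": name,
            "throws": throws,
            "pitches": len(kept),
            "avg_vel": mean(k["speed"] for k in kept),
            "vert": mean(k["vert"] for k in kept),
            "horiz": mean(k["horiz"] for k in kept),
            "max_vel": max(k["speed"] for k in kept),
        })
    # A smaller vertical value means the pitch rises less relative to the
    # spinless reference pitch, which is to say it sinks more.
    rows.sort(key=lambda r: r["vert"])
    return rows
```

Sidearm pitchers would still need to be handled separately, as noted above, since their vertical movement is often negative.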
It's also interesting to note which pitchers have more tail on their sinkers (a negative horizontal movement indicates tailing into a right-handed hitter). Derek Lowe, with his combination of sink and movement, makes it very difficult on opposing hitters.
Posted by
Dan Agonistes
at
8:16 AM
0
comments
Thursday, June 14, 2007
Long Live the King
My column today focuses on creating pitcher profiles using the Gameday data with a case study on Felix Hernandez. It turns out that 415 of Hernandez's approximately 800 pitches in 2007 have been captured by PITCHf/x and so it's interesting to explore generating pitch profiles and various tables and graphs breaking down his pitches every which way from Sunday. For me, it was mostly an exercise in seeing how easy it would be to manipulate the data in various ways but it did quantify Felix's loss of movement and velocity following injury, his reluctance to throw the changeup against right-handed hitters, as well as his tendency to focus on the fastball in the first inning. Of course these are things that have been observed but it's nice to see the supporting evidence as well.
What I find most interesting (and piggy-backing off of the work of others) is that it shows the pitches can be identified and classified using the data (my algorithm was able to hit 95% agreement on his June 10th start based on David Cameron's charting). What I did was exceedingly simple, however, and requires customization for each pitcher. It'll be interesting to see if a system can be devised that classifies the pitches more broadly across a set of pitchers. Still, it seems human interaction will be required but hopefully it can be minimized.
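The column's actual rules are not reproduced here, so the following Python sketch only shows what a simple, per-pitcher rule set and an agreement check against a hand-charted start might look like. The thresholds, labels, and function names are invented for illustration and are not Hernandez's real profile.

```python
# Hypothetical per-pitcher rules, checked in order (first match wins).
# Each rule takes velocity in mph and vertical movement in inches.
HYPOTHETICAL_RULES = [
    ("fastball", lambda mph, vert: mph >= 93.0),
    ("changeup", lambda mph, vert: mph >= 86.0 and vert >= 3.0),
    ("slider", lambda mph, vert: mph >= 83.0),
    ("curveball", lambda mph, vert: True),
]

def classify(mph, vert, rules=HYPOTHETICAL_RULES):
    """Label one pitch using the first rule whose test passes."""
    for label, matches in rules:
        if matches(mph, vert):
            return label
    return "unknown"

def agreement(predicted, charted):
    """Fraction of pitches where the rule-based label matches the chart."""
    if len(predicted) != len(charted) or not charted:
        raise ValueError("need two non-empty label lists of equal length")
    matched = sum(p == c for p, c in zip(predicted, charted))
    return matched / len(charted)
```

Because the rules are tuned to one pitcher, a different rule list would have to be supplied for each new arm, which is the customization cost mentioned above.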
Don't forget about the chat tomorrow!
Posted by
Dan Agonistes
at
11:05 AM
0
comments
Sunday, June 10, 2007
Squaring It Up
In the comments on my post about plate discipline the question of "holes" was brought up. Specifically the idea is whether or not there are hitters who more consistently swing and miss on pitches that are in the strike zone. Well, with the data now becoming available we can add that to our list of things to look at.
The metric is called Square and is defined as the percentage of swings at pitches in the strike zone on which the hitter made contact. So if Brian Giles swung at 105 pitches in the strike zone and made contact with 101 of them his Square would be .962. Now perhaps "Square" isn't the best term since I've also included foul balls here but you get the idea. In any case here are the percentages for all hitters (and keep in mind we don't have all hitters but only those who've batted in the nine parks in which the PITCHf/x system is running).
Name Stand Pitches SinZ Square
Derek Jeter R 188 52 1.000
Chone Figgins L 144 42 1.000
Placido Polanco R 117 31 1.000
Esteban German R 126 27 1.000
David Dejesus L 209 55 0.982
David Ortiz L 224 50 0.980
Dan Johnson L 411 75 0.973
Jay Payton R 105 37 0.973
Maicer Izturis L 179 32 0.969
Coco Crisp L 168 31 0.968
Rafael Furcal L 406 90 0.967
Juan Pierre L 462 118 0.966
Reggie Willits R 135 28 0.964
Ichiro Suzuki L 552 109 0.963
Trot Nixon L 111 27 0.963
Brian Giles L 381 105 0.962
Kevin Millar R 108 25 0.960
Mark Grudzielanek R 186 48 0.958
Barry Bonds L 122 24 0.958
Ryan Sweeney L 126 22 0.955
Casey Kotchman L 466 106 0.953
Matt Murton R 101 21 0.952
Kenny Lofton L 569 144 0.951
Mike Lowell R 200 60 0.950
Albert Pujols R 108 20 0.950
John McDonald R 170 39 0.949
Mark Ellis R 472 114 0.947
Darin Erstad L 372 74 0.946
Frank Thomas R 517 108 0.944
Pablo Ozuna R 129 36 0.944
Miguel Tejada R 148 36 0.944
Ben Broussard L 126 36 0.944
Ramon Martinez R 124 35 0.943
David Dellucci L 111 34 0.941
Johnny Damon L 190 49 0.939
Lyle Overbay L 381 96 0.938
Frank Catalanotto L 330 78 0.936
Ramon Hernandez R 115 31 0.935
Yuniesky Betancourt R 376 106 0.934
Geoff Blum L 148 29 0.931
Andre Ethier L 413 128 0.930
Shannon Stewart R 464 126 0.929
Jason Kendall R 423 112 0.929
Ryan Zimmerman R 114 28 0.929
Mike Sweeney R 131 41 0.927
Alexis Rios R 467 108 0.926
Brian Mccann L 280 53 0.925
Hideki Matsui L 186 53 0.925
Nick Markakis L 168 52 0.923
Luis Gonzalez L 462 116 0.922
Erick Aybar L 235 64 0.922
Kevin Youkilis R 219 51 0.922
Raul Ibanez L 514 126 0.921
Melvin Mora R 189 50 0.920
Orlando Cabrera R 518 137 0.920
Mike Cuddyer R 155 37 0.919
Brian Roberts L 134 37 0.919
Jerry Hairston R 124 37 0.919
Robinson Cano L 171 36 0.917
Brendan Harris R 165 48 0.917
Nick Swisher R 149 24 0.917
Reggie Willits L 280 59 0.915
Ramon Vazquez L 184 47 0.915
Mark Derosa R 113 35 0.914
Ian Kinsler R 630 163 0.914
J.D. Drew L 184 46 0.913
Jamie Burke R 107 23 0.913
Jose Lopez R 382 102 0.912
Dustin Pedroia R 158 45 0.911
Ken Griffey Jr. L 109 22 0.909
Jose Vidro L 349 87 0.908
Julio Lugo R 226 65 0.908
Gary Sheffield R 168 32 0.906
Kenji Jojima R 284 95 0.905
Sean Casey L 101 21 0.905
Victor Martinez L 100 21 0.905
A.J. Pierzynski L 376 115 0.904
Marco Scutaro R 192 52 0.904
Grady Sizemore L 143 31 0.903
Bengie Molina R 106 31 0.903
Jorge Posada L 136 41 0.902
Gary Matthews Jr. L 449 102 0.902
Jeff Kent R 453 142 0.901
Garret Anderson L 262 50 0.900
Luis Castillo L 121 30 0.900
Tadahito Iguchi R 490 146 0.897
Michael Young R 628 193 0.896
Kelly Johnson L 427 85 0.894
Jermaine Dye R 467 119 0.891
Josh Bard L 308 91 0.890
Adrian Beltre R 441 126 0.889
Howie Kendrick R 165 45 0.889
Curtis Granderson L 147 35 0.886
Aaron Hill R 461 129 0.884
Magglio Ordonez R 150 43 0.884
Doug Mientkiewicz L 113 17 0.882
Paul Konerko R 489 116 0.879
Shea Hillenbrand R 330 91 0.879
Justin Morneau L 137 33 0.879
Edgar Renteria R 379 82 0.878
Willie Harris L 208 49 0.878
Nomar Garciaparra R 470 152 0.875
Carlos Guillen L 112 24 0.875
Scott Thorman L 230 47 0.872
Hank Blalock L 332 78 0.872
Brady Clark R 160 39 0.872
Ivan Rodriguez R 118 31 0.871
Matt Diaz R 191 53 0.868
Alfonso Soriano R 127 30 0.867
Jason Giambi L 130 30 0.867
Marlon Byrd R 174 44 0.864
Gerald Laird R 504 138 0.862
Andy Laroche R 127 29 0.862
Mark Teixeira R 169 43 0.860
Derrek Lee R 146 43 0.860
Chris Woodward R 154 49 0.857
Joe Crede R 376 104 0.856
Vladimir Guerrero R 408 97 0.856
Adam Lind L 374 110 0.855
Khalil Greene R 419 136 0.853
Alex Cintron L 119 34 0.853
Adrian Gonzalez L 641 163 0.853
Juan Uribe R 353 101 0.851
Jose Guillen R 423 121 0.851
Joshua Barfield R 115 40 0.850
Willie Bloomquist R 156 46 0.848
Royce Clayton R 264 78 0.846
Alex Gordon L 163 39 0.846
Aubrey Huff L 123 39 0.846
Robb Quinlan R 135 45 0.844
Bobby Crosby R 489 127 0.843
Chris Stewart R 103 38 0.842
Jose Cruz Jr. L 303 82 0.841
Russell Martin R 543 145 0.841
Travis Buck L 393 88 0.841
Marcus Giles R 504 156 0.840
Vernon Wells R 461 124 0.839
Wilson Betemit L 224 49 0.837
Manny Ramirez R 227 61 0.836
Brad Wilkerson L 263 60 0.833
Bobby Abreu L 225 60 0.833
Emil Brown R 153 36 0.833
Elijah Dukes R 208 47 0.830
Richie Sexson R 485 123 0.829
Kevin Kouzmanoff R 393 117 0.829
Ty Wigginton R 221 64 0.828
Mike Cameron R 589 157 0.828
Eric Chavez L 542 136 0.824
Tony Pena R 130 34 0.824
Jose Molina R 157 45 0.822
Casey Blake R 167 42 0.810
Delmon Young R 232 68 0.809
Jason Phillips R 236 67 0.806
Mark Teixeira L 517 102 0.804
Jeffrey Francoeur R 381 98 0.796
Jim Thome L 359 78 0.795
Gary Matthews Jr. R 118 24 0.792
Jose Cruz Jr. R 108 28 0.786
Milton Bradley L 135 46 0.783
Nick Swisher L 410 86 0.779
Michael Napoli R 349 85 0.776
Rob Mackowiak L 313 76 0.776
Troy Glaus R 353 76 0.776
Mark Teahen L 213 49 0.776
Victor Diaz R 159 49 0.776
Mike Piazza R 186 40 0.775
Carl Crawford L 242 62 0.774
Brandon Inge R 143 35 0.771
Matt Stairs L 202 51 0.765
Alex Rodriguez R 205 55 0.764
Hiram Bocachica R 118 25 0.760
Terrmel Sledge L 301 88 0.750
Rocco Baldelli R 135 32 0.750
Craig Monroe R 139 30 0.733
Craig Wilson R 105 26 0.731
Travis Hafner L 138 22 0.727
Sammy Sosa R 588 164 0.726
Andruw Jones R 433 112 0.723
B.J. Upton R 263 75 0.720
Chipper Jones L 194 39 0.718
Olmedo Saenz R 149 39 0.718
Carlos Pena L 184 46 0.717
Nelson Cruz R 374 100 0.700
Jack Cust L 318 66 0.697
Jason Smith L 103 28 0.679
Rob Bowen L 114 26 0.654
Russ Branyan L 164 49 0.612
Generally speaking, the top of the list is populated with contact hitters, while the bottom has more power hitters and free swingers, as you might expect. A couple of surprises, to me anyway, are David Ortiz and Dan Johnson so high on the list and Chipper Jones (batting left-handed) so low. The average is 87.2%, which validates the feeling I always get, especially when attending a game in person, that when a pitch is in the strike zone a major league hitter usually takes advantage.
Posted by Dan Agonistes at 7:42 AM
2 comments
Labels: MLBAM, Plate Discipline
Thursday, June 07, 2007
Quantifying Plate Discipline
In my column this morning on Baseball Prospectus (subscription required, but well worth it), among other things, I take another crack at the PITCHf/x Gameday data. Inspired in part by this fascinating article, I created a couple of metrics to quantify plate discipline. They are:
Without further ado, here's the list of all players who have 200 or more pitches tracked this season, sorted by "Fish". One big caveat: only 24% of all pitches in 2007 have been tracked by the system, and there is a heavy bias toward the AL West because of the parks in which the system is installed.
Name Stand Pitches Swing Fish BadBall Eye
A.J. Pierzynski L 357 0.602 0.469 0.813 0.117
Garret Anderson L 260 0.515 0.467 0.750 0.350
Delmon Young R 200 0.550 0.454 0.630 0.228
Rob Mackowiak L 297 0.525 0.438 0.738 0.240
Hank Blalock L 332 0.521 0.430 0.632 0.183
Erick Aybar L 235 0.532 0.430 0.869 0.217
Carl Crawford L 216 0.519 0.427 0.738 0.171
Jeffrey Francoeur R 337 0.546 0.425 0.775 0.094
Vladimir Guerrero R 403 0.524 0.421 0.771 0.142
Adrian Beltre R 430 0.512 0.403 0.694 0.234
Michael Young R 585 0.523 0.403 0.766 0.284
Eric Chavez L 519 0.503 0.394 0.736 0.225
Kenji Jojima R 270 0.522 0.389 0.878 0.217
Nomar Garciaparra R 436 0.557 0.388 0.800 0.132
Juan Pierre L 431 0.480 0.385 0.916 0.302
Yuniesky Betancourt R 345 0.496 0.383 0.806 0.279
Joe Crede R 376 0.503 0.381 0.729 0.238
Jason Phillips R 203 0.493 0.381 0.721 0.303
Vernon Wells R 417 0.501 0.380 0.713 0.216
Jose Lopez R 347 0.478 0.375 0.853 0.288
Ichiro Suzuki L 515 0.435 0.369 0.862 0.326
Khalil Greene R 407 0.514 0.367 0.595 0.247
Orlando Cabrera R 517 0.468 0.362 0.838 0.279
Brian McCann L 267 0.434 0.358 0.783 0.282
Kevin Kouzmanoff R 363 0.499 0.350 0.724 0.167
Lyle Overbay L 381 0.465 0.349 0.704 0.230
Sammy Sosa R 532 0.492 0.349 0.625 0.205
Royce Clayton R 227 0.515 0.348 0.457 0.194
Adam Lind L 336 0.482 0.348 0.719 0.253
Andruw Jones R 389 0.478 0.344 0.686 0.232
Jose Guillen R 394 0.475 0.344 0.636 0.232
Alexis Rios R 413 0.426 0.343 0.735 0.351
Marcus Giles R 452 0.511 0.341 0.674 0.211
Bobby Crosby R 474 0.451 0.338 0.652 0.327
Casey Kotchman L 463 0.434 0.338 0.814 0.285
Gary Matthews Jr. L 449 0.441 0.334 0.781 0.223
David DeJesus L 209 0.445 0.333 0.816 0.340
Nelson Cruz R 374 0.457 0.330 0.690 0.268
Andre Ethier L 376 0.500 0.329 0.803 0.200
Jose Vidro L 336 0.461 0.329 0.870 0.200
Juan Uribe R 334 0.476 0.328 0.651 0.232
Raul Ibanez L 474 0.456 0.328 0.760 0.229
Gerald Laird R 484 0.459 0.326 0.698 0.290
Jason Kendall R 405 0.432 0.325 0.870 0.373
Jermaine Dye R 452 0.442 0.324 0.753 0.267
Mark Teahen L 213 0.437 0.324 0.636 0.307
Adrian Gonzalez L 590 0.453 0.322 0.767 0.225
Darin Erstad L 372 0.398 0.318 0.797 0.413
Shea Hillenbrand R 330 0.445 0.308 0.821 0.297
Brad Wilkerson L 263 0.414 0.306 0.755 0.303
Paul Konerko R 457 0.420 0.303 0.765 0.290
Richie Sexson R 448 0.446 0.303 0.732 0.211
Mark Ellis R 440 0.409 0.302 0.851 0.380
Mark Teixeira L 501 0.399 0.302 0.657 0.276
Kenny Lofton L 514 0.434 0.300 0.818 0.239
Travis Buck L 377 0.416 0.300 0.662 0.255
Michael Napoli R 346 0.419 0.299 0.683 0.322
Kelly Johnson L 384 0.393 0.292 0.787 0.315
Josh Bard L 283 0.470 0.292 0.878 0.177
Edgar Renteria R 346 0.405 0.290 0.688 0.275
Jeff Kent R 420 0.476 0.285 0.611 0.176
Ian Kinsler R 591 0.423 0.280 0.784 0.268
Terrmel Sledge L 285 0.446 0.272 0.609 0.217
Rafael Furcal L 371 0.391 0.270 0.857 0.353
Shannon Stewart R 437 0.410 0.269 0.810 0.327
B.J. Upton R 225 0.431 0.269 0.528 0.227
Frank Thomas R 478 0.377 0.267 0.838 0.335
Troy Glaus R 306 0.376 0.264 0.706 0.330
Jose Cruz Jr. L 285 0.428 0.262 0.591 0.216
Aaron Hill R 417 0.415 0.253 0.729 0.302
Willie Harris L 205 0.400 0.248 0.765 0.239
Russell Martin R 506 0.401 0.241 0.757 0.290
Luis Gonzalez L 428 0.395 0.236 0.889 0.256
Nick Swisher L 384 0.365 0.232 0.559 0.307
Tadahito Iguchi R 462 0.422 0.230 0.678 0.243
Frank Catalanotto L 309 0.372 0.228 0.821 0.319
Mike Cameron R 539 0.399 0.224 0.569 0.237
Reggie Willits L 274 0.328 0.219 0.818 0.455
Brian Giles L 381 0.396 0.218 0.804 0.269
Jim Thome L 330 0.358 0.203 0.545 0.239
Bobby Abreu L 203 0.374 0.185 0.545 0.274
Jack Cust L 294 0.310 0.175 0.563 0.342
Wilson Betemit L 208 0.327 0.155 0.700 0.282
Dan Johnson L 387 0.282 0.148 0.737 0.346
Of course, the interesting thing is that you can then plot the Fish and Eye metrics on a graph and split the graph into four quadrants. Each quadrant creates a little profile that can be used to characterize a hitter's plate discipline. Note that the bottom left is the sweet spot.
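The quadrant idea can be sketched in a few lines of Python. This is only an illustration, not the method behind the original graph: the helper names (`parse_row`, `quadrant`), the choice of the sample median as the dividing line, and the handful of rows copied from the table above are all assumptions made for the example. The sketch uses the Fish and Eye numbers exactly as printed and does not depend on how the metrics are defined.

```python
from statistics import median

# A few rows copied from the table above.
# Columns: Name Stand Pitches Swing Fish BadBall Eye
ROWS = """\
A.J. Pierzynski L 357 0.602 0.469 0.813 0.117
Garret Anderson L 260 0.515 0.467 0.750 0.350
Reggie Willits L 274 0.328 0.219 0.818 0.455
Dan Johnson L 387 0.282 0.148 0.737 0.346
Bobby Abreu L 203 0.374 0.185 0.545 0.274
Gary Matthews Jr. L 449 0.441 0.334 0.781 0.223
"""


def parse_row(line):
    """Split from the right so multi-word names such as 'Gary Matthews Jr.' survive."""
    parts = line.split()
    swing, fish, bad_ball, eye = (float(x) for x in parts[-4:])
    return {
        "name": " ".join(parts[:-6]),
        "stand": parts[-6],
        "pitches": int(parts[-5]),
        "swing": swing,
        "fish": fish,
        "bad_ball": bad_ball,
        "eye": eye,
    }


def quadrant(fish, eye, fish_cut, eye_cut):
    """Return ('low' or 'high', 'low' or 'high') for Fish and Eye against the cutoffs."""
    return (
        "low" if fish < fish_cut else "high",
        "low" if eye < eye_cut else "high",
    )


hitters = [parse_row(line) for line in ROWS.splitlines() if line.strip()]

# Assumption: split the graph at the sample medians. The original graph may
# have used league averages or some other dividing line.
fish_cut = median(h["fish"] for h in hitters)
eye_cut = median(h["eye"] for h in hitters)

profiles = {
    h["name"]: quadrant(h["fish"], h["eye"], fish_cut, eye_cut) for h in hitters
}

for name, (fish_side, eye_side) in profiles.items():
    print(f"{name:20s} Fish: {fish_side:4s}  Eye: {eye_side}")
```

With these six rows and median cutoffs, Bobby Abreu is the only hitter who lands low on both measures, which corresponds to the bottom-left corner. A different set of hitters or a different cutoff would move the lines and could change the labels.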
Posted by Dan Agonistes at 11:17 AM
5 comments
Labels: Baseball Prospectus, MLBAM, Plate Discipline
Thursday, May 31, 2007
The Physics of Drag
My column today on Baseball Prospectus delves once again into the PITCHf/x data tracked by the new Gameday application. This time I take a look at the drag on a pitched ball and square the data with the description of the model discussed by Robert Adair in The Physics of Baseball.
To answer the most frequently asked question thus far: no, I haven't looked at Tim Wakefield in any depth. I did see, however, that his average pitch (and I have 346 to look at) lost exactly 10% of its velocity. Overall, though, that percentage decrease is in line with the chart below (a version of which is also in the original article), since his average pFX (a measure of the break of the pitch) was 8.6 and his average start speed was 68.5 miles per hour.
However, that percentage does not seem to differ by either the break length (a different measure of break introduced this year) or the pFX value. It's also interesting to note that all but one pitch came out of his hand at less than 79 miles per hour. I think it's likely that the Magnus force placed on a knuckler as it moves in various directions tends to slow it down more than one would otherwise think based on the slow speed and lack of spin.
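For readers who want to reproduce the percentage, here is a minimal Python sketch of the arithmetic. The function names are made up for the example. The only inputs it needs are a start speed and an end speed per pitch, and it assumes the 10% figure is an average of per-pitch losses (it could just as well have been computed from average speeds). The 61.65 mph end speed is derived from the numbers above purely for illustration and is not a reported figure.

```python
def velocity_loss_fraction(start_speed_mph, end_speed_mph):
    """Fraction of release speed lost by the time the pitch reaches the plate."""
    if start_speed_mph <= 0:
        raise ValueError("start speed must be positive")
    return (start_speed_mph - end_speed_mph) / start_speed_mph


def mean_velocity_loss(pitches):
    """Average the per-pitch loss over (start_speed, end_speed) pairs."""
    losses = [velocity_loss_fraction(start, end) for start, end in pitches]
    return sum(losses) / len(losses)


# Illustrative only: a pitch released at the 68.5 mph average start speed
# that loses 10% of its velocity arrives at 68.5 * 0.9 = 61.65 mph.
start = 68.5
end = start * (1 - 0.10)
print(f"{end:.2f} mph at the plate, loss = {velocity_loss_fraction(start, end):.1%}")
```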
Posted by Dan Agonistes at 12:59 PM
3 comments
Thursday, May 24, 2007
Deep Data Dive
As promised yesterday, my column on Baseball Prospectus this morning dives deeper into the PITCHf/x data tracked by the 2007 version of Gameday (a new update was released on May 10th and is much more performant).
In this article I look at the velocity and location data for over 40,000 pitches and find that, given a one-inch margin of error, the system agrees with umpires 90% of the time. That's not bad, and it's very similar to the QuesTec results Robert Adair published in "Cameras and Computers, or Umpires?" in Volume 32 of SABR's The Baseball Research Journal.
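One way to picture an agreement rate with a margin of error is the Python sketch below. It is not the method used in the column, whose details aren't given here. The field names (`px`, `pz`, `sz_top`, `sz_bot`, all in feet), the rectangular zone that ignores the radius of the ball, and the rule that any called pitch within one inch of the zone boundary counts as agreement either way are all assumptions made for the example.

```python
from dataclasses import dataclass

PLATE_HALF_WIDTH_FT = 17.0 / 2 / 12  # home plate is 17 inches wide
MARGIN_FT = 1.0 / 12                 # one-inch margin of error


@dataclass
class CalledPitch:
    px: float      # horizontal location at the plate, feet from center
    pz: float      # height at the plate, feet
    sz_top: float  # top of the batter's strike zone, feet
    sz_bot: float  # bottom of the batter's strike zone, feet
    call: str      # umpire's call: "S" (called strike) or "B" (ball)


def signed_distance_outside(p):
    """Positive when the pitch is outside the zone, negative when inside (feet)."""
    return max(abs(p.px) - PLATE_HALF_WIDTH_FT, p.sz_bot - p.pz, p.pz - p.sz_top)


def agrees(p, margin=MARGIN_FT):
    """True if the tracked location is consistent with the call, allowing the margin."""
    d = signed_distance_outside(p)
    if p.call == "S":
        return d <= margin   # in the zone, or no more than `margin` outside it
    return d >= -margin      # out of the zone, or no more than `margin` inside it


def agreement_rate(pitches, margin=MARGIN_FT):
    """Share of called pitches where the system and the umpire agree."""
    return sum(agrees(p, margin) for p in pitches) / len(pitches)


# Made-up pitches, not real data.
sample = [
    CalledPitch(0.00, 2.5, 3.5, 1.5, "S"),  # middle of the zone, called strike
    CalledPitch(0.75, 2.5, 3.5, 1.5, "S"),  # half an inch off the edge, called strike
    CalledPitch(1.00, 2.5, 3.5, 1.5, "S"),  # well outside, called strike
    CalledPitch(1.00, 2.5, 3.5, 1.5, "B"),  # well outside, called ball
    CalledPitch(0.00, 2.5, 3.5, 1.5, "B"),  # middle of the zone, called ball
]
print(f"agreement: {agreement_rate(sample):.0%}")
```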
Posted by Dan Agonistes at 12:44 PM
3 comments