My column today focuses on creating pitcher profiles from the Gameday data, with a case study on Felix Hernandez. It turns out that 415 of Hernandez's approximately 800 pitches in 2007 have been captured by PITCHf/x, so it's interesting to explore generating pitch profiles, along with tables and graphs that break down his pitches every which way. For me, it was mostly an exercise in seeing how easy it would be to manipulate the data in various ways, but it did quantify Felix's loss of movement and velocity following his injury, his reluctance to throw the changeup against right-handed hitters, and his tendency to focus on the fastball in the first inning. Of course, these are things that have already been observed, but it's nice to see the supporting evidence as well.
What I find most interesting (and here I'm piggybacking on the work of others) is that the pitches can be identified and classified using the data: my algorithm reached 95% agreement with David Cameron's charting of his June 10th start. What I did was exceedingly simple, however, and requires customization for each pitcher. It'll be interesting to see whether a system can be devised that classifies pitches more broadly across a set of pitchers. It seems some human interaction will still be required, but hopefully it can be minimized.
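To make the idea concrete, here is a minimal sketch of what a simple, per-pitcher classifier and an agreement check against hand charting could look like. It is an illustration under stated assumptions, not the algorithm used in the column: the `Pitch` fields, the `HYPOTHETICAL_RULES` cutoffs, and the sample pitches are all made up for the example and are not actual PITCHf/x field names or Hernandez's data.

```python
from dataclasses import dataclass


@dataclass
class Pitch:
    # Hypothetical fields; real PITCHf/x records use different names.
    speed_mph: float      # velocity of the pitch
    vert_break_in: float  # vertical movement in inches


# Illustrative cutoffs only. In practice these would be tuned by hand
# for each pitcher, which is the customization step described above.
HYPOTHETICAL_RULES = {
    "fastball_min_mph": 93.0,
    "offspeed_min_mph": 86.0,
    "slider_max_vert_break_in": 3.0,
}


def classify(pitch, rules):
    """Assign a pitch type from velocity and vertical break thresholds."""
    if pitch.speed_mph >= rules["fastball_min_mph"]:
        return "fastball"
    if pitch.speed_mph >= rules["offspeed_min_mph"]:
        # In this velocity band, low vertical break is read as a slider.
        if pitch.vert_break_in < rules["slider_max_vert_break_in"]:
            return "slider"
        return "changeup"
    return "curveball"


def agreement(predicted, charted):
    """Fraction of pitches where the classifier matches the human chart."""
    if len(predicted) != len(charted):
        raise ValueError("predicted and charted must be the same length")
    if not predicted:
        return 0.0
    matches = sum(p == c for p, c in zip(predicted, charted))
    return matches / len(predicted)


# Made-up pitches and a made-up hand chart, for demonstration only.
sample_pitches = [
    Pitch(96.0, 9.0),
    Pitch(88.0, 1.0),
    Pitch(87.0, 6.0),
    Pitch(80.0, -5.0),
]
sample_chart = ["fastball", "slider", "changeup", "slider"]

predicted = [classify(p, HYPOTHETICAL_RULES) for p in sample_pitches]
print(predicted, agreement(predicted, sample_chart))
```

Keeping the cutoffs in a separate dictionary is what makes the per-pitcher customization explicit: a broader system would have to learn or supply those values for each pitcher rather than rely on hand tuning.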
Don't forget about the chat tomorrow!