Friday, December 30, 2005

Love to Bunt

As a followup to my latest article on bunting on THT and as a preview of the next article, here are a few numbers on sacrifices that were attempted and successful over the past three seasons (2003-2005). You'll have to read the article to ascertain how sacrifice attempts are counted and what the success criteria are.

Att Succ Pct
7252 5550 0.765306

And by position...a subject I discuss more in the next article...

Position Att Succ Pct
7 356 304 0.854
9 267 223 0.835
4 861 719 0.835
3 137 114 0.832
6 1006 836 0.831
8 693 571 0.824
2 549 451 0.821
10 84 69 0.821
5 378 292 0.772
11 217 161 0.742
1 2704 1810 0.669
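
For anyone who wants to check the arithmetic, here is a minimal Python sketch that pools the positional rows above. It assumes that position 1 is the pitcher (the standard scoring number); the helper name `success_rate` is made up for illustration.

```python
# (attempts, successes) by position number, copied from the table above.
by_position = {
    7: (356, 304), 9: (267, 223), 4: (861, 719), 3: (137, 114),
    6: (1006, 836), 8: (693, 571), 2: (549, 451), 10: (84, 69),
    5: (378, 292), 11: (217, 161), 1: (2704, 1810),
}

def success_rate(rows):
    """Pooled success rate over an iterable of (attempts, successes) pairs."""
    rows = list(rows)
    attempts = sum(a for a, _ in rows)
    successes = sum(s for _, s in rows)
    return successes / attempts

overall = success_rate(by_position.values())
pitchers = success_rate([by_position[1]])  # assumes position 1 = pitcher
non_pitchers = success_rate(v for k, v in by_position.items() if k != 1)

print(f"overall      {overall:.3f}")
print(f"pitchers     {pitchers:.3f}")
print(f"non-pitchers {non_pitchers:.3f}")
```

The positional rows add up to the overall line (7252 attempts, 5550 successes), and splitting out position 1 shows how much the pitchers pull the overall rate down.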

And by inning...

Inning Att Succ Pct
1 512 448 0.875
2 783 562 0.718
3 998 714 0.715
4 704 527 0.749
5 961 705 0.734
6 683 551 0.807
7 822 647 0.787
8 819 638 0.779
9 588 450 0.765
10 159 138 0.868
11 105 73 0.695
12 57 46 0.807
13 25 21 0.840
14 15 13 0.867
15 15 12 0.800
16 3 2 0.667
17 2 2 1.000
19 1 1 1.000

I also took a look at those who had 20 or more sacrifice attempts in the period. Here they are ranked by percentage...

>=20 Attempts
Att Succ Pct
Luis Castillo 47 47 1.000
Timo Perez 22 22 1.000
Jerry Hairston 24 23 0.958
Julio Lugo 22 21 0.955
Nook Logan 21 20 0.952
Derek Jeter 41 39 0.951
Deivi Cruz 20 19 0.950
Ryan Freel 19 18 0.947
Alex Cora 30 28 0.933
Miguel Cairo 30 28 0.933
Paul Lo Duca 27 25 0.926
Juan Uribe 34 31 0.912
Omar Vizquel 55 50 0.909
Edgar Renteria 22 20 0.909
Matt Morris 32 29 0.906
Brandon Inge 21 19 0.905
Javier Vazquez 21 19 0.905
Kenny Lofton 21 19 0.905
David Eckstein 40 36 0.900
Luis Gonzalez 20 18 0.900
Michael Tucker 20 18 0.900
Endy Chavez 38 34 0.895
Adam Everett 51 45 0.882
Livan Hernandez 41 36 0.878
Coco Crisp 41 36 0.878
Juan Pierre 57 50 0.877
Tony Womack 32 28 0.875
Jack Wilson 39 34 0.872
Randy Winn 31 27 0.871
Craig Biggio 23 20 0.870
Luis Matos 23 20 0.870
Ramon Santiago 30 26 0.867
Aaron Miles 22 19 0.864
Chone Figgins 35 30 0.857
Russ Ortiz 28 24 0.857
Henry Blanco 21 18 0.857
Andy Pettitte 21 18 0.857
Ramon Martinez 20 17 0.850
Melvin Mora 33 28 0.848
Royce Clayton 50 42 0.840
Angel Berroa 37 31 0.838
Omar Infante 24 20 0.833
Joe McEwing 24 20 0.833
Scott Podsednik 35 29 0.829
Neifi Perez 40 33 0.825
Dave Roberts 28 23 0.821
Tom Glavine 28 23 0.821
Kris Benson 33 27 0.818
Nick Green 22 18 0.818
Claudio Vargas 22 18 0.818
Brian Roberts 32 26 0.813
Marcus Giles 25 20 0.800
Kirk Rueter 20 16 0.800
Roy Oswalt 34 27 0.794
Jason Schmidt 43 34 0.791
Alex Sanchez 52 41 0.788
Brad Ausmus 23 18 0.783
Placido Polanco 23 18 0.783
Ronnie Belliard 23 18 0.783
Brett Tomko 45 35 0.778
Alex Cintron 27 21 0.778
Jeff Suppan 31 24 0.774
Willie Harris 22 17 0.773
Steve Trachsel 30 23 0.767
Woody Williams 30 23 0.767
Jamey Carroll 32 24 0.750
Mike Hampton 24 18 0.750
Jerome Williams 24 18 0.750
Jose Valentin 24 18 0.750
Corey Patterson 24 18 0.750
Kerry Wood 20 15 0.750
Miguel Olivo 20 15 0.750
Jeff Weaver 20 15 0.750
Brad Penny 20 15 0.750
Aaron Cook 20 15 0.750
Randy Wolf 20 15 0.750
Juan Castro 26 19 0.731
Josh Fogg 33 24 0.727
Carlos Zambrano 22 16 0.727
Cristian Guzman 53 38 0.717
Quinton McCracken 21 15 0.714
Cory Lidle 24 17 0.708
Brett Myers 34 24 0.706
Odalis Perez 34 24 0.706
John Thomson 27 19 0.704
Mark Prior 27 19 0.704
Greg Maddux 36 25 0.694
Rafael Furcal 26 18 0.692
Aaron Rowand 26 18 0.692
Oliver Perez 29 20 0.690
Shawn Estes 28 19 0.679
Cesar Izturis 39 26 0.667
Brandon Webb 36 24 0.667
Adam Eaton 24 16 0.667
Kazuhisa Ishii 29 19 0.655
Jake Peavy 29 19 0.655
Ben Sheets 33 21 0.636
John Patterson 22 14 0.636
AJ Burnett 27 17 0.630
Matt Clement 21 13 0.619
Josh Beckett 31 19 0.613
Brian Lawrence 31 19 0.613
Carl Pavano 23 14 0.609
Dontrelle Willis 22 13 0.591
Jason Jennings 26 15 0.577
Chris Carpenter 27 15 0.556
Vicente Padilla 20 11 0.550
Paul Wilson 26 13 0.500
Al Leiter 22 11 0.500
Kevin Millwood 23 11 0.478
Horacio Ramirez 21 10 0.476
Kip Wells 24 11 0.458
Eric Milton 22 10 0.455
Aaron Harang 20 8 0.400
Ramon Ortiz 19 7 0.368
Mark Redman 23 8 0.348
Doug Davis 29 9 0.310

Note that Matt Morris tops the list of pitchers. Interesting that Luis Castillo has been perfect. Stay tuned.

Saturday, December 24, 2005

Jacque in Wrigley

David Appleman of FanGraphs has a nice article on THT on the new Cub Jacque Jones. He shows that Jones's decrease in batting average since 2003 has likely been because he's hitting fewer line drives. Interestingly, when he hits fly balls they leave the yard with alarming regularity. The problem is that he's an extreme ground ball hitter. That's not good news for him or the Cubs as he shifts to Wrigley where the grass is high and the alleys short.

Tuesday, December 20, 2005

Could Be Worse: Cubs 2006

Well, I've held my tongue long enough and so I just have to share my thoughts on the Cubs 2005-2006 offseason thus far.

  • Signed Glendon Rusch for 2 years at $6M. Not a bad move to sign him as an insurance policy since he can both start and relieve. Good move.

  • Picked up options on Todd Walker and Scott Williamson. Both good moves since Walker is probably as good at second base as they would have been able to get and Williamson is worth the risk.

  • Declined option on Jeromy Burnitz and bought him out for $500K. Not a bad move if, and I say again if, you actually have a plan for filling the right field slot. Burnitz had the kind of year you expect from him but he contributes both defensively and on the bases.

  • Signed Ryan Dempster for 3 years at $15.5M. Probably overpaid for a guy who, to me anyway, just isn't cut out to be a closer. Yes, he had a great run in 2005 but I'll be surprised if we see that out of him again.

  • Signed Neifi Perez for two years and too much money. Why two years? I don't mind having him back as a backup but we all know that Dusty will rely on him in all kinds of situations and he'll bat leadoff or second and do a lot of drag bunting...yada yada yada. A bad move based on the two year contract and the fact that he'll take at bats from Ronnie Cedeno who looked good last year and is having a good winter.

  • Traded Jon Leicester to the Rangers for a PTBNL. A 27 year old who didn't really figure in the Cubs' plans. Remains to be seen whether this one is good or bad.

  • Signed LHP Scott Eyre to two years plus an option for 2008. Now 34 years old and coming off the best season of his career. Look for him to regress. Will Ohman wasn't quite as good last year but probably good enough to eat those innings as the LOOGY.

  • Signed RHP Bobby Howry to a three year deal. Back to back good seasons in Cleveland and at 32 years old should be good for one or two more. Not a bad deal and makes me wonder why they spent $15.5M on Dempster.

  • Traded Jermaine Van Buren to the Red Sox for a PTBNL. Hard thrower whose strikeout rates increased the last two seasons in AA and AAA. Hope they get somebody decent or this might be a big bust.

  • Acquired Juan Pierre for Sergio Mitre, Ricky Nolasco, and Renyel Pinto. Nolasco at 22 years old went 14-3 at AA with a 173/46 K/BB ratio in 161.7 innings. Pinto has always had more control issues but did fairly well at AA last year (101 hits in 129.7 IP). And of course we all know Mitre has a great arm and could be the Marlins' number two starter. This was a lot to give up for an outfielder with a .350 SLUG. Major injuries to Mark Prior, Kerry Wood, and/or Carlos Zambrano could have the Cubs wishing they still had these three arms. Too early to tell.

  • Signed John Mabry to a one-year deal and designated Jose Macias for assignment. Half of this I like. Mabry, on the other hand, struggled last year and probably won't be getting any better at age 35. I've often called for the Cubs to strengthen their bench by acquiring some hitters but this ain't it.

  • Picked up LHP Carlos Jan, RHP Geivy Garcia, and INF Aaron Rifkin in the Rule 5 draft. Then sent Rifkin to the Rockies for a PTBNL. Jan hasn't played above A ball and is a hard thrower with control issues. Garcia has struggled in A ball the last two seasons and isn't going anywhere. We'll see what they get for Rifkin but there are no contributors here. They also lost Juan Mateo who struck out 123 in 109.7 innings and walked only 27 in A-ball last year so it appears the Rule 5 was a net negative this year.

  • Signed Jacque Jones for three years and $16M. Ouch. I said earlier that it was OK to let Burnitz walk if you had a plan. Obviously they didn't and had to grab what they could. On the wrong side of 30 Jones hasn't played well since 2003 and so I wouldn't expect him to even replace Burnitz's declining Win Shares. Bad idea for one year let alone three.

  • I also see that they've offered a contract to Corey Patterson since they now need him to compete for the left field job with Matt Murton with Pierre in center and Jones in right. Those four together combine for a pretty weak offensive outfield.

    Thus far I'd give Jim Hendry and the gang a C-. Could be worse. And it probably is as Kerry Wood will not be ready by the start of the season. Just waiting for the other shoe (or Achilles tendon) to drop regarding the health of Mark Prior.

    Projected Opening day lineup
    CF Pierre
    2B Walker
    1B Lee
    3B Ramirez
    RF Jones
    C Barrett
    LF Patterson/Murton
    SS Cedeno
    P Prior

    Bunting Redux

    Back in May I posted some thoughts about the probability of a successful sacrifice. I've now extended that a little bit in an article posted this morning on THT. Turns out my previous estimate was a little low and sacrifices are probably successful around 76% of the time. Enjoy.

    Saturday, December 17, 2005

    The Crash of 2006?

    Last week T.J. Quinn of the New York Daily News wrote an article entitled "Post-steroid era is eye-opener" that discussed the drug issue in baseball in the context of the winter meetings in Dallas.

    Interestingly Quinn makes the argument that baseball's new and tougher drug policies are making it more difficult for statistically-minded front offices since it is harder to evaluate players based on statistics.

    "With baseball ushering in a new policy for next season - the Players Association approved it unanimously this past week - scouts and executives agree that they are still sorting out the end of baseball's steroid era. Statistics compiled in the years before baseball started incrementally toughening its policy in 2003 are considered suspect, making it tough to plug numbers into a computer to determine a player's value. "

    That makes sense since some players' career trajectories have been altered by the use of steroids. With baseball cleaning up its act, it stands to reason that those players who were using performance enhancers and have quit will suffer more precipitous declines in performance than they would have otherwise. This makes a GM's job harder if he's thinking about signing a veteran with an established level of performance.

    Quinn then discusses the situation from the perspective of the Mets GM Omar Minaya.

    "As for the more immediate problem of evaluating players who might have been doping for years, Minaya, a classic lifelong baseball man with a scouting background, says the tougher anti-doping rules have eliminated some of the guesswork from his job. 'The past couple of years we've been conscious of a potential problem and it seems to be getting rectified,' he said. 'Before, you wondered if a performance was enhanced or not. I trust the numbers the last two years.'"

    What's missing here, and what Quinn acknowledges, is that it is still easy to beat baseball's drug policy. Human growth hormone, for example, still cannot be tested for.

    However, Quinn believes that the amphetamine policy will have the larger effect.

    "The biggest question, some executives said, is how the banishment of amphetamines, announced only last month, will change the game. Players have relied on 'greenies' for decades for an extra boost of energy and intensity. Some players have said they simply cannot get through a six-month, 162-game season without mother's little helper. "

    Aside from greenies incorrectly being labeled "mother's little helper", which I think has traditionally been associated with Valium, I think he makes a pretty good point. Greenies have been used by far more players than steroids and so removing them will likely have a greater effect on the player population as a whole.

    However, since their effects are generally subtler and since there are negative side effects like decreased appetite and disrupted sleep patterns, players will probably adapt by taking legal stimulants or simply taking better care of themselves. And so statistically speaking I doubt we'll see numbers come crashing down.

    One of the hallmarks of humanity is its ability to adapt and survive. In 1998 some psychologists published a meta-study on childhood sexual abuse. What they found when looking at the subsequent mental health of those who had been abused and comparing it to those who hadn't, was that the difference between the two groups was just two-tenths of a standard deviation. In other words, those who had been abused had more problems later in life, but not to the degree that we've been conditioned to expect based on pop-psychology and modern media portrayals of abuse.

    This result triggered a firestorm of controversy which even included the United States Congress passing resolutions which condemned the analysis. Why? Because rather than be comforted by the fact that human beings are a resilient species, they saw it as providing ammunition for those who support pedophilia and other morally reprehensible acts. But we can have both. We can be glad that early traumas don't necessarily mean a wrecked psychological life and at the same time condemn acts that are immoral.

    Thursday, December 15, 2005

    The Sixth Tool

    Here's an interesting blog that purports to be written by an actual scout using a pseudonym. I have my doubts since the setup is too perfect and the writing too good, but it is well-written and interesting. It highlights the differences between traditional scouting, where looking for the "Sixth Tool" is paramount, and so-called "performance scouting" which, as the scout "Cutter" Jones says, is mostly hogwash.

    An analysis of some of the posts was done here.

    Absentee Blogger

    I haven't been writing as much on this blog in the last week because at Compassion we were in the midst of rolling out some software which, I can happily say, was put into production this morning. For those in IT you might find our little project interesting...

    Monday, December 12, 2005

    Schwarz on THT

    Alan Schwarz wrote a nice piece in The New York Times that gave prominent mention to The Hardball Times and one of our fearless leaders, Dave Studeman. What I love about the article is that it illustrates how much disparity still exists between many baseball insiders' subjective valuations of players and what performance analysis would indicate.

    "When taken seriously, and asked to assess the victory value of the players they acquired during the industry's annual swap meet, the answers usually sounded like the one Minnesota manager Ron Gardenhire gave when he spoke about his new speedy second baseman, Luis Castillo.

    'He's worth 15 wins, potentially,' Gardenhire said of Castillo, a .293 lifetime hitter acquired from the Florida Marlins. 'We lost 30 one-run games last year. With Luis' ability to get on base, steal bases, score runs and play defense, a guy like that can make a difference in at least half those one-run games going the other way.'"

    Of course those who are familiar with Win Shares know that 15 wins would be the equivalent of 45 Win Shares (three Win Shares per win), a total that Luis Castillo has never approached and never will. Albert Pujols led the majors in WS with 38 last season with Derrek Lee and Alex Rodriguez tied for second at 37. Last season Castillo totalled 17 WS and, as Schwarz points out, that doesn't take into account the Win Shares that any replacement level second baseman would contribute.

    It's interesting that the White Sox, a team that you wouldn't think would use performance analysis, use a more analytical approach and estimate that the Jim Thome for Aaron Rowand trade will net them 15 runs (+20 for Thome and -5 for Rowand) or 1.5 wins using the standard 10 runs per win estimate.
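
The two conversions above are simple enough to sketch in a few lines of Python. This is only an illustration: the constants are the three-Win-Shares-per-win definition and the 10-runs-per-win rule of thumb quoted in the post, and the function names are made up.

```python
WIN_SHARES_PER_WIN = 3   # Win Shares are defined as three per team win
RUNS_PER_WIN = 10        # the rule-of-thumb estimate used in the post

def wins_to_win_shares(wins):
    """Convert a claimed number of wins into Win Shares."""
    return wins * WIN_SHARES_PER_WIN

def runs_to_wins(runs):
    """Convert a net run gain into wins using the 10-runs-per-win estimate."""
    return runs / RUNS_PER_WIN

print(wins_to_win_shares(15))              # Gardenhire's claim for Castillo, in Win Shares
print(round(17 / WIN_SHARES_PER_WIN, 1))   # Castillo's 17 WS expressed in wins
print(runs_to_wins(20 - 5))                # Thome's gain minus Rowand's loss, in wins
```

Put that way, Gardenhire's 15 wins is more than twice what Castillo's 17 Win Shares translate to, and that is before subtracting what a replacement would provide.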

    Saturday, December 10, 2005

    Sinister Motives?

    I'd like to thank everyone for their feedback regarding my article on Caribbean players over at THT. I was especially interested in a note from Mike Cook who speculated that perhaps lower plate discipline is a product of coaching philosophy towards Caribbean players who are perhaps viewed as not as intelligent as non-Caribbeans. Another thought was that perhaps Caribbean players develop plate discipline over time since they're not exposed to advanced coaching as early in their careers.

    As to the latter hypothesis I took a look at Caribbeans and non-Caribbeans from ages 20 through 41 and produced the following graph.

    As you can see the difference between the two groups remains pretty steady as both groups age (the sample size gets pretty small for Caribbeans around age 39) and so I think we can discount the second hypothesis. If you're wondering why the walk rate climbs steadily almost to age 40 when generally performance declines as players reach their early thirties, keep in mind that this graph includes all players and so those players who are still playing in their mid to late thirties are "selected" via their better than average performance. In other words, the same set of players is not tracked at each age and so better players are represented on the right end of the graph.

    As for the former hypothesis I wouldn't argue against the notion that Caribbean and Latin players may be viewed as less intelligent, primarily because of the language barrier. Moises Alou said as much during the Krueger controversy. However, since the realization that plate discipline is a skill unto itself is relatively new in baseball, I would argue that since the production of the two groups is basically equivalent, coaches in general see no need to focus on plate discipline with Caribbean players.

    J.C. Bradbury of Sabernomics fame (who has several excellent articles in The Hardball Times Baseball Annual 2006-buy-yours-today) also made an interesting point when he noted that Caribbean players are less likely to be left-handed and more likely to be switch hitters. I admit that I remembered reading his post but forgot completely about it when putting together the article. Indeed, just as J.C. found, in my study 13.2% of Caribbean players were switch hitters and 14.4% left-handed hitters while for non-Caribbeans it was 7.2% and 28.8% respectively.

    Caribbean       Count    Pct         PA    Pct
    B                 131  0.132     232865  0.215
    R                 717  0.724     705547  0.651
    L                 143  0.144     146075  0.135
    Total             991           1084487

    Non-Caribbean   Count    Pct         PA    Pct
    B                 469  0.072     682139  0.116
    R                4154  0.640    3221848  0.546
    L                1868  0.288    1992907  0.338
    Total            6491           5896894

    Overall 11% of the world population (and yours truly) is left-handed and those of Hispanic lineage (which overlaps considerably with my population of Caribbean players) are less likely (9.1%) to be sinister (the Latin word for "left") than dextral (from the Latin for "right"). Of course that small difference doesn't explain the much larger difference you see in players from the Caribbean. Many people have pondered this question and it would seem that the cultural bias against lefties in the Caribbean is likely the largest contributing factor. Players forced to write right-handed are therefore more likely to become switch hitters when they start playing baseball. A lesser factor might be related to position bias where more Caribbeans gravitate to middle-infield positions which are traditionally manned by righties.

    Following this line of reasoning J.C. offered that since Caribbean hitters more often hit with the platoon advantage they might walk less and hit with a higher average as a result.

    In a different study I'm doing of platoon advantage I found that hitters with the platoon advantage do indeed walk less frequently (actually .0013 walks per plate appearance less) than do hitters when they don't have the platoon advantage. They also hit for a higher average (+.024) when they have the advantage. In that study, however, I didn't include switch hitters which is very relevant here.

    Using the table above and estimating that 30% of the plate appearances in the majors are against left-handed pitching, Caribbean hitters actually hit with the platoon advantage 50.4% of the time while non-Caribbeans do so 51.6% of the time. The reason for this is that Caribbeans include 8.4% more "pure" right-handed hitters (with pure in quotes since doubtless many of those are lefties like my Dad who were forced to be right-handed and then never returned from the Dark Side). So it would appear that platoon advantage doesn't really explain the difference.
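
Here is a rough Python sketch of that estimate, using the plate appearance counts from the handedness tables. It makes the simplifying assumption that every hitter, regardless of which side he bats from, sees left-handed pitching 30% of the time; the function name is made up for illustration.

```python
LHP_SHARE = 0.30  # share of plate appearances against left-handed pitching (the post's estimate)

# Plate appearances by batting side (B = switch, R = right, L = left), from the tables above.
pa = {
    "Caribbean":     {"B": 232865, "R": 705547,  "L": 146075},
    "Non-Caribbean": {"B": 682139, "R": 3221848, "L": 1992907},
}

def platoon_advantage_share(pa_by_side, lhp_share=LHP_SHARE):
    """Estimated fraction of plate appearances taken with the platoon advantage."""
    total = sum(pa_by_side.values())
    with_advantage = (
        pa_by_side["B"]                        # switch hitters always bat from the favorable side
        + pa_by_side["R"] * lhp_share          # righties have the advantage against lefties
        + pa_by_side["L"] * (1 - lhp_share)    # lefties have the advantage against righties
    )
    return with_advantage / total

for group, sides in pa.items():
    print(f"{group}: {platoon_advantage_share(sides):.1%}")
```

The extra switch hitters among Caribbeans are roughly cancelled out by the shortage of left-handed hitters, who bat with the advantage 70% of the time under this assumption.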

    Friday, December 09, 2005

    Take a Walk...

    My article on the Krueger/Alou controversy has been posted on THT. It takes a look at the group differences in offensive performance between Caribbean and non-Caribbean players.

    Also had a nice surprise when I found that Rob Neyer mentioned one of my articles in The Hardball Times Baseball Annual 2006 where he says:

    "In 'The Hardball Times Baseball Annual 2006', contributor Dan Fox does something that I've never seen anybody do before: He puts together the wins and losses that we would expect from the run differentials and the underlying events that generally lead to runs scored and allowed.

    I won't present the method in any great detail -- Fox's article runs for several pages, and anyway, you should buy the book (which also contains, among other things, a couple of articles by Bill James and a modest essay by your humble columnist) -- but the results are worth mentioning."

    As I mentioned in the article I can't take credit for the idea since I lifted it from Phil Birnbaum's presentation at this year's SABR conference in Toronto. Phil looked at the role of luck retrospectively and also included pitchers and hitters having outlier or "career years".

    Thursday, December 08, 2005


    I just saw that Tracy Ringolsby has been honored by the Baseball Writers Association of America with the Spink Award. I met Tracy for the first time last season while scoring for since he and his cowboy hat often occupied the seat next to mine on the front row of the press box. He's always very pleasant and you can tell the other writers and club personnel greatly respect him. This honor is well-deserved; heck, helping found Baseball America in and of itself is enough.

    I don't always agree with his takes, especially when it comes to the role and value of performance analysis, but I always appreciate his great writing.

    Here is the AP story.

    DALLAS (AP) - Tracy Ringolsby of the Rocky Mountain News, a pioneer in baseball labor coverage and also among the first writers to concentrate on scouting, won the J.G. Taylor Spink Award on Wednesday from the Baseball Writers' Association of America.

    Ringolsby will be honored for meritorious contributions to baseball writing during the induction ceremonies in July at the baseball Hall of Fame in Cooperstown, N.Y. He received 225 votes from BBWAA members to 128 for Joe Goddard of the Chicago Sun-Times and 76 for the late Vern Plagenhoef of Michigan's Booth Newspaper Group.

    Ringolsby has covered baseball for 30 years, 28 as a beat reporter and two as a national writer. He has worked for the Denver paper since 1992, covering the inaugural season of the Colorado Rockies, and previously worked for United Press International, the Press-Telegram in Long Beach, Calif., the Seattle Post-Intelligencer, The Kansas City Star and The Dallas Morning News.
    Ringolsby has worked to have scouts recognized by the Hall of Fame and was also one of the founders of Baseball America, a publication focused on player development and scouting. He covered the grievance hearing of Andy Messersmith and Dave McNally, which led to free agency in baseball.

    Wednesday, December 07, 2005

    Playing the Infield In

    I'm sure if you've watched much baseball you've heard it said that with the infield in a hitter's average goes up 100 points...or 75 points...or something like that. In any case the conventional wisdom is that batting average goes up significantly when the defense is attempting to prevent a critical run from scoring by positioning their infielders close.

    John Walsh's excellent articles on THT prompted me to take a quick look at just how true the conventional wisdom is using play by play data for 2003-2005. The problem is that PBP data does not contain indicators that say "infield was in" on this or that particular play. So we'll have to make some guesses as to when the infield is likely to be in. This too is fraught with difficulty since when you think about it, teams play their defenses in a variety of configurations from double play depth, to in at the corners and DP depth up the middle, in at the corners and half-way up the middle, to all the way in. In fact, in thinking about it I don't have a good feel for how often managers actually bring their infields in - a case of not actually observing what it is you're seeing I suppose.

    In any case I took a stab at identifying those situations where the infield was likely to be in. They were:

  • Runners on third, or second and third, less than 2 outs, with a -3 to 0 run differential in the fifth inning or later

  • Runners on at least first and third, less than 2 outs, with a -2 to 0 run differential in the 8th inning or later

    In other words I'm assuming that teams don't play the infield in when they have a lead, when they're down by a number of runs, or before the fifth inning, and that they would go for the double play before the 6th inning. In both cases I'm looking at all balls that were put in play excluding bunts.
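
The two situations listed above can be sketched as a small Python filter. This is my reading of the criteria rather than the actual query: the function and parameter names are hypothetical, and the run differential is assumed to be the fielding team's runs minus the batting team's runs.

```python
def infield_likely_in(runners, outs, inning, fielding_run_diff):
    """Guess whether the infield is in, using the two situations listed above.

    runners: set of occupied bases, e.g. {3} or {1, 3}
    fielding_run_diff: fielding team's runs minus batting team's runs
                       (an assumed sign convention)
    """
    if outs >= 2 or 3 not in runners:
        return False
    if 1 in runners:
        # At least first and third: only late and close.
        return inning >= 8 and -2 <= fielding_run_diff <= 0
    # Third only, or second and third.
    return inning >= 5 and -3 <= fielding_run_diff <= 0

print(infield_likely_in({3}, outs=1, inning=6, fielding_run_diff=-1))
print(infield_likely_in({1, 3}, outs=0, inning=6, fielding_run_diff=0))
```

Bunts would still need to be dropped from the plays this selects, as noted above.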

    So to make the comparison I looked at batted ball outcomes by trajectory in these situations and when these situations didn't apply. First, let's take a look at the non infield-in situations.
    traj      tot    pct    out      s      d      t     hr      h     sf      e
    F      113815  28.3%  73.2%   5.6%   8.1%   1.2%  11.9%  26.8%   3.0%   0.2%
    G      179978  44.7%  76.5%  21.4%   2.0%   0.1%   0.0%  23.5%   0.0%   2.5%
    L       76351  19.0%  26.5%  51.8%  17.6%   1.5%   2.5%  73.5%   0.0%   0.1%
    P       32670   8.1%  98.1%   1.5%   0.4%   0.0%   0.0%   1.9%   0.0%   0.3%
    Total  402814         67.9%  21.1%   6.5%   0.7%   3.8%  32.1%   1.0%   1.2%

    As you can see out of 400,000 batted balls 67.9% were turned into outs. The highest percentage turned into outs were popups and the lowest was line drives.

    Now let's take a look at the outcomes in infield-in situations.
    traj      tot    pct    out      s      d      t     hr      h     sf      e
    F         953  28.4%  76.4%   8.3%   6.4%   1.0%   7.9%  23.6%  60.1%   0.1%
    G        1555  46.4%  73.6%  24.1%   2.2%   0.1%   0.0%  26.4%   0.0%   3.2%
    L         551  16.4%  16.7%  60.4%  18.5%   1.6%   2.7%  83.3%   0.2%   0.0%
    P         295   8.8%  98.0%   2.0%   0.0%   0.0%   0.0%   2.0%   0.7%   0.0%
    Total    3354         67.2%  23.6%   5.9%   0.6%   2.7%  32.8%  17.2%   1.5%

    What's interesting is that the percentage of balls turned into outs is essentially the same, just .7% lower at 67.2%. Interestingly, the line drive hit rate climbs from 73.5% in other situations to 83.3% in infield-in situations. This indicates that the advantage for the hitter with the infield in lies in poking line drives through the drawn in infield. You can also see that the hit rate for ground balls goes up three percent to 26.4% as more hard hit grounders scoot between infielders. This is what you would expect.

    The most surprising aspect of this analysis is that fly balls are converted into outs over 3% more often in infield-in situations than in others. That difference can be explained by the 4% drop in homeruns that accompany the infield-in situations. As Walsh pointed out in his article what is likely going on here is that hitters knowingly sacrifice power in these situations in order to put the ball in play since simply hitting a fly ball gives them over an 80% chance of scoring the runner from third. You can also see this to a lesser degree with line drives where hitters hit fewer line drives with the infield drawn in and more ground balls. Another factor to be considered is that pitchers pitch more carefully with runners on base and so are less likely to challenge hitters with pitches, that when they miss, are driven out of the yard.

    So what about our question? Well, let's take fly balls out of the picture since they are likely governed both by hitter intent and pitcher reticence. When totalling the balls in play on the ground and line drives outcomes, hitters have a .384 chance of getting a hit in other situations and a .413 chance with the infield in - a 30 point difference. That's not as much of a bump as you might suppose.
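
The .384 and .413 figures can be reproduced from the ground ball and line drive rows of the two tables. Here is a minimal Python sketch; the hit rates are the rounded percentages from the tables, so the results are approximate.

```python
# (batted balls, hit rate) for ground balls (G) and line drives (L), from the two tables above.
other = {"G": (179978, 0.235), "L": (76351, 0.735)}
infield_in = {"G": (1555, 0.264), "L": (551, 0.833)}

def pooled_hit_rate(rows):
    """Hit rate over all listed trajectories, weighted by number of batted balls."""
    balls = sum(n for n, _ in rows.values())
    hits = sum(n * rate for n, rate in rows.values())
    return hits / balls

print(f"other situations: {pooled_hit_rate(other):.3f}")
print(f"infield in:       {pooled_hit_rate(infield_in):.3f}")
```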

    However, that assumes that all fly balls are counted against the hitter. In infield-in situations 60% of those fly balls are counted as sacrifice flies and therefore don't count against the hitter's average. This quirk of the rule book that many would like to see eliminated means that a hitter's actual batting average goes up much more significantly. A second factor is that batters do indeed strike out less frequently with the infield in, which raises their batting average as well. Walsh ran the numbers for me and found that overall the batting average is .265 and that in the infield-in situations it's .348, an 83 point difference. When sac flies are counted as outs the average is .300. So over half the 83 point increase (48 points) is due to the sac fly rule while the rest (35 points) is due to the combination of defensive alignment (more balls getting through the infield) and batters putting the ball in play more often.
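
To make the split explicit, here is the same arithmetic as a short Python sketch using the three averages Walsh supplied. The variable names are mine.

```python
overall_avg = 0.265         # batting average in all situations
infield_in_avg = 0.348      # infield-in situations, sac flies not charged as at-bats
infield_in_sf_outs = 0.300  # infield-in situations, sac flies counted as outs

total_gain = infield_in_avg - overall_avg
sf_rule_gain = infield_in_avg - infield_in_sf_outs   # portion created by the sac fly rule
other_gain = infield_in_sf_outs - overall_avg        # alignment plus more balls in play

print(f"total gain:     {total_gain * 1000:.0f} points")
print(f"sac fly rule:   {sf_rule_gain * 1000:.0f} points ({sf_rule_gain / total_gain:.0%})")
print(f"everything else: {other_gain * 1000:.0f} points ({other_gain / total_gain:.0%})")
```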

    And that's about what you would expect. Teams play the infield in not because it affords a better chance of getting the batter out, but because it raises the probability that the runner on third will have to remain on third or get thrown out at the plate.

    So the conventional wisdom is superficially correct in that batting average does go up with the infield in. But when you dig a little deeper you find that hitters don't stand a significantly better overall chance of getting a hit with the infield in.

    Executive Database

    Another little tidbit. Baseball America has now published a very cool Executive Database. It contains the members of the Baseball Operations departments of teams since 1960 and general managers going back to 1950.

    Now if they would only give us an XML or csv feed so we could run some interesting GM comparisons...

    Modifying Their Approach

    John Walsh has published a couple of nice articles on THT. In the first he looks at whether batters can and do change their approach with a runner on third and less than two outs in order to score the runner on a sacrifice fly. He concludes that:

    "As a group batters do seem to be able to change their approach at the plate to increase the probability of getting a fly ball to score a run in a sacrifice fly situation. However, the increase in fly balls comes simply from putting more balls in play (by striking out and walking less often) and not by batters putting more of their batted balls into the air."

    So batters don't really hit more fly balls but they do put the ball in play more often. He also notes that 60% of fly balls in these situations will result in sacrifice flies while less than 1% of line drives do. The major reason for this, as Walsh notes, is that three quarters of all line drives go for hits. A second reason is that a decent percentage of line drives will be caught in the infield. But the biggest reason I think is that, based on my own experience, scorers are biased towards recording balls hit to the outfield as fly balls. It's difficult when sitting in the press box to make accurate determinations in regards to trajectory on many hard hit balls to the outfield and the default position is to record it as a fly ball. Or maybe it's just me.

    In the second article Walsh looked at whether it's a good idea for hitters to change their approach in sacrifice fly situations. Although he finds that at first glance hitters are indeed more productive in SF situations using RC, they are not (and this is my favorite part and a great insight) when the defensive context is taken into account. To consider the context, Walsh showed that batted balls turn into hits more frequently in SF situations, a result he credits to the "defensive alignment employed by teams with a runner on third and fewer than two outs." I assume that the vast majority of that alignment relates to playing the infield in. The difference Walsh finds is on the order of 1.5% for ground balls and .93% for line drives. Overall batted balls are turned into hits 29.9% of the time in SF situations and 28.3% in non-SF situations. This difference is not as large as I would have thought. Of course, this study doesn't attempt to look only at situations where the infield is in.

    When he then corrects for this, hitters in SF situations create .3 fewer runs per 27 outs than in non-SF situations. His conclusion is as follows:

    "What I've tried to do here is answer the question 'Is the 'contact-oriented' approach generally more productive than the standard approach?' The answer appears to be 'no,' as can be seen after translating the aggregate performance in sac fly situations into a defense-independent context."

    Thursday, December 01, 2005

    Two Unrelated Things

    I'll be offline for much of the weekend and so wanted to leave you with these two tidbits that have no relationship to one another in any way that I can find.

    First, David Pinto reminded me that the Baseball Hall of Fame has this very cool site where you can view baseball uniforms throughout the ages. Very interesting to take a look back at the bad old days of the late 70s.

    Secondly, one of the things we have always enjoyed about Colorado Springs, and more so since we moved here, is the Flying W Wranglers, a western/gospel band that has a place out here where you can go for dinner and a concert. Well, the group is splitting up as three of the members want to pursue a more openly evangelical approach. The good news is that the Wranglers will go on and will remain a family-friendly place to go.

    Tuesday, November 29, 2005

    Hardball Times Baseball Annual 2006

    The Hardball Times Baseball Annual 2006 has shipped and I received my copies just yesterday. Dave Studeman, Aaron Gleeman, and the folks at ACTA did an outstanding job of putting the book together which contains all original content (no web article reprints) including articles by Bill James, Rob Neyer, John Dewan, and J.C. Bradbury among others and of course a number of articles by the regular THT crew.

    I think you'll really like the additional information on batted ball outcomes and the articles by Studeman, Bradbury and THT's David Gassko that analyze it. You can read a review of the book here by Stick and Ball Guy.

    Yours truly has two articles in the book, one on lucky and unlucky teams for 2005 and the results of my baserunning analysis. The stats section includes the baserunning results for every player in the majors last season. Enjoy.

    Sunday, November 27, 2005

    Sabermetrics Course

    Thought some readers might be interested in this article, "Numbers crunch: Tufts course really packs 'em in" that discusses the class offered at Tufts titled EX-013, The Analysis of Baseball: Statistics and Sabermetrics and taught by SABR members Andy Andres and David Tybor, and Morgan Melchiorre.

    Very cool indeed. Why couldn't I take that kind of class in college? I'm also proud to report that this site is listed on their links section.

    History Repeating Itself

    SABR's Paul Wendt posted a pdf of a fascinating article by F.C. Lane titled "Has the 'lively' Ball Revolutionized the Game?" The article appeared in the September 1921 issue of Baseball Magazine.

    I've written about F.C. Lane before, a man who was in many ways ahead of his time (particularly in his understanding of performance analysis), and you can read about his fascinating book Batting here.

    As was the case with many in and around the game during this period, Lane grew up when pitching dominated and so it's not surprising that he viewed the exploits of Christy Mathewson, Smoky Joe Wood, and Walter Johnson as normative and was alarmed at the era that Ruth was then ushering in. In this article Lane writes about what he considers the "foremost problem in baseball today."

    He puts it this way.

    "And since we all know that pitching is the bed rock of baseball, when we disturb the foundation of the game, we shake the superstructure."

    Because of his knowledge and reverence for statistics he also rightly compares the inflation of batting statistics to the devaluation of currency during the war - a problem our own generation is now coping with in the "Lively Player Era". The article then explores the various reasons given for the devaluation.

    Lane begins by reporting on his investigation of the ball itself in order to quell the rumors that it is being manufactured differently and that the owners are responsible - a charge by the way that is recounted by Leonard Koppett in his book The Thinking Fan's Guide to Baseball in chapter 28 where he says that in "1920 the ball was made livelier again...That bit of history is well known." Apparently Lane would have disagreed.

    In any case Lane toured both the Reach (the ball used in the AL) and Spalding (that used in the NL) factories and concluded that the balls themselves were manufactured in exactly the same way, albeit with better materials and particularly better quality and more elastic yarn since the end of the war. He also specifically dismissed those who claimed that the balls were being made livelier on purpose, reporting on his interviews with the league presidents. In the end his conclusion on the manufacture of the balls was that:

    "The ball in use in both major leagues is actually somewhat livelier than it was during the war period due to better materials and possibly better workmanship. But there is no evidence of any great change in the ball itself from year to year."

    He then went on to discuss the four other factors he saw as also contributing and that in sum were more important than differences in the ball itself.

  • Inferior quality of pitching. He seemed to view this as a random fluctuation effect and quotes Ty Cobb as saying that the pitchers were just having a down year in 1921. In retrospect, the offensive surge that began in 1920 and continued largely until WW II, renders this explanation obsolete. The increase in offense was not random as it might have been in 1987 (probably attributable to weather), but rather was a systemic change in the game. This can be seen graphically here where runs per game jumps in 1920 and doesn't again reach deadball levels until 1968.

  • A general "handicap of pitching" by the new rules. This included the abolition of the spit ball and all "freak deliveries", by which Lane meant scuffed, emery, and "shine" balls. These were all banned in the wake of the death of Ray Chapman at the hands of a Carl Mays fastball in August of 1920. Of course today we would group all of these in the category of doctored balls but it shows how spitball pitchers were viewed more as craftsmen and a legitimate part of the game. As Koppett also notes, Lane mentions that damaged balls had begun being thrown out by the umpires resulting in harder and whiter balls being put in play, a practice that has reached an almost absurd level in the last decade.

  • Changes in managerial methods. Here Lane discusses how managers are adapting to the higher offensive levels by not calling for the sacrifice and stolen base and instead allowing hitters to hit away. This, Lane notes as many sabermetricians have in the last quarter century, leads to increased offensive output and more runs being scored.

  • More "sheer slugging at the ball in an effort to bang out homeruns". Finally, Lane attributes much of the difference to players now trying to hit homeruns or "slug" in the parlance of the day in order to emulate Ruth. Lane also discusses this trend in his book Batting and quotes Ty Cobb as saying:

    "Ruth is more than a slugger, he is a homerun hitter. Fortunately for him, he began as a pitcher. A pitcher is not expected to hit. Therefore, he can follow his own system without managerial interference. Ruth made the most of this opportunity...I have tried to make myself a batter, which is something quite different. A batter is a man who can bunt, place his hits, beat out infield drives, and slug when the occasion demands it, but he doesn't slug all the time."

    To me, and contrary to Koppett who views changes in the ball as most important, this factor is the first among equals. In Batting Lane also attributes Ruth's ability to hit homeruns and others' ability to follow his example to adopting a particular style or "speciality" of hitting rather than the influence of different baseballs. The style that Ruth popularized, along with the reluctance of owners to make rules that handicapped Ruth in light of the Black Sox scandal, served to usher in the new slugger's era.

  • In conclusion Lane then discussed what he saw as minor contributing factors. He mentioned that it was becoming more common for fans to keep foul balls where once they were forced to return them, which had the consequence of introducing many new balls into play. He also noted that the transition to the new pitching rules had caused pitchers to temporarily fall behind in the arms race and that soon they would develop new strategies to cope. The later introduction of the slider and the resurgence of the knuckleball are two examples of how pitchers did finally adapt. Finally, he also mentions that hitters had grown in confidence from the abolition of the freak deliveries, which in the past had them more fearful of injuries.

    In the end Lane, I think, correctly attributes the rise in offense to what was for him an unhappy confluence of factors and seems to look forward to the day when pitching would regain prominence. There is a parallel here to our own day, where a confluence of factors including weight training, steroids, perhaps lower seams on the ball, the effect of aluminum bats on the training of pitchers, and the absence of intimidation, among others, have all contributed to higher offensive output. And as in those days there are those like Koppett who continue to believe that the ball was made livelier starting in 1993 despite tests that show the coefficient of restitution hasn't changed.

    Wednesday, November 23, 2005

    Dinner with Rocky Mountain SABR

    Last Saturday night my daughter and I drove into Denver for the annual Rocky Mountain SABR banquet (did I ever mention you should join SABR?). The dinner was held at the Denver Athletic Club and attended by around 25 members and their spouses and a few children. The event is sponsored by the Rockies and included silent and live auctions as well as four speakers with the keynote delivered by Jim Burris, a baseball institution in the Denver area.

    When I was a member of the Monarchs SABR chapter in Kansas City I was impressed with the level of speakers they were able to get including Allard Baird, Brian McRae, and John Wathan at their annual mid-winter meeting but having a dinner with the same quality was even better.

    The program was led off by Ed Henderson who is the local baseball guru on ESPN radio 560 out of Denver and who was an area scout for the Marlins in the 90s and now with the Pirates. I believe he signed Roy Halladay, Shawn Chacon, and Brad Lidge among others. Henderson spoke eloquently about our shared passion and reminisced about his first trip to Fenway Park and his chance to see Ted Williams - alas a chance he missed as the game was rained out. He also spoke about his trip to the 1999 All Star game and how great it was to see Williams on the field.

    Next up was Rockies radio broadcaster Jack Corrigan. Corrigan spoke on a range of subjects including Irish influence in the game from John McGraw, Connie Mack, and Joe McCarthy to Ed Delahanty and King Kelly. He also reminisced about his chance to see Teddy Ballgame at the end of his career when his dad took him to old Municipal Stadium in Cleveland. After a Williams at bat his dad turned to him and said, "now there's a hitter", and nothing else. Corrigan was also pretty optimistic about the Rockies' chances in the next few years and drew a parallel with the Indians of the early 1990s. He thinks the Rockies are doing the right thing in focusing on a youth movement and noted that three of the top nine vote getters in the Rookie of the Year award were Rockies (Garrett Atkins, Clint Barmes, and Jeff Francis). Corrigan was most upbeat about the pitching and noted that in the next few years we'll all be much more familiar with the Latin surnames like Carvajal.

    He also said, however, that the Rockies' chances begin and end with Todd Helton. I disagree. Helton's monster contract at $12.6M in 2005 and rising is an albatross for a team that will likely have a payroll less than $60M in 2006. His translated batting line for 2005 of .303/.431/.518 is very good but not worth more than 20% of the payroll, especially at first base where an adequate replacement in Ryan Shealy awaits. Helton has been a wonderful player but the diminishing returns (he'll turn 32 next season) on the contract that pays him an average of $15.7M per year through 2011 indicate that the Rockies would be better off trying to trade him if any takers could be found. The problem with that strategy of course is that he is their only marketable player at the moment until Matt Holliday, Brad Hawpe or someone else becomes a star (don't count on it being Clint Barmes).
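    The payroll arithmetic behind that "more than 20%" claim can be checked with a minimal sketch. It uses only the dollar figures quoted above; the function name is made up for illustration, and the $60M figure is treated as a ceiling, so the shares are lower bounds.

```python
def payroll_share(salary_m, payroll_m):
    """Fraction of a team payroll taken up by one salary (both in $M)."""
    return salary_m / payroll_m

# Figures quoted in the post; $60M is an upper bound on the 2006 payroll,
# so each share printed here is a floor, not an exact value.
PAYROLL_CEILING_M = 60.0
SALARIES_M = {
    "2005 salary": 12.6,
    "average per year through 2011": 15.7,
}

for label, salary in SALARIES_M.items():
    share = payroll_share(salary, PAYROLL_CEILING_M)
    print(f"{label}: at least {share:.0%} of payroll")
```

    Even the 2005 salary clears the 20% mark against a $60M payroll, and the contract's average annual value pushes the share higher still.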

    Finally, he mentioned the rumor that the Rockies and Marlins are talking about the possibility of bringing Paul Lo Duca to Denver. While at first blush that sounds interesting since the Rockies are in need of catching and the Monforts (the owners) have talked about finding a veteran catcher, there would probably be a tendency to overpay for Lo Duca because of his name recognition and it would be foolish to give up some of the young pitching talent for him. In 2006 he'll be five years removed from his outlier year of 2001 where he hit .320/.374/.543. Since then he's been more like .280/.335/.400 which isn't bad for a catcher but nothing to give up the farm for. He'll also be 34. Not a good age for a catcher.

    Next at the podium was the national president of SABR John Zajc. He spoke briefly about the accomplishments of SABR over the past year and how the organization now stands at 6,972 members and is poised to top 7,000 for the second time in its history. He also mentioned that in addition to several retired players who are members there is one active pitcher who is a member whose name I didn't catch.

    At this point member Paul Parker was given the annual award for his contributions to the chapter. Paul is employed by the Rockies and is the Manager of the Community Fields Program and also the Club Historian. He's been instrumental in getting the Rockies' support for the chapter which includes assisting with a project to place plaques at the various sites where professional baseball was played in the Denver area. The first plaque will go up at the site of old Mile High Stadium where the Denver Bears played for so many years.

    Before Jim Burris took the stage Paul wrapped up the silent and live auctions by auctioning off four signed baseball bats. A signed Cal Ripken bat went for $170, a Mike Schmidt for $160, and a Willie Mays for $700. The final bat was one signed by 30 members of the 1969 Mets which started at $525 but didn't get any takers. All the proceeds from the auction went to The Colorado Rockies Museum and Learning Center. The center will feature exhibits on baseball in the Rockies as well as traveling exhibits from the Baseball Hall of Fame. A learning center is planned as well, so that baseball fans of all ages may learn more about their favorite sport. My daughter bid a few dollars on a set of 2004 Sky Sox cards which she won in the silent auction.

    Burris enjoyed a long career in baseball and was GM of both the Denver Bears and Denver Broncos, assistant to Ford Frick, President of the American Association and Texas League, as well as being a journalist. His talk consisted of a series of anecdotes about individuals he's known including Dizzy Dean, Pee Wee Reese (whom he considered his best friend in the game), Paul Richards (knew more about baseball than anyone else), Carl Hubbell (whom he visited in a home in Mesa, Arizona in the 1980s and was able to get some help for), Billy Martin (who managed for him in Denver), and Rogers Hornsby (a guy that nobody liked but that he got along with).

    While some of the anecdotes were well known others were more personal. For example, he told of how Martin called him in 1968 and asked if he could throw a party for his players at the close of the season. Burris said that would be fine and he could spend $500. Martin, however, instead of throwing the party after the final game in Indianapolis threw it the night before. Burris was surprised to get a call from the Indianapolis GM who couldn't believe that even with the party and possibly hungover players, the Bears still beat his ballclub 11-2.

    He also told of how he once asked Paul Richards which pitcher he'd like on the mound if he had to win one game. Richards' answer was Carl Hubbell.

    The night ended too soon but it was great to be among other folks who so enjoy and care about the game.

    Friday, November 18, 2005

    Apply Now

    Earlier this year I wrote an article for THT on "A Day in the Life of a Stringer."

    There I discussed what it was like to work for inputting data for their Gameday system. In the offseason usually lists jobs for stringers and writers on and recently published a nice article to promote it and describe some of what it's like to work in that environment.

    I didn't have any foul ball experiences quite as interesting as Greg's (the closest was about 10 feet from me) but can vouch for the coolness of the job.

    Wednesday, November 16, 2005

    Better Late Than...

    As you probably already know Major League Baseball and the players' association agreed on a tougher drug policy today. In short the policy goes like this (from the SI story).

    Steroid Penalties

    • First positive test -- 50-game suspension, up from 10 days.
    • Second positive test -- 100-game suspension, up from 30 days.
    • Third positive test -- Lifetime ban, with player having right to apply for reinstatement after two years and an arbitrator being able to review reinstatement decision. Under the previous agreement, the earliest a player could be suspended for life was for a fifth positive test.

    Amphetamine Penalties (There was no testing for amphetamines in previous agreement)

    • First positive test -- Mandatory additional testing.
    • Second positive -- 25-game suspension.
    • Third positive -- 80-game suspension.
    • Fourth positive -- Commissioner's discretion, with an arbitrator being able to review.

    What's most interesting about this in my opinion isn't the tougher steroid rules. Those were pretty much a foregone conclusion given that Selig had proposed them in the spring and that congressional action would have resulted had the union not acquiesced.

    As an aside Senator Jim Bunning apparently would still like records to be stricken for those who are caught. Although baseball's statistics are more discrete than those in other team sports, the notion that you could somehow make sense of a statistical record where some players' records are excised is nonsensical. Bunning has served his purpose in this debacle and it's time for him to sit down and be satisfied that the pressure he tried to apply worked. After all, there is no way the union would have agreed to this or any other policy had the hearing in March not shown how poorly baseball has dealt with the problem.

    And speaking of baseball's dismal record, that brings me to the amphetamines penalties. If you think baseball has been tardy on steroids, drugs that have plagued the game for a dozen years or so, widespread amphetamine use in the majors goes back at least 40 years and there has never been any testing or penalties. In fact, Selig himself said in a news conference earlier this year "that he first heard about amphetamines when he walked into the Milwaukee Braves' clubhouse in 1958." Talk about your denial. They were banned in Olympic competition over 35 years ago.

    It should be noted that baseball banned the stimulant ephedra after an Orioles pitcher died in 2003 but didn't take the opportunity to add greenies to the list.

    One of the ways I've been mourning the end of the baseball season has been to read. And what I picked up to read is Jim Bouton's Ball Four. I bought a paperback copy some years ago at a book sale and always told myself I'd get around to reading it. As most fans know it was Bouton's book that first openly discussed the use of "greenies" by players. Just tonight I ran across this passage:

    "At dinner Don Mincher, Marty Pattin, and I discussed greenies. They came up because [John] O'Donoghue had just received a season supply of 500. 'They ought to last about a month', I said.

    Mincher was a football player in high school and he said, 'If I had greenies in those days, I'd have been something else.'

    'Minch, how many major league ballplayers do you think take greenies?' I asked. 'Half? More?'

    'Hell, a lot more than half', he said. 'Just about the whole Baltimore team takes them. Most of the Tigers. Most of the guys on this club. And that's just what I know for sure.'"

    Apparently Bunning was asked about the use of greenies in his day (1955-1971) on ESPN radio this morning and he said that he had never seen them in the clubhouse during his playing days. Right.

    I find that hard to believe since Bouton, Bill Lee, Dwight Gooden, Tug McGraw, and David Wells have written about their use while Dale Berra and Dave Parker testified that they received amphetamines from Willie Stargell and Bill Madlock. John Milner even testified he got a stimulant from Willie Mays himself.

    Long overdue is all I can say.

    Tuesday, November 15, 2005

    More Errors

    Since I posted some data on reaching on errors the other day I thought I'd share the 2003-2005 leaders with 10 or more.

    2005                  2004                  2003
    Jason Kendall    15   Miguel Tejada    16   Ty Wigginton     15
    Freddy Sanchez   13   Ichiro Suzuki    15   Aaron Boone      14
    Jose Reyes       12   Albert Pujols    14   Craig Biggio     13
    Grady Sizemore   11   Derek Jeter      13   Cristian Guzman  12
    Jack Wilson      11   Juan Pierre      13   Miguel Tejada    12
    Carlos Beltran   11   Alex Rodriguez   12   Marquis Grissom  11
    Derek Jeter      11   Brian Roberts    12   Dave Roberts     11
    Chone Figgins    10   Luis Castillo    12   Joe Randa        11
    Craig Biggio     10   Mark Loretta     11   Ken Harvey       11
    Craig Counsell   10   Carl Crawford    11   Kenny Lofton     10
    Adrian Beltre    10   Angel Berroa     11   Jose Vidro       10
    Alfonso Soriano  10   Chipper Jones    11   Juan Pierre      10
    Gary Sheffield   10   Keith Ginter     10   Ichiro Suzuki    10
    Garrett Atkins   10   Jeff Kent        10   Casey Blake      10
    Jose Guillen     10   Edgar Renteria   10   Vinny Castilla   10
    Johnny Damon     10   Ron Belliard     10   Scott Podsednik  10
    Rafael Furcal    10
    Willy Taveras    10

    NL MVP Shenanigans

    I've held off for a long time in discussing the NL MVP race but now that the voting is complete I can go ahead and share my thoughts.

    For those who haven't seen it Albert Pujols won with the following vote totals:

    Player   1st  2nd  3rd  Total
    Pujols    18   14    0    378
    Jones     13   17    2    351
    Lee        1    1   30    263
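    For readers wondering how the totals follow from the vote counts, here is a small sketch. It assumes the usual BBWAA ballot weights of 14 points for a first place vote, 9 for second and 8 for third; those weights are not stated in the post, and the function name is just for illustration. All three players drew their votes entirely from the top three slots, so no other weights are needed here.

```python
# Assumed ballot weights for the top three slots (not given in the post).
WEIGHTS = (14, 9, 8)

# (1st, 2nd, 3rd) vote counts and published point totals from the table.
BALLOTS = {
    "Pujols": ((18, 14, 0), 378),
    "Jones": ((13, 17, 2), 351),
    "Lee": ((1, 1, 30), 263),
}

def ballot_points(votes, weights=WEIGHTS):
    """Weighted sum of vote counts across ballot positions."""
    return sum(v * w for v, w in zip(votes, weights))

for player, (votes, published) in BALLOTS.items():
    computed = ballot_points(votes)
    print(f"{player:<7} computed={computed:3d} published={published:3d}")
```

    Under those weights the computed totals agree with the published ones, which also shows how little separates second and third place votes: Lee's 30 third place votes leave him well behind Jones's 17 seconds and 13 firsts.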

    Overall I don't have any problem with Pujols winning the award over Derrek Lee. After all, Pujols created 142 runs and totaled 38 win shares while Lee created 144 runs and was credited with 37 win shares. They both play first base, where Pujols was 3 fielding runs above average while Lee was 14 over. In Wins Above Replacement Player (WARP) Pujols was at 10.7 while Lee was at 12.3. I also have Lee a couple runs better in baserunning than Pujols.

    All told Lee was the better player in terms of creating and preventing runs but Pujols did play on a winning team and that certainly should count for something. Basically, in my view they were close enough to allow the nod to go to Pujols (and because I don't want to upset my friend Jon who is a Cardinals fan).

    What everyone is commenting on of course is that Andruw Jones came in second. I'm not even that concerned that he did, but the fact that he received 17 second place votes while Lee received just one is a travesty. There simply isn't a rational justification for that result.

  • Jones created just 90 runs and had 23 win shares.

  • Jones played a more important defensive position but relatively speaking contributed less there as he was +2 in fielding runs above average.

  • Overall his WARP was 7.9, 36% less than Lee's.

  • Although I'm not a big believer in the reality of clutch hitting, some voters are and Jones hit poorly in the clutch (.207 with runners in scoring position).

  • In many rankings he barely breaks the top 20 in the National League. In fact you could make the argument that Jeff Francoeur was more valuable to the Braves in getting them to the playoffs and certainly that Rafael Furcal was (he had a WARP of 8.2 and 27 win shares) or even perhaps John Smoltz (18 win shares).

  • Jones came in second for two reasons - 51 homeruns and potential realized. As for the first, in a day and age where there is so much information available for voters it seems strange that as a group they would be caught like a deer in the headlights staring at one number. It's even more inexplicable since that number, in an age of high offensive totals, is relatively low.

    More importantly, however, I think many voters and fans generally have had very high expectations of Jones since he made a splash in the 1996 post season. They've expected him to win multiple homerun titles and MVPs and have been at a bit of a loss to explain why he hasn't. So it was with a great sigh of relief that they saw him make a run this season, and they probably viewed him in a more positive light than his actual contribution would dictate.

    Be that as it may, any way you slice it he simply wasn't the second most valuable player in the league.
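    The size of the gap between Jones and Lee is easy to tabulate from the numbers quoted in this post. The sketch below only restates that arithmetic; the helper name is hypothetical and the inputs are the WARP, win share, and runs created figures given above.

```python
def pct_less(value, baseline):
    """Percentage by which value falls short of baseline."""
    return (baseline - value) / baseline * 100

# Figures as quoted in the post for 2005.
WARP = {"Pujols": 10.7, "Lee": 12.3, "Jones": 7.9}
WIN_SHARES = {"Pujols": 38, "Lee": 37, "Jones": 23}
RUNS_CREATED = {"Pujols": 142, "Lee": 144, "Jones": 90}

for name, table in (("WARP", WARP),
                    ("Win shares", WIN_SHARES),
                    ("Runs created", RUNS_CREATED)):
    gap = pct_less(table["Jones"], table["Lee"])
    print(f"{name}: Jones trails Lee by {gap:.0f}%")
```

    The WARP line reproduces the 36% figure in the list above, and the other two measures tell much the same story.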

    Fishing Expeditions

    I'd like to thank everyone who gave me feedback on my article on matchups at THT and thought I'd take this opportunity to answer the two most frequently asked questions.

    First, several readers pointed out that since I tested over 30,000 outcomes I would expect a certain percentage of those to be in the statistically significant range. In other words, since I expected some low probability outcomes, how can I assign any significance to them in terms of the model not holding, for example by concluding that Brian Anderson has some ability to get Garret Anderson out that the model doesn't capture?

    I admit that I didn't catch the issue behind the question immediately but the questioners make an excellent point. Because I was on a "fishing expedition" as the statisticians say I was likely to find some results that were improbable. As a result we can't conclude that the low p-value matchups are necessarily evidence of some ability of a hitter to mash a particular pitcher or a pitcher to flummox a particular hitter. However, I would say that these low p-value matchups are more likely to be those where the model doesn't hold and so a judicious manager could rightly use that data to make pinch hitting decisions.

    When statisticians go on a fishing expedition like this they often use a stricter standard of proof than the typical p-value threshold of .05. One technique for tightening the standard is to apply the Bonferroni correction. This simple technique says that if we are testing n outcomes instead of a single outcome, we divide our alpha level by n. So instead of looking at .05 we would look at .05/30,481, or roughly .0000016. None of the 30,481 matchups had a p-value that small. In other words, none would be significant under the most conservative correction. In the battle of the Andersons Garret would have had to have gone 0-33 against Brian in order to reach this level.
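    As a rough illustration of the correction, here is a sketch that computes the Bonferroni threshold and the hitless streak needed to get under it. It assumes the p-values are one-sided binomial tail probabilities taken at the matchup's expected average (the .335 used here is the ExAvg listed for the Andersons in the table below); that is my reading of the numbers rather than something spelled out above, and the function names are made up for the example.

```python
from math import comb

N_MATCHUPS = 30_481
ALPHA = 0.05
BONFERRONI = ALPHA / N_MATCHUPS  # corrected per-matchup threshold

def upper_tail(ab, hits, p):
    """P(X >= hits) for X ~ Binomial(ab, p): an unusually hot matchup."""
    return sum(comb(ab, k) * p**k * (1 - p)**(ab - k)
               for k in range(hits, ab + 1))

def hitless_prob(ab, p):
    """P(X = 0) for X ~ Binomial(ab, p): going 0-for-ab."""
    return (1 - p) ** ab

def min_hitless_ab(p, threshold):
    """Smallest 0-for-ab streak whose probability is below threshold."""
    ab = 1
    while hitless_prob(ab, p) >= threshold:
        ab += 1
    return ab

EX_AVG = 0.335  # expected average, Garret Anderson vs. Brian Anderson

print(f"Bonferroni threshold: {BONFERRONI:.7f}")
print(f"0-for-22 probability: {hitless_prob(22, EX_AVG):.6f}")
print(f"Hitless at bats needed: {min_hitless_ab(EX_AVG, BONFERRONI)}")
print(f"11-for-14 at .256: {upper_tail(14, 11, 0.256):.6f}")
```

    Under that assumption the 0-for-22 line comes out close to the p-value listed for the Andersons, and the streak needed to clear the corrected threshold is the 0-33 mentioned above.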

    A more liberal application of the Bonferroni correction lowers the threshold to .01, and when that is done there are 133 matchups that fall under this level. Here they are sorted by p-value.

    Hitter             Pitcher            AB   H     Avg  HitAvg PitchAvg  ExAvg    p-value
    Larry Bigbie Andy Pettitte 14 11 0.786 0.276 0.247 0.256 0.000051
    Garret Anderson Brian Anderson 22 0 0.000 0.300 0.299 0.335 0.000127
    Michael Young Brandon Backe 10 9 0.900 0.317 0.267 0.318 0.000235
    Bill Mueller Mike Mussina 23 0 0.000 0.303 0.264 0.301 0.000269
    Marcus Giles Jason Schmidt 14 10 0.714 0.305 0.214 0.248 0.000318
    Preston Wilson Jae Seo 6 6 1.000 0.268 0.270 0.271 0.000395
    Preston Wilson Byung-Hyun Kim 10 8 0.800 0.268 0.253 0.254 0.000465
    Enrique Wilson Pedro Martinez 13 8 0.615 0.214 0.219 0.174 0.000465
    Jose Reyes Jon Lieber 13 10 0.769 0.277 0.280 0.292 0.000505
    Mark Grudzielanek Tim Hudson 6 6 1.000 0.304 0.250 0.286 0.000547
    Derrek Lee Mark Mulder 15 11 0.733 0.295 0.266 0.294 0.000558
    Matt Holliday Woody Williams 6 6 1.000 0.299 0.263 0.296 0.000671
    Aubrey Huff Jon Lieber 13 10 0.769 0.290 0.280 0.305 0.000756
    Todd Helton Damian Moss 7 7 1.000 0.343 0.288 0.367 0.000897
    Clint Barmes Odalis Perez 12 9 0.750 0.289 0.259 0.282 0.001022
    Reggie Sanders David Weathers 5 5 1.000 0.272 0.260 0.266 0.001327
    David Bell Gary Majewski 7 6 0.857 0.253 0.264 0.251 0.001365
    Charles Johnson Jake Peavy 6 5 0.833 0.230 0.230 0.197 0.001499
    Rondell White Jake Westbrook 19 0 0.000 0.289 0.265 0.288 0.001588
    David Dellucci Kevin Brown 14 9 0.643 0.242 0.265 0.241 0.001607
    Mark Kotsay Jamie Moyer 21 13 0.619 0.288 0.267 0.288 0.001621
    Alfonso Soriano John Lackey 26 1 0.038 0.280 0.271 0.285 0.001876
    Adrian Beltre Dontrelle Willi 9 7 0.778 0.277 0.254 0.264 0.001923
    Brad Wilkerson Mike Matthews 7 6 0.857 0.257 0.278 0.268 0.001981
    Matt LeCroy Nate Robertson 15 10 0.667 0.273 0.274 0.280 0.002075
    Mark Sweeney Adam Eaton 11 8 0.727 0.277 0.261 0.271 0.002131
    Jermaine Dye Jarrod Washburn 24 13 0.542 0.253 0.266 0.252 0.002278
    Aaron Rowand Tim Wakefield 9 7 0.778 0.288 0.251 0.272 0.002322
    David Ortiz Bartolo Colon 18 0 0.000 0.297 0.255 0.285 0.002385
    Jeff Cirillo Javier Vazquez 6 5 0.833 0.234 0.250 0.218 0.002427
    Kevin Millar Jorge Sosa 9 7 0.778 0.282 0.259 0.274 0.002432
    Rocco Baldelli Jake Westbrook 13 9 0.692 0.285 0.265 0.283 0.002604
    Carlos Lee Jeff Suppan 17 0 0.000 0.287 0.271 0.291 0.002875
    Frank Catalanotto Mike Timlin 7 6 0.857 0.298 0.257 0.288 0.003029
    Frank Catalanotto Dan Wright 5 5 1.000 0.298 0.285 0.318 0.003231
    Hideki Matsui Aaron Sele 14 0 0.000 0.297 0.303 0.336 0.003269
    Mike Lowell Rheal Cormier 6 5 0.833 0.270 0.230 0.234 0.003355
    Omar Vizquel Javier Vazquez 12 8 0.667 0.274 0.250 0.257 0.003376
    Bobby Abreu Mike Hampton 28 2 0.071 0.296 0.273 0.303 0.003453
    Ivan Rodriguez Jon Garland 16 0 0.000 0.303 0.262 0.298 0.003507
    Chone Figgins Esteban Loaiza 11 8 0.727 0.293 0.265 0.292 0.003530
    Shane Halter Eddie Guardado 5 4 0.800 0.213 0.215 0.169 0.003561
    Carl Crawford Mark Buehrle 7 6 0.857 0.293 0.271 0.297 0.003578
    Miguel Cabrera Steve Trachsel 13 9 0.692 0.300 0.263 0.296 0.003686
    Orlando Cabrera Jae-Weong Seo 16 10 0.625 0.274 0.270 0.277 0.003758
    Tony Graffanino Brian Anderson 21 1 0.048 0.281 0.299 0.315 0.003822
    Aubrey Huff Bronson Arroyo 16 10 0.625 0.290 0.254 0.277 0.003826
    Frank Catalanotto Ryan Franklin 7 6 0.857 0.298 0.272 0.304 0.004070
    Edgar Renteria Sean Burnett 5 5 1.000 0.297 0.301 0.334 0.004139
    Brian Buchanan Jason Schmidt 7 5 0.714 0.244 0.214 0.195 0.004176
    Edgar Renteria Gustavo Chacin 9 7 0.778 0.297 0.268 0.299 0.004218
    Hideki Matsui John Parrish 12 8 0.667 0.297 0.238 0.266 0.004280
    Kevin Mench Ryan Franklin 16 10 0.625 0.276 0.272 0.281 0.004284
    D'Angelo Jimenez Brian Anderson 9 7 0.778 0.268 0.299 0.300 0.004306
    Jason Bay Jeff Francis 5 5 1.000 0.295 0.307 0.338 0.004415
    Rafael Palmeiro Orlando Hernand 8 6 0.750 0.261 0.258 0.252 0.004427
    Khalil Greene Dustin Hermanso 6 5 0.833 0.259 0.256 0.249 0.004512
    Lew Ford Rafael Betancou 8 6 0.750 0.285 0.236 0.253 0.004515
    Edgar Renteria Brian Lawrence 9 7 0.778 0.297 0.272 0.304 0.004637
    Jack Wilson Kazuhisa Ishii 8 6 0.750 0.275 0.246 0.254 0.004648
    Craig Monroe Brad Radke 21 12 0.571 0.271 0.276 0.280 0.004808
    Eric Chavez Roy Halladay 8 6 0.750 0.275 0.248 0.256 0.004840
    Michael Tucker Derek Lowe 10 7 0.700 0.254 0.276 0.264 0.004894
    Jose Hernandez Randy Johnson 12 7 0.583 0.241 0.232 0.208 0.004994
    Adrian Beltre Ismael Valdez 7 6 0.857 0.277 0.305 0.317 0.005174
    Jonny Gomes Bartolo Colon 6 5 0.833 0.268 0.255 0.257 0.005274
    Mark Loretta Kirk Rueter 27 3 0.111 0.314 0.298 0.349 0.005321
    Brian Roberts Mike Mussina 26 14 0.538 0.286 0.264 0.283 0.005446
    Miguel Tejada Justin Miller 5 5 1.000 0.298 0.319 0.354 0.005562
    Adrian Beltre Javier Vazquez 6 5 0.833 0.277 0.250 0.260 0.005580
    Brad Hawpe Duaner Sanchez 6 5 0.833 0.259 0.268 0.260 0.005633
    Reed Johnson Johan Santana 12 7 0.583 0.277 0.205 0.214 0.005819
    Doug Glanville Ramon Ortiz 6 5 0.833 0.240 0.291 0.263 0.005865
    Geoff Blum Jeff Suppan 11 7 0.636 0.237 0.271 0.242 0.006177
    Tony Clark Jeff Weaver 6 5 0.833 0.258 0.276 0.267 0.006338
    Jimmy Rollins Salomon Torres 6 5 0.833 0.281 0.254 0.268 0.006427
    Jose Guillen Pedro Martinez 11 7 0.636 0.295 0.219 0.245 0.006643
    Jorge Posada Curt Schilling 13 8 0.615 0.271 0.252 0.256 0.006658
    Travis Hafner Bartolo Colon 15 0 0.000 0.295 0.255 0.284 0.006699
    Kevin Millar Rick Bauer 6 5 0.833 0.282 0.256 0.271 0.006768
    Eric Byrnes John Halama 6 5 0.833 0.260 0.280 0.274 0.007111
    Marlon Byrd Greg Maddux 8 6 0.750 0.271 0.271 0.275 0.007151
    Keith Ginter Juan Cruz 9 6 0.667 0.244 0.257 0.235 0.007267
    Pat Burrell Horacio Ramirez 16 9 0.563 0.249 0.266 0.249 0.007326
    Damion Easley Kevin Millwood 7 5 0.714 0.229 0.257 0.221 0.007326
    Eric Chavez Horacio Ramirez 6 5 0.833 0.275 0.266 0.275 0.007328
    Carlos Delgado Jorge Julio 6 5 0.833 0.292 0.251 0.276 0.007364
    Eddie Perez Randy Johnson 7 5 0.714 0.254 0.232 0.221 0.007378
    Gregg Zaun Jon Garland 11 7 0.636 0.254 0.262 0.249 0.007396
    Aramis Ramirez Garrett Stephen 10 7 0.700 0.296 0.254 0.283 0.007423
    Cody McKay Dan Miceli 5 4 0.800 0.230 0.240 0.206 0.007520
    Ichiro Suzuki Mark Buehrle 19 12 0.632 0.330 0.271 0.334 0.007638
    Aramis Ramirez Tony Armas 6 5 0.833 0.296 0.249 0.278 0.007644
    Alex Rodriguez Jamie Moyer 23 13 0.565 0.302 0.267 0.302 0.007645
    A.J. Pierzynski Mike Mussina 6 5 0.833 0.281 0.264 0.278 0.007692
    Alfonso Soriano Erik Bedard 6 5 0.833 0.280 0.265 0.278 0.007706
    Abraham Nunez Josh Fogg 6 5 0.833 0.256 0.289 0.278 0.007711
    Scott Rolen Brett Myers 10 7 0.700 0.289 0.262 0.285 0.007741
    Eric Chavez Jeff Weaver 10 7 0.700 0.275 0.276 0.285 0.007749
    Luis Gonzalez Jim Brower 8 6 0.750 0.280 0.266 0.280 0.007762
    Corey Koskie Adam Bernero 6 5 0.833 0.266 0.280 0.279 0.007822
    Adam Kennedy Esteban Loaiza 8 6 0.750 0.282 0.265 0.280 0.007877
    Edgar Renteria Rodrigo Lopez 13 0 0.000 0.297 0.279 0.311 0.007885
    Rob Mackowiak Chris Carpenter 20 10 0.500 0.261 0.237 0.231 0.007902
    Carlos Lee Brian Anderson 33 4 0.121 0.287 0.299 0.320 0.007963
    Adrian Beltre Jae-Weong Seo 6 5 0.833 0.277 0.270 0.280 0.007975
    Paul Konerko John Halama 6 5 0.833 0.267 0.280 0.281 0.008061
    Antonio Perez Jason Schmidt 7 5 0.714 0.280 0.214 0.226 0.008093
    Mark Bellhorn Jose Contreras 5 4 0.800 0.239 0.236 0.210 0.008115
    Russell Branyan Kip Wells 7 5 0.714 0.237 0.255 0.226 0.008180
    Rod Barajas Bartolo Colon 18 0 0.000 0.244 0.255 0.234 0.008328
    Jim Edmonds Jason Jennings 13 0 0.000 0.280 0.293 0.308 0.008382
    David Newhan Jon Lieber 6 5 0.833 0.271 0.280 0.285 0.008564
    Craig Counsell Scott Linebrink 5 4 0.800 0.246 0.232 0.214 0.008624
    Carlos Zambrano Chris Carpenter 7 5 0.714 0.258 0.237 0.229 0.008638
    Geoff Jenkins Tim Redding 14 9 0.643 0.283 0.285 0.302 0.008699
    Miguel Olivo Ted Lilly 5 4 0.800 0.229 0.249 0.214 0.008702
    Xavier Nady Kerry Wood 5 4 0.800 0.262 0.219 0.215 0.008885
    Alfonso Soriano Kevin Brown 15 9 0.600 0.280 0.265 0.278 0.008916
    Pat Burrell Aaron Cook 8 6 0.750 0.249 0.306 0.287 0.008929
    Royce Clayton Rick White 6 5 0.833 0.260 0.294 0.288 0.009009
    Fernando Vina Matt Clement 5 4 0.800 0.243 0.239 0.217 0.009213
    J.T. Snow Kevin Brown 6 5 0.833 0.291 0.265 0.289 0.009247
    Kevin Millar Scot Shields 9 6 0.667 0.282 0.232 0.246 0.009283
    Juan Pierre Roy Oswalt 10 7 0.700 0.303 0.258 0.294 0.009315
    Richard Hidalgo Mike Mussina 11 7 0.636 0.262 0.264 0.259 0.009398
    Ramon Martinez Scott Linebrink 7 5 0.714 0.268 0.232 0.233 0.009438
    Miguel Cabrera Adam Eaton 10 7 0.700 0.300 0.261 0.294 0.009463
    Casey Blake Wade Miller 7 5 0.714 0.257 0.244 0.235 0.009794
    Brad Ausmus Mike Remlinger 5 4 0.800 0.244 0.242 0.221 0.009841
    Phil Nevin Darren Oliver 6 5 0.833 0.270 0.290 0.293 0.009866
    John Mabry Brandon Webb 7 5 0.714 0.258 0.244 0.236 0.009882
    Mark Teixeira Aaron Sele 12 0 0.000 0.282 0.303 0.319 0.009886

    Once again, we can never say for sure that these are statistically significant (i.e., that they reflect something the model does not capture), but they are much more likely to be so.
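
    As a rough illustration of how a probability like the one in the last column could be computed, here is a minimal sketch. It assumes the columns are at-bats, hits, observed average, batter average, pitcher average, expected average, and probability, and that the probability is a binomial tail taken on whichever side of the expected average the observed result fell. The function names are hypothetical and this is not necessarily the exact calculation used in the article.

```python
from math import comb


def binomial_pmf(n, k, p):
    """Probability of exactly k hits in n at-bats with hit probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def matchup_probability(at_bats, hits, expected_avg):
    """Tail probability of a result at least as extreme as the one observed.

    Assumption: the upper tail is used when the batter did better than
    expected and the lower tail when he did worse.
    """
    if hits >= at_bats * expected_avg:
        # Batter outperformed the expectation: P(X >= hits)
        outcomes = range(hits, at_bats + 1)
    else:
        # Batter underperformed the expectation: P(X <= hits)
        outcomes = range(0, hits + 1)
    return sum(binomial_pmf(at_bats, k, expected_avg) for k in outcomes)


# Inputs taken from rows of the table above
print(matchup_probability(7, 7, 0.367))    # Todd Helton vs. Damian Moss
print(matchup_probability(19, 0, 0.288))   # Rondell White vs. Jake Westbrook
print(matchup_probability(12, 9, 0.282))   # Clint Barmes vs. Odalis Perez
```

    Under these assumptions the results land close to the listed values (about 0.0009 for the first row); the small differences would be consistent with the expected averages being rounded to three decimals in the table.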

    The second question was why I thought certain pitchers did well against certain hitters and vice versa. This question sprang from the realization that Brian Anderson made the article's top 25 list of lowest-hit matchups three times, once each with Garrett Anderson, Tony Graffanino, and Carlos Lee. At first glance one might think it had something to do with platoon effects, but of course neither Lee nor Graffanino bats left-handed. And in any case Anderson doesn't have large split differences (.282/.325/.484 vs. lefties the last three years and .304/.341/.510 vs. righties).

    If these matchups are indeed significant, it tells me that Anderson likely has something in his delivery that gives certain hitters trouble. For example, his arm angle may be difficult for some hitters to pick up. Or there may be something in his repertoire that is hard to deal with for hitters who otherwise hit certain pitches well. I looked at all of Anderson's matchups (116 or so) and couldn't discern much of a pattern, so I'm still somewhat at a loss to explain it, if indeed it isn't simply randomness.

    If anyone has better ideas I'm all ears.

    Monday, November 14, 2005

    Reaching on Error

    A while back I wrote a post about Willie Wilson and reaching on errors. There I called into question John Miller's recollection that Bill James once wrote that Wilson had reached base on errors 31 times in a season. I doubted the statement, since the most I could find in the period 2000-2004 was 16 by Miguel Tejada in 2004.

    Well, I've now loaded all 3.7 million records of play-by-play data for 1970-1992 and ran a query to see who had reached base on errors (ROE) the most in individual seasons. The winners, with 15 or more ROEs in a season during the period, were...

    1985 Wally Backman 26
    1985 Bob Meacham 26
    1987 Wally Backman 22
    1986 Willie McGee 21
    1977 Bert Campaneris 20
    1979 Jack Clark 20
    1986 Carney Lansford 20
    1984 Al Wiggins 20
    1990 Mariano Duncan 20
    1975 Dave Cash 19
    1974 Lou Brock 18
    1973 Pete Rose 18
    1975 Thurman Munson 18
    1975 Claudell Washington 18
    1982 Garry Templeton 18
    1985 Dan Gladden 18
    1986 Dan Gladden 18
    1982 Rafael Ramirez 17
    1983 Garry Templeton 17
    1975 Ralph Garr 17
    1977 Bill Almon 17
    1979 Al Cowens 17
    1973 Mickey Stanley 17
    1971 Cesar Tovar 17
    1984 Steve Sax 17
    1991 Cal Ripken 17
    1970 Mickey Stanley 16
    1971 Sandy Alomar 16
    1974 Larry Bowa 16
    1972 Don Money 16
    1978 Larry Bowa 16
    1976 Bill Russell 16
    1982 Pedro Guerrero 16
    1980 Willie Wilson 16
    1989 Dan Gladden 16
    1985 Mariano Duncan 16
    1986 Mariano Duncan 16
    1988 Dan Gladden 16
    1989 Roberto Alomar 16
    1982 Paul Molitor 15
    1983 John Castino 15
    1983 Tim Wallach 15
    1984 Julio Franco 15
    1983 Ryne Sandberg 15
    1977 Gary Matthews 15
    1977 Amos Otis 15
    1975 Felix Millan 15
    1975 Manny Trillo 15
    1973 Tommy Harper 15
    1975 Bucky Dent 15
    1974 Len Randle 15
    1974 Rennie Stennett 15
    1970 Aurelio Rodriguez 15
    1986 Bob Horner 15
    1985 Tom Herr 15
    1985 Glenn Hubbard 15
    1984 Ryne Sandberg 15
    1989 Ricky Jordan 15
    1990 Shawon Dunston 15
    1990 Joe Carter 15

    What I get out of this list is that reaching on errors has a speed component, since the list is dominated by speedsters, but also an element of luck (Larry Bowa made the list twice, after all), mixed with right-handed hitters who pulled the ball in the hole (Horner, Carter, Clark, Guerrero). Dan Gladden makes the list four times (1985, 1986, 1988, 1989).

    You'll also notice that Wilson's best season was 1980, when he reached 16 times on errors.
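
    For anyone who wants to reproduce a count like the one above, here is a minimal sketch of the aggregation. It assumes the play-by-play data has been reduced to (season, batter, event) rows with reach-on-error plays marked "ROE"; that layout, the function name, and the placeholder rows are all hypothetical and are not the schema or query actually used for the list.

```python
from collections import Counter

# Hypothetical placeholder rows: (season, batter, event code). Not real data.
plays = [
    (1985, "Batter A", "ROE"),
    (1985, "Batter A", "ROE"),
    (1985, "Batter A", "ROE"),
    (1985, "Batter B", "ROE"),
    (1985, "Batter B", "1B"),
    (1986, "Batter A", "ROE"),
    (1986, "Batter A", "ROE"),
]


def roe_by_season(rows, minimum=1):
    """Count reach-on-error plays per (season, batter), highest totals first."""
    counts = Counter(
        (season, batter) for season, batter, event in rows if event == "ROE"
    )
    # Sort by descending count, then by (season, batter) to break ties
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(season, batter, n) for (season, batter), n in ranked if n >= minimum]


for season, batter, total in roe_by_season(plays, minimum=2):
    print(season, batter, total)
```

    Raising `minimum` to 15 on a full data set would give the same kind of cutoff used for the list.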