Thursday, September 30, 2004

Reds = Spoilers

I'm too depressed to write much about the Cubs tonight. Once more a lack of offensive production spoiled a beautiful 16-strikeout pitching performance by Mark Prior, as well as Sammy Sosa tying Harmon Killebrew at 573 for 7th place on the all-time home run list. Adam Dunn also set the single-season strikeout mark and now has 191. Three extra-inning losses in the last week have been devastating.

With the Giants set to win tonight, the Cubs will need to sweep the Braves this weekend and have both the Giants and Astros lose at least one game. In the event of a tie, I understand the playoff game will not be played at Wrigley, but I'm not sure who would have to play the extra game.

I spent an hour the other day updating the Big League Pocket Manager application per a suggestion from Jon Box. I've added a grid control that shows all of the strategy options simultaneously. Options that aren't viable because of the runners or outs situation are simply left blank. When a strategy actually lowers the odds a "No" is placed in the column. Here is a typical example.

Wednesday, September 29, 2004

Royals Make History

Not the kind of history you want but the 5-2 loss to the Indians tonight was the 101st loss of the season, setting a new team record and surpassing the 2002 squad's 62-100 finish.

On a lighter note, I'm told that during the ballgame the Indians' veteran players absconded with the rookies' street clothes and replaced them with various costumes - ballerinas, cheerleaders, and so forth. After the game the rookies were instructed to change into their new outfits and head outside to sign autographs. It's their last road trip of the season.

Struggling Royals

I'm scoring the Royals/Indians game for tonight and note the following struggling Royals hitters:

  • Angel Berroa is 4 for his last 33
  • Dee Brown is 2 for his last 24
  • Ken Harvey only has 1 extra base hit in his last 73 at bats dating back to August 7th
  • Abraham Nunez is 0 for 15 and 2 for his last 27
  • Desi Relaford is 0 for 14 and 2 for his last 39

On the plus side of the ledger, David DeJesus and John Buck continue to hit and Calvin Pickering does what he does: hit homeruns, walk, and strike out. Tonight Mike Wood is keeping the ball down but has given up a few hits, including a 420-foot homerun to Ben Broussard. Indians 2 Royals 2, bottom of the 4th.

Slipping Away?

Well, another disappointing day at the Friendly Confines. Once again LaTroy Hawkins couldn't close the deal and blew his 9th save opportunity in 33 tries. Overall, though, the pitching was great - you simply have to win games when you give up only 4 runs in 12 innings. The story of the game was lost opportunities by the offense - a theme of this year's edition of the Cubs. Bases-loaded chances in the 3rd and 7th yielded only a single run, and a first and third with nobody out in the 12th only brought home one run when Moises Alou grounded into a double play. Today's game and the 4-3 loss to the Mets over the weekend will be the two games that the Cubs look back on if they can't pull it together.

Yesterday Steve Stone commented that the Cubs lineup was formidable. He's mistaken. For all their power (227 homeruns) they're only 6th in runs scored at 765. To put that in perspective they are 2nd in the history of the major leagues in homeruns as a percentage of runs with 29.6%. The top 10 before this season...

Year Team HR   R HR/R

1964 MIN 221 737 .300
1957 KC1 166 563 .295
2001 SFN 235 799 .294
1963 MIN 225 767 .293
1961 NYA 240 827 .290
1987 CHN 209 720 .290
1987 BAL 211 729 .289
2003 TEX 239 826 .289
1956 CIN 221 775 .285
1997 SEA 264 925 .285

The reason of course is that they are 14th in walks with only 473. This accounts for the fact that they've hit 135 solo homeruns and that over 40% of their runs score on homeruns. This one-dimensional offensive team simply cannot score runs without the aid of a homerun.
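For anyone who wants to check the ratio column, here is a minimal Python sketch that recomputes homeruns as a share of runs from the figures quoted above. The function and variable names are my own, and the 2004 Cubs row simply plugs in the 227 homeruns and 765 runs cited in this post.

```python
# Home runs as a share of runs scored, using the figures quoted in the post.
# Each tuple is (year, team, home_runs, runs).
PRIOR_TOP_10 = [
    (1964, "MIN", 221, 737),
    (1957, "KC1", 166, 563),
    (2001, "SFN", 235, 799),
    (1963, "MIN", 225, 767),
    (1961, "NYA", 240, 827),
    (1987, "CHN", 209, 720),
    (1987, "BAL", 211, 729),
    (2003, "TEX", 239, 826),
    (1956, "CIN", 221, 775),
    (1997, "SEA", 264, 925),
]
CUBS_2004 = (2004, "CHN", 227, 765)  # totals cited in the post


def hr_per_run(home_runs, runs):
    """Home runs divided by runs scored."""
    return home_runs / runs


def rank_among(team, others):
    """1-based rank of `team` by HR/R if it were added to `others`."""
    ratio = hr_per_run(team[2], team[3])
    better = sum(1 for _, _, hr, r in others if hr_per_run(hr, r) > ratio)
    return better + 1


if __name__ == "__main__":
    for year, team, hr, r in PRIOR_TOP_10 + [CUBS_2004]:
        print(f"{year} {team} {hr:3d} {r:3d} {hr_per_run(hr, r):.3f}")
    print("2004 Cubs rank:", rank_among(CUBS_2004, PRIOR_TOP_10))
```

Note that this ratio is homeruns divided by runs, which is a different measure from the share of runs that actually score on homeruns.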

With four games to go the Cubs now need some help from the Padres and Cardinals tonight since they've relinquished their place in the driver's seat. Roger Clemens goes for the Astros against Jeff Suppan while Noah Lowry goes for the Giants against David Wells. Lowry especially has pitched well recently. In his last three starts he's given up only 3 runs in 22.67 innings. Clemens has also thrown 15.33 scoreless innings, although he is coming off a 120-pitch effort against the Brewers, his second longest of the season. The Cubs could easily find themselves in 3rd in the wildcard race tomorrow morning...

Pitch Counts Again

Ran into this interesting discussion on pitch counts and Dusty Baker in the wake of the 124-pitch outing by Carlos Zambrano the other day against the Reds. In Zambrano's last seven starts he's gone 115, 125, 119, 103, 112, 117, 124 - a hefty workload for any pitcher, let alone a 23-year-old who you'll be counting on in the years ahead.

I’ve done a bit of analysis on this question in the past as well and tend to believe that it isn't just the number of pitches that's the issue; for example, there is no magic to the number 100. However, running a pitcher out there for innings when he is already fatigued leads to a higher likelihood of injuries as his mechanics break down.

Something that struck me when reading this was the response to the statement that pitchers in the past had always thrown lots of pitches.

Back in The Good Old Days, if a pitcher blew out his arm, it was back to the farm or the factory for him, and he was forgotten. Nowadays, there's a good chance that modern medicine can fix him up and get him back on the mound. So there's been a change in the way in which pitchers are looked at. Instead of being thought of as disposable commodities, they're thought of more as scarce resources, and more and more effort and thought is put into managing those resources.

To me, this rings true. Just take the case of the Cubs' Ryan Dempster. Dempster is a guy who was injured in 2002 at the age of 24 and is just coming back from Tommy John surgery. When you look at his history it's not hard to understand how that happened.

Age Year Innings

18 1995 40
19 1996 170.67
20 1997 165.33
21 1998 132.67
22 1999 147
23 2000 226.33
24 2001 211.33

That's a pretty big workload for such a young pitcher, almost 1,100 innings before the age of 25. And this especially for a pitcher who both strikes out and walks an above average number of hitters. Dempster's Batters Faced Per Start (BFS) in 2000 and 2001 was 29.5 and 28.1, fairly high totals. He also averaged over 105 pitches per start in those two seasons and 99 as a 23-year-old. Had the Marlins been a bit more cautious with Dempster he may not have had the injury problems he's had.

One Year Anniversary

One year ago today I wrote my first blog post on this site and this is the 366th post. Wondering how many hours I spent in the last year....

Tuesday, September 28, 2004

Pocket PC Virus

Here's an interesting series of articles on the "first" Pocket PC virus WinCE4.Dust from its inventor. Here's another recent article on the same topic that discusses a Trojan infecting Pocket PCs.

Here's a short snippet from our book on this topic included in chapter 9, "Securing Compact Framework Solutions"...

Even though viruses that target mobile devices are not as prevalent as those targeting desktop computers, they are still a potential threat to any software on the device. Typically, however, devices are not damaged by viruses but rather pass them into a corporate network via email attachments and documents.

There are, as you might expect, a variety of anti-virus packages on the market for devices such as the Pocket PC from vendors that include McAfee, Computer Associates, and F-Secure. Additionally, personal firewall products such as VPN-1 are available from vendors such as Check Point Software Technologies Ltd.

If anti-virus and third-party authentication software is installed on the device it is a best practice to place the software in flash ROM rather than simply in RAM on the device. In this way, the software will survive a hard reset in which the user removes the batteries. However, storing your application in RAM can also work in your favor since the application will be lost when performing a hard reset, thereby disallowing access to it.

Caching Data with SQL Server CE

Just found out that the Windows IT Library has posted a chapter from our book. Here's the abstract from the chapter:

In this chapter from "Building Solutions with the Microsoft .NET Compact Framework: Architecture and Best Practices for Mobile Development," you'll learn about the role of Microsoft SQL Server 2000 Windows CE Edition (SQL Server CE) in the enterprise. You'll also find out about robust data caching, administering SQL Server CE, and SQL Server CE architecture.

Sunday, September 26, 2004

Cubs Struggle in New York

Well, the Cubs didn't fare too well in their visit to New York. After a terribly disappointing defeat on Saturday in 11 innings 4-3, they again lost a one-run game 3-2 today. However, with the Giants' loss today they remain a half-game up on the Giants and in the all-important loss column. Their destiny is still in their hands heading into the next series with the Reds.

Not much to say about Saturday's game although I was a bit surprised that Dusty brought in Ryan Dempster in the 8th inning. To me Dempster hasn't shown that he's anywhere near his 2000-2001 form. In addition, he has always had a tendency to walk batters (over 4.5 per 9 innings career) and so his high number of walks - now 12 in 17 and 2/3 innings - is not really surprising. That makes him particularly unsuited to a relief role in tight games. On Saturday he walked 2 batters, both of whom scored on the Diaz homerun off of LaTroy Hawkins. Hawkins is now just 24 of 32 in save situations, a horrible percentage. I've never been of the belief that some guys are "setup men" and some are "closers" but Hawkins' varying performance in those roles has got me wondering...

Of course, the game was actually lost earlier when Sammy Sosa left 8 men on base, striking out 4 times and grounding into a double play. He seems to have lost all confidence. I was hoping Dusty would pinch hit for him in the late innings but of course Dusty wouldn't do that to a veteran. At this point I'd prefer to see Ben Grieve in right field for a couple days (although he looked terrible in striking out today by kind of swinging at three pitches right down the middle). Either way Sosa needs to move up on the plate and be aggressive on the inside pitch. He falls away from everything middle-in, pulling off the ball, and of course can't hit anything on the outer half. I think you can trace this back to when he was beaned for the second time last year by the Expos' Zach Day. Ever since then he flinches more on inside pitches and doesn't attack them. Normal aging has also played a role no doubt and his bat is slower now than a couple years ago.

Today's game was simply dull. Kerry Wood looked great after a first inning where he hit 2 batters and walked who knows how many to hand the Mets 3 runs. The offense once again looked pathetic managing just three hits. At least Sammy didn't strike out and even drove in a run with a weak ground ball. The most controversial decision of the day was that to allow Wood to bat for himself in the 7th inning trailing by a run. I think it was an ok decision given that Wood is a decent hitter and the state of the bullpen.

7 games to go....

Saturday, September 25, 2004

Cubs and Royals 9/24

The Cubs extended their lead in the Wild Card race with an exciting 2-1 victory over the Mets last night as the Giants lost to the Dodgers 3-2 despite homerun number 702 from Barry Bonds.

I was in Indianapolis all day for work and so taped the game and watched it late last night. Glendon Rusch pitched 6 solid innings, Kyle Farnsworth looked a bit shaky in the 7th, Jon Leicester was very solid in the 8th and 9th as was Mike Remlinger, and LaTroy Hawkins pitched a pretty good 10th to pick up the save. Kris Benson for the Mets pitched very well and had his slider working. Umpire Dale Scott had a pretty big outside corner that Benson used well. In fact, Benson's only real mistake was a hanging slider to Aramis Ramirez in the 7th that wound up in the left-field seats to tie the game. In the 10th, Derrek Lee singled in Mark Grudzielanek who had walked. Lee then stole second off Braden Looper/Mike Piazza on a pitchout and was later thrown out at third on an ill-conceived stolen base attempt. Other Notes:

  • Piazza looked terrible behind the plate in the late innings (giving up a wild pitch in the 9th that almost cost the game) after playing first most of the game. Sammy Sosa also looked confused at the plate, fouling off the 2 or 3 good pitches he got and either striking out or weakly grounding out.
  • Yesterday's lineup was what I consider the best lineup the Cubs can put on the field and it was the first time I remember seeing it. I hope Dusty rides it for the last 9 games.

    CF Corey Patterson
    1B Lee
    3B Ramirez
    LF Moises Alou
    RF Sammy Sosa
    SS Nomar Garciaparra
    2B Todd Walker
    C Michael Barrett
  • Patterson is also once again struggling at the plate and seems to have lost his plate discipline. He's getting down in the count early by swinging at the first two pitches almost every at bat.
  • Today the Cubs will go with Mark Prior against Aaron Heilman, starting only his 4th game of the season. He pitched well last time out against the Pirates on 9/19.

On my way home from the airport I listened to the last inning of the Royals 8-6 victory over the White Sox. John Buck hit two homeruns and a single to raise his season average to .243. When Buck first came over from the Astros he looked totally overmatched at the plate. Apparently, Jeff Pentland worked with him to not be so much of an arm swinger and that has helped. Since August 1st Buck has hit .286 (39-136) with a .647 slugging percentage (8 doubles and 11 homeruns). He still has no plate discipline (7 walks and 40 strikeouts) but he's making better contact and he's a strong guy who'll get his share of homeruns. You have to figure that he'll be the starting catcher when spring training starts and that Benito Santiago will take a backup role if he's still with the club.

On another front Calvin Pickering is performing as advertised (.271/.531/.357) and has driven in 22 runs in 27 games. Since the Royals signed Matt Stairs to a contract next season that appears to leave Pickering out in the cold unless they can deal Ken Harvey or Mike Sweeney.

The most inexplicable moment of the game came in the 9th inning when the White Sox had already scored 2 runs off of Mike MacDougal to come to within 8-6. With runners on first and second and still nobody out Aaron Rowand (.317/.550/.369) was asked to bunt by manager Ozzie Guillen. I'm certain it was Guillen's choice since Rowand squared around the pitch before he actually bunted as the Royals announcers mentioned. I'm not sure where Guillen is from but on this planet anyway, that's just plain wrong wrong wrong. The run potential in that situation is 1.573. If Rowand is successful it goes down to 1.467 and only raises the scoring probability by 5.4% (to 69.5%). And that's with an average hitter at the plate - Rowand is far above average. It only gets worse when you consider that a lesser hitter, Juan Uribe (.276/.496/.322), was on deck. At best, the bunt in that situation is a one-run strategy (the break-even percentage to score one run is 79.9%).
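The arithmetic behind that paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are my own, I'm treating the bunt as having just two outcomes (it works or it doesn't), and I'm reading the "5.4%" as percentage points. Under that reading you can back out the chance of scoring after a failed bunt that the 79.9% break-even figure implies.

```python
def break_even(p_before, p_success, p_failure):
    """Success rate at which attempting the play leaves the chance of
    scoring at least one run unchanged (two-outcome model, an assumption)."""
    return (p_before - p_failure) / (p_success - p_failure)


def implied_failure_prob(p_before, p_success, break_even_rate):
    """Invert break_even(): the scoring probability after a failed attempt
    that a given break-even rate implies."""
    return (p_before - break_even_rate * p_success) / (1 - break_even_rate)


# Figures quoted in the post.
RUNS_BEFORE = 1.573  # run potential, first and second, nobody out
RUNS_AFTER = 1.467   # run potential after a successful sacrifice
P_AFTER = 0.695      # chance of scoring at least once after a successful bunt
BREAK_EVEN = 0.799   # break-even success rate for scoring one run

# ASSUMPTION: the 5.4% increase is in percentage points, not relative.
P_BEFORE = P_AFTER - 0.054

run_cost = RUNS_AFTER - RUNS_BEFORE  # negative: expected runs given up
p_fail = implied_failure_prob(P_BEFORE, P_AFTER, BREAK_EVEN)

if __name__ == "__main__":
    print(f"Expected runs given up by a successful bunt: {run_cost:.3f}")
    print(f"Implied scoring chance after a failed bunt: {p_fail:.3f}")
    print(f"Round trip break-even: {break_even(P_BEFORE, P_AFTER, p_fail):.3f}")
```

The point of the sketch is that even a successful bunt costs about a tenth of a run in expectation, so it only makes sense when one run is all you need and the bunter gets it down nearly four times in five.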

Although I haven't watched many White Sox games this season (I try to avoid their announcer "Hawk" Harrelson, see this post for a discussion of why), this sounds like what Baseball Prospectus described in its essay on the White Sox.

"It is natural to assume that, as a field manager, Guillen will prefer the same style of play that he practiced. And Ozzie Guillen, though a smart, eloquent man, was one wicked stupid baseball player, running reckless on the basepaths, swinging at any pitch within a foot of the plate. That approach, of course, would be disastrous for the Pale Hose, whose aging collection of sluggers is far more South-side Hitman than Go-Go Sock."

And there is some evidence that Guillen has led the Sox down his own path. Although tied for 2nd in baseball with 222 homeruns and 5th in runs scored with 818, they are 21st in walks (475) and first in the AL in sacrifice hits with 57. They've also stolen 76 bases but been caught 46 times for a very unimpressive 62.3%.

Thursday, September 23, 2004

Ichiro and Sisler

Here are a couple of interesting posts from David Pinto over on Baseball Musings. He's been giving odds of Ichiro Suzuki breaking George Sisler's single-season hits record of 257. With 9 hits in the last two days Ichiro's odds have skyrocketed to 97% based on this year's performance and 92% based on his career numbers. On the high end David has him getting 23 more hits which would give him 270.

What's garnered interest in the Ichiro watch the last few days has been the critical article on CBS Sportsline. There are some interesting quotes there regarding how Bob Melvin attempted to instill a little more plate discipline in Ichiro coming out of spring training. After a slow start in April (26 for 102, .255) Melvin took the reins off so to speak. David points out that Ichiro's slow start could simply have been accounted for by chance since he also hit just .274 in June. When you look at the splits though Ichiro only walked 8 times in April (3.91 pitches per plate appearance), 7 times in May (3.17 p/pa), 10 times in June (3.78 p/pa), and 6 times in July (3.33 p/pa). This doesn't seem like much of a difference over the first four months and although he did see more pitches in April it didn't really result in many more walks. With Ichiro you'd also want to know how many times he bunted on the first pitch during each month which would tend to skew his numbers.

It's also interesting to note that his ground out to fly out ratio has shot up this season to 2.33. In previous seasons it's been in the range of 1.3 to 1.7. That might indicate that he has changed his style to slap even more than he did previously (I don't see enough Mariners games to know). In one of his at bats two nights ago I did notice that he slapped a ground ball to the right side where the second baseman fielded it just on the grass and Ichiro still beat the play by a step. When you have that kind of speed I don't blame a guy for slapping at the ball. It was his 53rd infield hit of the season.

There's also been a raging debate on the value of Ichiro's performance on SABR-L the last few weeks. Here are my thoughts in a nutshell:

1. The Mariners' performance this season has nothing to do with Ichiro's. He performed about the same in their 116 win season and can certainly not be hurting the club.

2. Ichiro is a very capable leadoff hitter. His OBP is .416, near the top for leadoff hitters in the AL. He'll score 100 runs again for the 4th consecutive season.

3. That said, his total value as a hitter (his defense and base running are superb) is not in the same league with guys like Barry Bonds, Albert Pujols, and Manny Ramirez because he doesn't get any extra base hits (just 36 thus far to go with 211 singles). While his record chase is exciting and is good for baseball it doesn't have inherent offensive value. Ichiro's OPS is .877, about the same as Miguel Tejada, Mike Lowell, Carlos Lee, and Vinny Castilla.

4. If indeed Ichiro has more power than he shows as the rumors go, he should at least experiment with it to see if he couldn't trade 30 points of batting average for 20 homeruns. Not doing so is hurting his team.

Brooms Out in Pittsburgh

Three games down, ten to go for the Cubs who swept their three game set with the Pirates today winning 6-3. The Cubs finished their season series with the Pirates 13-5. Greg Maddux (15-10) won his 15th game of the season which extends his streak going back to 1988. He's now won 304 games in his career, 110 as a Cub. He also topped 200 innings. Except for 2002 when he pitched 199 and 1/3 he's also pitched that many every season since 1988.

In last night's game Sammy Sosa, struggling at the plate (.241/.448/.279) in September saved the game with a nice diving catch with the bases loaded in the bottom of the 8th inning.

The Giants continue to win, beating the Astros again last night so as of right now the Cubs are tied atop the Wild Card. However, the Giants are now only 1/2 game behind the Dodgers for tops in the NL West. The Giants will send Jason Schmidt to the mound against young Brandon Backe so I wouldn't bet the Astros will salvage the final game. The Dodgers go with Kazuhisa Ishii against David Wells. If the Giants win and the Dodgers lose the Cubs will be tied in the Wild Card with the Dodgers at 86-66. If the Giants and Dodgers both win Houston will fall to 4 games back with 70 losses and be all but out of it.

At this point it's tough to know who to root for. Right now best case scenario is for the Astros to win tonight and the Giants to get swept by the Dodgers over the weekend. However, if the Giants win and the Dodgers lose tonight then I'm rooting for a Giants sweep this weekend. The Dodgers do have the easier schedule of the two because of playing Colorado.

It's going to be a great weekend of baseball.

Upcoming schedules:

Cubs:
9/24 New York (66-86)
9/25 New York
9/26 New York
9/27 Cincinnati (69-82)
9/28 Cincinnati
9/29 Cincinnati
9/30 Cincinnati
10/1 Atlanta (89-63)
10/2 Atlanta
10/3 Atlanta

Giants:
9/23 Houston (83-69)
9/24 Los Angeles (86-65)
9/25 Los Angeles
9/26 Los Angeles
9/27 Off day
9/28 San Diego (82-70)
9/29 San Diego
9/30 San Diego
10/1 Los Angeles
10/2 Los Angeles
10/3 Los Angeles

Dodgers:
9/23 San Diego (82-70)
9/24 San Francisco (86-66)
9/25 San Francisco
9/26 San Francisco
9/27 Colorado (65-85)
9/28 Colorado
9/29 Colorado
9/30 Colorado
10/1 San Francisco (86-66)
10/2 San Francisco
10/3 San Francisco

Tuesday, September 21, 2004

First Pitch Redux

In a previous post I included some data from Dave Smith at Retrosheet about swinging at the first pitch. Here are the highest and lowest percentages for 2003 regular players (defined as players with 300 plate appearances) also provided by Smith.

Nomar Garciaparra 52.6
Vinny Castilla 49.5
Vladimir Guerrero 47.8
Jacque Jones 45.8
Andruw Jones 42.5
Randall Simon 42.5
A.J. Pierzynski 42.2
Craig Wilson 41.6
Austin Kearns 40.5
Brandon Phillips 40.2

Scott Hatteberg 5.8
Jason Kendall 7.5
Todd Zeile 7.6
Mark Ellis 7.7
Johnny Damon 8.4
Craig Counsell 8.5
Bobby Abreu 8.9
Dave Roberts 10.0
Aaron Guiel 10.0
David Eckstein 10.1

Along with some other notables:

Derek Jeter 35.6
Jim Thome 32.1
Albert Pujols 26.3
Alex Rodriguez 25.6
Ichiro Suzuki 24.3
Barry Bonds 23.6
Frank Thomas 15.6

Note that the average for regular players is 26.5% and for all players 27.3%.

One of the interesting questions that came up on SABR-L is whether the parity (both OPS values were .759) between first pitch swingers and first pitch takers simply reveals that each strategy is equally effective and that hitters therefore play to their strengths. In other words, perhaps you can't really say that taking the first pitch is any "better" than not taking the pitch.

Another way of looking at the split data posted earlier, suggested by Cyril Morong, is to calculate how many runs created per 27 outs each approach represents using the most basic formula, RC/27 = (((H+BB)*TB)/(AB+BB))/(AB-H)*27. If a typical hitter were given 600 plate appearances you would get:

          AVG   OBP   SLG   OPS    H  BB   TB    RC RC/27
Swing   0.282 0.302 0.457 0.759  164  17  266  80.6  5.20
NoSwing 0.259 0.346 0.413 0.759  137  72  219  75.6  5.20

So whether you swing or not on the first pitch you end up with exactly the same amount of production, 5.20 RC/27 outs. First pitch swingers actually produce 6.5% more runs but use up 26 more outs in the process, which brings them in line. So this also confirms that it is a wash and that OPS tracks very well with other run creation formulas.
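Here is a small Python sketch of that basic runs created calculation, using the hits (164 and 137), walks (17 and 72), and total bases (266 and 219) from the two lines above. The function names are mine, and I'm assuming every one of the 600 plate appearances is either an at bat or a walk. Because the table values are rounded to whole hits and walks, plugging them back in lands near 5.20 for both lines rather than exactly on it.

```python
def runs_created(h, bb, tb, ab):
    """Basic runs created: (H + BB) * TB / (AB + BB)."""
    return (h + bb) * tb / (ab + bb)


def rc_per_27(h, bb, tb, ab):
    """Runs created per 27 outs, counting outs as AB - H."""
    return runs_created(h, bb, tb, ab) / (ab - h) * 27


PA = 600  # plate appearances given to the "typical hitter"

# (hits, walks, total bases) as printed for each approach
LINES = {
    "Swing": (164, 17, 266),
    "NoSwing": (137, 72, 219),
}


def summarize(h, bb, tb, pa=PA):
    """Derive at bats and outs from PA, then compute RC and RC/27."""
    ab = pa - bb  # ASSUMPTION: no HBP, sacrifices, etc.
    return {
        "AB": ab,
        "outs": ab - h,
        "RC": runs_created(h, bb, tb, ab),
        "RC/27": rc_per_27(h, bb, tb, ab),
    }


if __name__ == "__main__":
    for name, (h, bb, tb) in LINES.items():
        s = summarize(h, bb, tb)
        print(f"{name:8s} AB={s['AB']} outs={s['outs']} "
              f"RC={s['RC']:.1f} RC/27={s['RC/27']:.2f}")
```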

A More Accurate ERA

The January issue of the .NET Developer's Journal will include an editorial by Jon Box where Jon is nice enough to mention my MLB Pocket Manager application as an example of using the .NET Compact Framework. Through Derek Ferguson I discovered that the CEO of Expand Beyond Corporation Ari Kaplan is also interested in baseball statistics.

Ari has done some very interesting work in evaluating pitchers (especially relief pitchers), work that the Orioles, Expos, and Padres paid him to do. In short Kaplan devised at least three measures:

  • RE (Reliever's Effectiveness) - the number of runs a pitcher allows in a given situation divided by the number of runs expected. This stat is based on run expectancy tables like the one I used to build my pocket manager. The reason this stat is useful is that some pitchers enter a game more frequently with runners on base and some without. After all, ERA was devised during a time when most starters completed their games as shown in the following graph. The percentage of complete games today is down around 4.5%.

  • PERA (Potential ERA) - what the pitcher's ERA would be if none of the runners on base when he left the game scored
  • WERA (Worst Case ERA) - what a pitcher's ERA would be if all runners he left on base scored

To these I'll add two other independent measures that can be used to evaluate pitchers:

Component ERA (CERA)

Definition: A statistic that estimates what a pitcher's ERA should have been, based on his pitching performance.

PTB in the formula is calculated as:

When intentional walk data is not available you can use:

Also, if the ERC is less than 2.24 the formula is adjusted as follows:

History: The formula here is as it appears in the 2004 edition of The Bill James Handbook.
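For reference, here is a Python sketch of Component ERA as the formula is commonly published. Treat the constants as my assumption rather than a quote from the Handbook, since the expressions themselves are not reproduced in the text of this post. The function names and the sample pitcher are hypothetical.

```python
def ptb(h, hr, bb, hbp, ibb=None):
    """'Pitcher's total bases' term.

    ASSUMPTION: constants follow the commonly published version of the
    formula. When intentional walks (ibb) are unknown, the 0.475 variant
    is used instead of subtracting them.
    """
    hits_term = 0.89 * (1.255 * (h - hr) + 4 * hr)
    if ibb is None:
        return hits_term + 0.475 * (bb + hbp)
    return hits_term + 0.56 * (bb + hbp - ibb)


def component_era(h, hr, bb, hbp, bfp, ip, ibb=None):
    """Component ERA from hits, homers, walks, hit batters, batters faced
    and innings pitched (ip as a decimal, e.g. 170 2/3 -> 170.667)."""
    raw = (h + bb + hbp) * ptb(h, hr, bb, hbp, ibb) / (bfp * ip) * 9
    erc = raw - 0.56
    if erc < 2.24:
        # Low-ERC adjustment: scale instead of subtracting.
        erc = raw * 0.75
    return erc


if __name__ == "__main__":
    # Hypothetical pitcher, for illustration only.
    print(round(component_era(h=200, hr=20, bb=60, hbp=5,
                              bfp=850, ip=200.0, ibb=5), 2))
```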

Expected ERA (XERA)

Description: xERA represents the expected ERA of the pitcher based on a normal distribution of his statistics. It is not influenced by situation-dependent factors. xERA erases the inequity between starters' and relievers' ERAs, eliminating the effect that a pitcher's success or failure has on another pitcher's ERA. Similar to other gauges, the accuracy of this formula changes with the level of competition from one season to the next. The normalizing factor allows us to better approximate a pitcher's actual ERA. This value is usually somewhere around 2.77 and varies by league and year.

XERA = (.575 * H/9 ) + (.94 * HR/9 ) + (.28 * BB/9 ) - (.01 * K/9 ) - Normalizing Factor

History: By Gill and Reeve as found on No formula for calculating the Normalizing Factor is found on the site.
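The XERA formula above translates directly into code. A minimal Python sketch follows; the function names and the sample rates are hypothetical, and since no formula for the normalizing factor is given, I simply default it to the 2.77 figure mentioned in the description.

```python
def per_nine(count, innings):
    """Convert a raw count (hits, walks, ...) to a per-9-innings rate."""
    return count * 9 / innings


def xera(h9, hr9, bb9, k9, normalizing_factor=2.77):
    """Expected ERA from per-9-inning rates.

    ASSUMPTION: normalizing_factor defaults to 2.77, the typical value
    quoted in the description; the real value varies by league and year.
    """
    return (0.575 * h9 + 0.94 * hr9 + 0.28 * bb9 - 0.01 * k9
            - normalizing_factor)


if __name__ == "__main__":
    # Hypothetical pitcher: 9 H, 1 HR, 3 BB and 6 K per nine innings.
    print(round(xera(h9=9.0, hr9=1.0, bb9=3.0, k9=6.0), 3))
```

Notice how little weight strikeouts carry in this version (.01 per K/9) compared with hits and homeruns.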

Down to the Wire

Well, the Cubs are right now sitting one-half game out in the Wild Card race with a record of 83-66. The Giants have played one more game and are 84-66. The Astros remain a game back at 83-67. Here are the remaining schedules:

Cubs:
9/21 Pittsburgh (68-81)
9/22 Pittsburgh
9/23 Pittsburgh
9/24 New York (65-85)
9/25 New York
9/26 New York
9/27 Cincinnati (68-81)
9/28 Cincinnati
9/29 Cincinnati
9/30 Cincinnati
10/1 Atlanta (88-62)
10/2 Atlanta
10/3 Atlanta

Giants:
9/21 Houston (83-67)
9/22 Houston
9/23 Houston
9/24 Los Angeles (86-63)
9/25 Los Angeles
9/26 Los Angeles
9/27 Off day
9/28 San Diego (80-70)
9/29 San Diego
9/30 San Diego
10/1 Los Angeles
10/2 Los Angeles
10/3 Los Angeles

There is no doubt when considering strength of schedule that the Cubs should come out on top. Further to their advantage is that the Giants play the Astros three games this week which means that either the Astros will be nearly out of it or the Giants and Astros will both have about 68 losses. After the Giants the Astros finish with Milwaukee, St. Louis, and Colorado. Either way, the Cubs have their destiny in their hands. Last September the Cubs went 19-8 to come from behind to nudge the Astros. This season they've gone 11-6 but have won 7 of the last 9.

With Mark Prior's nice outing yesterday here's hoping for two and a half uninterrupted times through the rotation. Matt Clement has been more than shaky his last 5 outings since his 13 strikeout performance against the Brewers on August 24th. Kerry Wood takes the mound tonight in Pittsburgh. He pitched decently against the Reds last time out going 7 innings and giving up 4 earned runs, striking out 9. He'll be facing Josh Fogg, who pitched very well against the Cubs last time out shutting them out in 6+ innings before his bullpen let him down. The Cubs have a history of scoring early on the Pirates this season (they're 10-5 against the Pirates so far) so we'll see if they can jump in front tonight.

Would anyone have believed what Neifi Perez has done? He's still the worst offensive player imaginable but in a small sample he's hitting .373 (19 for 51) with 4 doubles and 2 homeruns and has played a very solid shortstop. Dusty continues to bat him second but it hasn't hurt the Cubs yet.

And of course the Cubs have clinched back-to-back winning seasons for the first time since 1971-1972. This year will also mark the first time since 1997 that Sammy Sosa will not lead the team in OPS. Aramis Ramirez (.950), Moises Alou (.921), and Derrek Lee (.896) are all ahead of Sammy at .863.

Sunday, September 19, 2004

The Black Prince of Baseball

In The New Bill James Historical Baseball Abstract James says of Hal Chase, or "Prince Hal" as he was known (1883-1947), the slick-fielding first baseman for the Highlanders (Yankees) (1906-1913), White Sox (1913-1914), the Buffalo franchise of the Federal League (1914-1915), Reds (1916-1918), and Giants (1919):

"Hal Chase is remembered as a shining, leering, pock-marked face, pasted on a pitch-dark soul...The secret of Hal Chase, I believe, was that he was able to reach out and embrace that evil...This is not the corrupted. This is the corrupt. No matter what his skills I would not want Hal Chase around, period, and I find it extremely difficult to believe that he ever helped any team, at all, period."

Although James also writes that he wouldn't choose Chase among a thousand players he does rate him as number 76 in his list of 100 first basemen in his book. Further, James sides with the opinion given by long-time writer Fred Lieb, that "the whole thing" - meaning the Black Sox and other scandals - all started with Chase and that had he not been a ballplayer, the corruption that eventually culminated in the dissolution of The National Commission and the installment of Judge Landis as Baseball Commissioner would not have entered the game.

The authors of The Black Prince of Baseball, which covers Chase's life and career, start with James' appraisal and use it as the springboard to paint a subtler picture of Chase. Their thesis is that Chase most certainly did not enter an organized baseball world that was free from corruption, but rather that Chase simply took advantage of a system already in place. For the authors his major crime was not his indifferent play and association with gamblers; after all, everyone bet on games and fixes were a regular occurrence. Rather, they argue that "the fixing charges had largely been in settlement of other scores" and that his "mistake" was

"in first challenging, then seeming to mock Organized Baseball's authority. Long before taking on Comiskey, Chase had shown little grasp of the reality that when Al Spalding had proclaimed baseball the culturally pure national pastime, it had followed logically that the National Commission and the 16 owners it represented were the Stars and Stripes. One didn't just come and go, only for the chance to make more money than was available elsewhere."

In other words, just as Chase's sister Jessie said after his death in 1947, Chase was the scapegoat. These "other scores" included Chase's jumping from the Highlanders back to the California State League in 1908 and from Charles Comiskey's White Sox in 1914 to the Federal League Buffalo franchise where he won a court battle with the American League and its President Ban Johnson.

The book doesn't go so far as to completely exonerate Chase; in fact, far from it. For example, as the authors document, there is little doubt about the truth of the charge that Chase did fix games, particularly in 1917 and 1918 (there is, however, plenty of doubt that he fixed games as early as 1913 when with the Yankees, as is widely believed). In 1918 Chase, feeling pinched for money because of the possibility of the baseball season being cut short due to the war (and his need for money to support his gambling and womanizing habits), and shortstop Lee Magee bet on the first game of a July 25th doubleheader with Boston. Magee wrote Chase a $500 check to cover the bet, which Chase then placed with a gambler. Magee later testified that he didn't know that Chase had bet against the Reds (it was common for players to bet on their own teams to pocket a little extra cash), although suspiciously Magee made two errors, one that sent the game into extra innings and the other that kept the Braves alive in the final inning. In an interesting irony, the Reds won when Magee couldn't help but score after reaching on an error and being driven in by Edd Roush. Chase also approached a visiting player, the New York Giants' Pol Perritt, about throwing the July 17th game. When all of this came to light Chase was suspended for the remainder of the season by manager Christy Mathewson.

By including all of the gory details, the authors' contention is not that Chase was pure but that much of what has been passed down through baseball writers and historians, particularly Lieb's contention that Chase started it all and that Chase was the ringleader of the Black Sox scandal, is simply not true. This belief in Chase as the spring of all evil reminds me of a particular form of historical revisionism that seeks to find a single cause for complex events, which Stephen Jay Gould in his essay "Jim Bowie's Letter and Bill Buckner's Legs" identified as the "but for this" canonical story.

To back up their thesis the authors devote a chapter to documenting the corruption caused by gambling in organized baseball, reaching back to 1857 (19 years before the National League was formed) and running through 1904, the year before Chase joined the Highlanders. They also devote much of another chapter to the long list of gambling-related scandals across both leagues from 1905 to 1917 that implicate, among others, Giants long-time manager John McGraw. They also show how Chase could not have played much of a role in the Black Sox scandal as he was on a barnstorming tour with the Giants during the time that most of the planning took place. Chase was summoned by the grand jury in Chicago but refused to appear without remuneration or forced extradition, neither of which was ever pursued by the authorities in Chicago. While it's clear that Chase played a small role and attended at least one initial meeting with gamblers, the groundwork for the scandal was laid before Chase was involved and its culmination took place after Chase had left the scene.

One of the most interesting revelations in the book is that John McGraw committed perjury before the grand jury investigating the Black Sox scandal. During his testimony McGraw said that he did not offer a viable contract for the 1920 season to Chase in part because he was suspicious of Chase hanging around with the ex-ballplayer Bill Burns, the principal organizer of the scandal. The contract offered to Chase for the 1920 season in the amount of $1,083.33 per month is shown as the only figure in the book. Chase remained in good standing with the Giants and could have played the 1920 season but instead chose to remain in his native California, finally deeming he'd had enough of baseball in the east or perhaps because he knew his crooked play wouldn't get him through another season as it barely did in 1918. The authors seem to pinpoint this event coupled with the hatred of Chase by Ban Johnson as a turning point in the standard history of Chase.

One of the other aspects of the book that I found interesting is its portrayal of deadball era baseball. The authors do a great job of portraying a feel for the times using newspaper accounts and that includes the free-flow of players between leagues, the relative parity of other leagues with the American and National Leagues, and the way in which ballplayers would play on semi-pro teams on Sundays and barnstorming teams during the winter to make extra cash. Of course Chase was the quintessential player in this regard and really never developed any skill other than playing baseball and so always looked for opportunities to play for cash, including sometimes being under contract to two teams at the same time and making excuses to one and then the other as he hopped back and forth.

For all of the meticulous baseball detail in the book the authors also follow the off-field events of Chase's life, including his two marriages which both ended in divorce, his failure as a father, his incessant womanizing, his compulsive gambling, and finally his fall into alcoholism that consumed his final years. Although Chase was not the pure evil of the canonical tale, you'll find little to like in his career both on and off the diamond. In the end he comes across as simply an utterly selfish man to whom satisfying his pleasures came first, last, and always. What you end up with is a pathetic picture epitomized in this vignette from the mid 1930s as told by Tucson realtor Roy Drachman, when as an alcoholic Chase was working odd jobs in Arizona in order to survive.

"There was this one time he came by the Opera House and asked if I could give him a dollar. I was getting kind of mad about always giving him money, so I said no, I had only fifteen cents. Anyway, we get talking about baseball for awhile. He kept up with what was going on in the major leagues. Never struck me as very embarrassed, either, about talking about McGraw or any of the people he had played with. But so much of it was really just a lead-in for asking the impressed kid for money. This day I mean, for instance, after all the talk about this one and that one, he suddenly looks at me and says, 'Well, I could use that fifteen cents you mentioned. Maybe buy a loaf of bread or something.'"

Some of the most interesting parts of the book detail Chase's post major league playing days in outlaw leagues of the southwest from the early 1920s until he returned to California to attend his father's funeral in 1934. Particularly entertaining is the account of how Chase won $2,000 from billiards champion Willie Hoppe in 1926 as told by Hoppe and how Chase, after winning Hoppe's ivory cue then gave it back with the advice "Don't get attached to people or things." There is also a somewhat convoluted story of Chase's involvement in the disappearance of evangelist Aimee Semple McPherson and a possible abortion/blackmail caper.

But perhaps most entertaining is the account of the 1925-26 seasons in Douglas, Arizona when Chase, at 42 years old, was playing first base for the local team. After the team got off to a poor start in 1925 Chase went to California to recruit more players and returned with Black Sox first baseman Chick Gandil and third baseman Buck Weaver. Weaver then recruited pitcher Lefty Williams for the 1926 campaign. Their third baseman, Cowboy Ruiz, tells this story of one memorable game in 1926 when Negro League star and Hall of Famer Bullet Joe Rogan was pitching for the Fort Branyard team against Douglas. After eight and a half innings Fort Branyard is winning 1-0 with two outs and nobody on and Buck Weaver is up.

"I hear Buck say to the Prince, 'I'm gonna lay one down, Hal. If it works, I'm gonna go on the first pitch.' Sure enough, Weaver drag bunts past the pitcher and beats the throw. Up walks Hal. There must have been half of Douglas at the game and they are hollering and screaming. They were hanging from the rafters. We were thinking, 'Too bad the Prince is old now. I bet in the old days he could hit this guy.' Well, anyway, Buck goes on strike one to the Prince. Rogan doesn't pay any mind to him because he's after Hal. I'm standing next to Lefty Williams, who usually was very quiet. But all of a sudden, Lefty starts screaming at Hal, 'Get his ass Prince! The son of a bitch got only pitch! He can't spin it to save his ass!' Rogan glares over at Lefty and then throws a curve to Hal, and the old boy hit that pitch, which was around his eyes, at least 100 feet over the left field fence.

I never saw anything like it. The crowd ran on the field. Buck was jumping in the air. Hal is circling the bases in this real slow trot. He has this kind of half-smile on his face, waving to the crowd in a real kind of slow wave...He was waving to the people."

Friday, September 17, 2004

Swing Away

When a batter swings at the first pitch does that reduce or enhance his odds of getting a hit? This week on SABR-L David Smith of Retrosheet posted some hard numbers to try and answer this question that originated from a college baseball coach. The data were further analyzed by Bruce Cowgill and break down as follows:

           BA     OA     SA     OPS
Swing      0.282  0.302  0.457  0.759
NoSwing    0.259  0.346  0.413  0.759
Total      0.266  0.334  0.426  0.760

This includes data from 2003 and 2004 up to September 14th comprising 345,905 plate appearances. It's interesting that the OPS numbers are almost identical between the two groups but, as you might expect, the OA goes up when a batter doesn't swing at the first pitch while the batting and slugging averages go down. In addition, hitters swing at the first pitch 27% of the time and take it 73% of the time.
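As a quick check on the table, here is a minimal Python sketch that recomputes OPS from the posted OA and SA columns and measures the trade-off between the two groups. The numbers are copied from the table above; the dictionary layout and the `ops` helper are just illustrative names of my own, not anything from the SABR-L post.

```python
# First-pitch splits as posted on SABR-L (2003 through September 14, 2004).
# BA = batting average, OA = on-base average, SA = slugging average.
splits = {
    "Swing":   {"BA": 0.282, "OA": 0.302, "SA": 0.457},
    "NoSwing": {"BA": 0.259, "OA": 0.346, "SA": 0.413},
    "Total":   {"BA": 0.266, "OA": 0.334, "SA": 0.426},
}


def ops(line):
    """OPS is simply on-base average plus slugging average."""
    return round(line["OA"] + line["SA"], 3)


for name, line in splits.items():
    print(f"{name:8s} OPS = {ops(line):.3f}")

# Taking the first pitch trades hits and power for walks.
oa_gain = round(splits["NoSwing"]["OA"] - splits["Swing"]["OA"], 3)
sa_loss = round(splits["Swing"]["SA"] - splits["NoSwing"]["SA"], 3)
print(f"OA gained by taking: {oa_gain:.3f}, SA lost by taking: {sa_loss:.3f}")
```

The on-base gain from taking and the slugging loss from taking come out to the same 44 points, which is why the two OPS figures match.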

In another post Cyril Morong found that for this season OPS correlated with run scoring to the tune of .973. Squaring this value gives .947, which means that 94.7% of the variation in team runs was explained by OPS. This is another indicator of why OPS is so valuable as a quick means of assessing a player's offensive value. Morong also reported that the authors of Curve Ball found a correlation of "just" .914, making the r-squared .835. Reasons that OPS may correlate better this year than in the past include luck, increased homeruns and therefore decreased stolen bases and one-run strategies, or an increase in strikeouts. Time will tell whether the higher correlation between run scoring and OPS is a trend or not.
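The squaring step is easy to reproduce. This small Python sketch takes the two correlations quoted above and converts each to the share of variation in team runs explained (r-squared) and the share left unexplained; the function name is my own.

```python
def r_squared(r):
    """Share of variance explained by a simple linear correlation r."""
    return r * r


this_season = r_squared(0.973)  # Morong's figure for this season
curve_ball = r_squared(0.914)   # figure attributed to the authors of Curve Ball

print(f"This season: {this_season:.3f} explained, {1 - this_season:.3f} unexplained")
print(f"Curve Ball:  {curve_ball:.3f} explained, {1 - curve_ball:.3f} unexplained")
```

Seen this way the gap is larger than the raw correlations suggest: roughly 5% of the variation is left unexplained this season versus about 16% in the Curve Ball figure.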

Thursday, September 16, 2004

More Unearned Runs

Here's an interesting post about the debate over whether unearned runs should be tracked. Many of the commenters make the typical points in this discussion.

Tuesday, September 14, 2004

More Royals Notes 9/14/04

Here are a few more notes as the Royals take on the Yankees tonight at Kauffman Stadium...

Royals who are hot include:

  • Joe Randa who is 77 for 231 (.333) with 15 doubles, a triple, and 5 homeruns since June 6
  • Angel Berroa who is 23 for 47 (.489) in September
  • The Royals are averaging 8.9 rpg in their last 9 games but only winning 5 of the 9
  • To avoid 100 losses the Royals will need to go 11-8 or better in their last 19 games
  • Even with 26 runs in his last start Zack Greinke's run support is still only 4.6 runs per game (92 runs in 20 starts). Greinke has 10 quality starts and the Royals are 9-11 in games he starts

The Numbers Game

Alan Schwarz in 270 tightly packed pages provides the first-ever history of baseball statistics from its beginnings with Henry Chadwick and the invention of the boxscore through Bill James and the information age and beyond.

From the book jacket: "Most baseball fans, players, and even team executives assume that the national pastime's infatuation with statistics is simply a by-product of the information age, a phenomenon that blossomed only after the arrival of Bill James and computers in the 1980s. They couldn't be more wrong."

Simply put, this is the story of the men who have created, tracked, innovated, and had the biggest impact on baseball statistics. To that end the book is more a series of short and sometimes connected biographies than it is about statistical analysis. Schwarz, a writer for Baseball America, tells his story using some statistics but tries his best to shield the reader from too many formulas.

The story starts, appropriately enough, with a cricket reporter in New York named Henry Chadwick, who got hooked on "base ball" and created the first box score or "abstract" in The New York Morning News on October 22, 1845. From that beginning Schwarz traces the history of the statistics and their analysis. The key players in his story are:

  • Chadwick of course
  • Ernie Lanigan, a reporter for The Sporting News and the New York Press
  • John Heydler, secretary and later president of the National League
  • F.C. Lane, editor of Baseball Magazine from 1912-1937
  • Al Munro Elias and his brother Walter, founders of the Elias Sports Bureau in the 1930s
  • Alan Roth, the stat guy behind Branch Rickey (1950s)
  • Sy Berger, who pioneered putting statistics on baseball cards (1950s)
  • Hal Richman, the inventor of Strat-O-Matic (1950s)
  • Earnshaw Cook, author of Percentage Baseball (1964)
  • George Lindsey, author of statistical studies using play-by-play data (late 1950s-early 1960s)
  • Harlan and Eldon Mills, authors of the technique called Player Win Averages (late 1960s-early 1970s)
  • David Neft, one of the key figures in the creation of The Baseball Encyclopedia (late 1960s)
  • Bill James (1970s-80s)
  • Seymour Siwoff, the head of Elias
  • Steve Mann and Dick Cramer who started STATS, Inc.
  • Dan Evans, who ran the Edge 1.000 program for the White Sox
  • John Dewan of Project Scoresheet and later of STATS, Inc.
  • Pete Palmer, inventor of Linear Weights and creator of Total Baseball
  • Carl Morris and Stephen Jay Gould, academics interested in baseball statistics
  • Voros McCracken and the development of DIPS
  • Eric Walker, author of The Sinister First Baseman and influencer of Sandy Alderson (early 1980s) on the importance of on base percentage
  • Craig Wright, one-time sabermetrician of the Texas Rangers
  • Ron Antinoja of Tendu (2000s)
  • Dave Smith of Retrosheet

Along the way Schwarz tells some interesting stories, particularly the massive amount of work that went into the development of The Baseball Encyclopedia and the ripple it caused through the game's most sacred numbers. Also interesting are the stories behind the stormy relationship of Seymour Siwoff and Bill James and the contentious history of Project Scoresheet and STATS, Inc.

I loved this excerpt from George Will's review of the book for the New York Times.

"Someday baseball statistics may be so sophisticated that they will be what James Joyce said his work was, something we should devote our lives to mastering. But if human beings have, as Schwarz believes, a ''compulsion to count, to quantify the world around them,'' then they are hard-wired to be baseball fans.

If so, that fact lifts a load of guilt off this Puritan nation's shoulders. All those hours -- years, actually -- we have spent watching games when we should have been reading ''Finnegans Wake''? Not our fault. Nature has made us do it. Which means that baseball is, as we chauvinists of the sport have long suspected, not merely the national pastime but the species' pastime. So there."

No sabermetric library should be without a copy.

Royals and the Bronx Bombers

The Royals, for the second time in a week, got hot with the bats and scored 10 runs in the 5th inning to beat the Yankees 17-8 last night. Of course, the Yankees provided some help. In that 5th inning there were:

  • 6 singles, 2 of them bunts that the Yankees might have made plays on
  • 1 homerun (John Buck)
  • 5 walks (one intentional)
  • 2 wild pitches
  • a balk

Not exactly a hitting show but one the Royals will take this season. Angel Berroa went 5 for 5 raising his average to .272 and Joe Randa went 2 for 5 inching him closer to .300 at .294. I'm glad I wasn't scoring the game last night. Chris George pitched the 8th and 9th and promptly gave up 5 runs. His line now reads 35.2 IP, 54 H, 23 BB, 12 SO, and an 8.07 ERA. Yikes. At least he hasn't given up a homerun this year.

Left-hander Brad Halsey, who has struggled to say the least (over a 7 ERA in limited innings), pitched for the Yankees, although Tanyon Sturtze was charged with 7 earned runs in the inning.

Zack Greinke goes for the Royals tonight against Mike Mussina in his first career appearance against the Yankees. I'll be scoring the game and it'll be interesting to watch how Zack approaches their lineup. Greinke's been an extreme fly-ball pitcher this season at .75 (231st in the league, Jimmy Gobble is 234th, two of the bottom five starters in the AL with Darrell May not far behind) and has given up 25 homeruns in 120.3 innings, or roughly one every 4.8 innings.

And this note from SABR-L: Texas leads all teams this year by using the DL 26 times. The Royals, with the most roster moves so far and through the use of 57 players (a team record, the ML record is 59 set by Cleveland and San Diego in 2002), have used the DL 22 times.

Sunday, September 12, 2004

Changing Intentional Walks

Great article here on The Athletic Reporter on the silliness of banning or changing the rule regarding the intentional walk (something Bill James has advocated). I particularly like his closing:

"If you want to talk about rule changes relating to Barry Bonds, one more in accordance with baseball history would be banning giant elbow pads the size of beach balls. Bonds is able to lean out over the plate to a laughable degree, and any pitcher who dares to throw one two inches off the inside of the plate (i.e., at him) is subject to reprimand and ejection. This notion that pitching inside is not allowed, and that giant elbow pads the size of beach balls are, has done as much if not more than steroids to undermine the basic competitive nature of baseball. When held up next to steroids and giants elbow pads the size of beach balls, intentional walks don't even register on the "Ways Baseball Is Worse Because of Barry Bonds" scale.

So stop this 'banning the intentional walk' nonsense. I don't want to hear anymore about it."

Couldn't agree more.

For those interested in the underlying question as to whether teams should pitch to Barry Bonds or not, a study was published in the November 2002 issue of By the Numbers, the newsletter of the SABR statistical committee. The study by Jerome P. Reiter concluded from 2001-2002 data that "there is little difference in opposing team's ability to prevent runs when walking Bonds versus when letting him hit. In fact, the data suggests that it may be better to pitch to Bonds than to walk him in certain game situations." For example, the only two situations where the Giants actually scored less frequently when Bonds was walked were with none on and one out and none on and two outs.

The authors of Curve Ball also did an analysis of this issue in chapter 9 using 2002 play-by-play data and found that you should walk Barry with a runner on second and two outs and with runners on second and third with two outs. In all other situations you should either definitely pitch to him or it was too close to call. It's interesting to note that 5 of the 7 situations in which it was too close to call were with 2 outs, the other 2 with 1 out. Never walk Barry with nobody out.

By the way, Barry hit his 699th homerun of his career tonight and now has 203 walks, 104 of which are intentional.

Gobble Again

After Jimmy Gobble threw his complete game against the Twins last weekend there was talk that perhaps things had changed for Gobble while down at Omaha. Evidence included his use of a cut fastball to coax 20 groundball outs versus only 3 fly ball outs, a major reversal from his usual profile as a huge fly-ball pitcher. In his start yesterday he reverted to form with only 3 groundball outs against 8 fly balls in his 5 innings, in which he gave up 5 earned runs and surrendered a homerun.

This afternoon Denny Bautista was hammered again, this time giving up 11 hits and 6 runs in 5 and 2/3 innings. His ERA is now 11.85, not exactly what the Royals had hoped for from a guy they project to be in the rotation next season.

The Cubs are down 7-1 in the 8th. Glendon Rusch was anything but sharp and Moises Alou contributed with an early throwing error. The Cubs managed only 1 walk against Burnett, a guy not known for his control.

9 Pitches and Out

From Al Yellon, a fellow SABR member, I learned that LaTroy Hawkins became the thirty-seventh pitcher to strike out the side on nine pitches when he turned the trick yesterday in the 9th inning of the Cubs 5-2 victory over the Marlins. He is the third Cub to accomplish this feat, after Milt Pappas on 9/24/71 and Bruce Sutter on 9/8/77. He is the first Cub to record a save while doing so. Also from Al...

"I checked to see if he might have become the first pitcher on any team to do so, and I found the following information. Of the now 37 (including Hawkins) to accomplish this feat, seven (including Hawkins) did it in the 9th inning (or later), thus possibly qualifying for a save.
Here are the other six:

  • Hod Eller, Cincinnati, 8/21/17, 9th inning. No box available from Retrosheet. Eller did have one save in 1917 and his team won that day, so it's unclear whether he recorded a save.
  • Hollis Thurston, Chicago AL, 8/22/23, 12th inning. His team lost.
  • Jim Bunning, Detroit, 8/2/59, 9th inning. His team lost.
  • Trevor Wilson, San Francisco, 6/7/92, 9th inning. He was the starter and winner.
  • Mel Rojas, Montreal, 5/11/94. He pitched the 8th and the 9th, and struck out the side on nine pitches in the 9th, and did record a save.
  • Mike Magnante, Houston, 8/22/97, 9th inning. His team lost.

So, LaTroy Hawkins is the second confirmed pitcher to strike out the side on nine pitched balls and record a save while doing so."

The win moved the Cubs (76-63) into a tie in the wild card race with San Francisco (78-65) although the Cubs have lost 2 fewer games and so have the advantage. The loss drops the Marlins back 2.5 games. Today the Cubs are going with Glendon Rusch since Matt Clement is struggling with shoulder and neck stiffness. Rusch is 4-1 as a starter with a 3.72 ERA. The Marlins will go with A.J. Burnett, 6-6/3.93. Nomar Garciaparra is day to day as well. On the bright side Sammy Sosa was dropped to the 6th spot in the order and after striking out his first two at bats looked good his last two times up, getting the key single in the 8th inning. It was the first time Sammy had hit 6th in 10 years (6/10/1994). Here was the Cubs lineup courtesy of Retrosheet on that day.

Tuffy Rhodes CF
Ryne Sandberg 2B
Mark Grace 1B
Derrick May LF
Rick Wilkins C
Sammy Sosa RF
Steve Buechele 3B
Shawon Dunston SS
Steve Trachsel P

Only Sosa and Trachsel are still in the majors. The Cubs lost 2-1 to Kevin Gross and the Dodgers.
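Back to the wild card math for a moment: the "tie" between the Cubs (76-63) and Giants (78-65) is a tie in games behind, not in winning percentage. Here is a small Python sketch of the standard games-behind arithmetic using those two records; the function names are my own.

```python
def win_pct(wins, losses):
    """Winning percentage as a fraction of games decided."""
    return wins / (wins + losses)


def games_behind(leader, trailer):
    """Standard games-behind formula: half the sum of the win and loss gaps."""
    leader_w, leader_l = leader
    trailer_w, trailer_l = trailer
    return ((leader_w - trailer_w) + (trailer_l - leader_l)) / 2


cubs = (76, 63)
giants = (78, 65)

print(f"Cubs   {win_pct(*cubs):.4f}")
print(f"Giants {win_pct(*giants):.4f}")
print(f"Games behind: {games_behind(giants, cubs)}")
```

The Giants' two extra wins are cancelled by their two extra losses, so the games-behind figure is zero, while the Cubs hold the slightly better percentage because they have played fewer games.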

Saturday, September 11, 2004

The Quiet Revolution?

With the publication of Moneyball and The Numbers Game sabermetric knowledge (see this post for some of the core conclusions) and research methods have begun to spread to a broader audience than the relatively small number of baseball analysts and stat geeks that have long championed them. Nate Silver in his column on Baseball Prospectus discusses the present and future of sabermetrics and notes the increasing visibility of sabermetrics in front offices and the media. However, Silver also provides the following return to reality:

“Does that mean that the battle has been won? Hardly; it is likely to take a while, perhaps a whole generation, before analysis crosses the chasm between the early adopters and the mainstream. Baseball executives, with some notable exceptions, are older men with backgrounds in scouting and player development. Beat writers, by and large, are a worn lot and neither particularly well-versed in analysis nor particularly interested in learning about it. While the Internet and other forms of new media provide fans with greater discretion in just how they take their baseball, there is also an increasing tendency within the media (this means you, ESPN) to take a Paparazzi-like approach toward their sports coverage…It will be a quiet revolution, and those tend to be slower than the bloody ones.”

Silver goes on to predict that sabermetrics will emerge into an “analytical post-modernism” characterized by the acknowledgment that some questions are out of reach and the realization of the limits of analysis. He hopes that a second trend in post-modernism, what he calls “navel-gazing” or the tendency for individuals to be more concerned with their own position and reputation in the movement can be avoided. He also warns that:

“Movements, whether social, political or intellectual, tend to be unified when they have a lot of work to do and when there is a lot to accomplish, and fractionalized when there is not. If sabermetric thinkers come to believe prematurely that their mission has been accomplished, the infighting is likely to increase, and the movement could set its progress back.”

To the question of what remains, Silver points readers to Keith Woolner’s Baseball Prospectus 2000 article “Baseball’s Hilbert Problems” where, like the mathematician David Hilbert in 1900, he lays out 23 major areas of study for the coming generations. Even so, Silver concludes that:

“It is also probably true that the pace of discovery within sabermetric circles will slow as more and more data is analyzed and more and more conclusions have been proclaimed. Baseball, while a wonderfully complex game, is nevertheless a closed system, and the returns on further research efforts are likely to diminish.”

Great stuff.

I agree with Silver that sabermetric knowledge is making its way into the front offices and media outlets not as the result of the success of risk-taking early adopters such as Sandy Alderson (who discovered the importance of on base percentage through the work of Eric Walker in The Sinister First Baseman) but primarily as a result of the old guard moving on and the younger generation (Epstein, DePodesta, et al.), raised on The Baseball Abstract and The Hidden Game of Baseball, getting an opportunity to apply their theories. This view is akin to how scientific revolutions finally take hold per The Structure of Scientific Revolutions by Thomas Kuhn, where Kuhn quotes Max Planck's observation that:

"a new scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die, and a new generation grows up that is familiar with it."

And what this will do in the long run is kill the movement in its current incarnation as a collection of outsiders making the very term "sabermetrics" passé, as it becomes the new orthodoxy.

As far as Silver’s view that sabermetrics in the near term (at least I think he meant the near term) may be emerging into an analytical post-modernism, I’m not sure I agree with that characterization. I would hesitate to call it post-modernism because to me PM’s main tenet is relativism, something sabermetrics certainly eschews with its belief that there are definitive answers to questions if given the correct data. I do, however, agree that as sabermetrics matures it will (whether it expects to or not) collide with areas that may be difficult to analyze using a reductionist approach. Areas that, as Stephen Jay Gould said of human attributes such as religion and “moral sense” are the product of emergence (the whole is not the sum of its parts).

Silver’s final point above is a sentiment that was expressed in another analyst’s blog I recently read, whose author lamented that the pace of sabermetric breakthroughs seems to have slowed to a crawl in recent years. One commenter to the post shared that sabermetrics, because it was finally moving beyond its small circle into a wider world, was emerging into a period of consolidation, explication, and refinement of its main research conclusions over the past quarter century. I think both Silver’s and the commenter’s points are correct – to an extent.

Baseball largely is what it is (its inherent structure - 27 outs, 9 fielders, 90' and 60' 6" - will remain the same) and so analysis of the game can only proceed along a finite number of lines forcing diminishing returns as research progresses. Much of the low hanging fruit has probably been picked. For example, the conclusion that the avoidance of outs and therefore OBP is important and the relative unimportance of batting average is not likely to be reversed.

Coupled with sabermetrics’ increased visibility this means that job one of the community is now to preach the message to the uninitiated and solidify core conclusions to help sabermetrics attain the status of the new orthodoxy. To illustrate, consider just one example of the lack of awareness in the larger baseball community - the basic conclusions of play-by-play data preached in The Hidden Game of Baseball and encapsulated in a tool like my MLB Pocket Manager calculator. The basic probabilities and the magnitude of the trade-offs involved are currently beyond the view of the average professional manager or ballplayer, let alone fan.

At the same time baseball, like all things, evolves. In that sense I don’t think that baseball is a totally closed system and so there will always be new questions to answer and new angles to research (more refined and perhaps of smaller scope but still there nonetheless). Those new angles will present themselves because styles of play (and rules to a lesser extent) and strategies change over time. And those new angles will not only encompass new ways of looking at existing data such as DIPS and play-by-play but actually new data. One of the most exciting parts of The Numbers Game is its look at how new categories of information related to both offense and defense will be captured in the future. And you can bet that the sabermetric community will be there to analyze it.

What this discussion brings me to realize is that the sabermetric community needs to do two things going forward:

1) Solidify its core conclusions while making them presentable to a wider audience
2) Strive to address issues such as those raised by Keith Woolner in addition to those raised by new data collection techniques

In order to do both of these I think what the current community needs is a clearinghouse of sorts where studies, research, techniques, and discussions can take place in an open environment. Being a software developer my thoughts immediately turn to a community web site. So my idea is to create a site analogous to the 123ASPX site created by a friend of mine for collecting resources on Microsoft ASP.NET Development. To that end I’m currently working on an initial design for such a site that includes:

  • Links to sabermetric studies on the web
  • Links to research sites where researchers can find statistics
  • Links to sabermetrically related sites
  • Links to sabermetrics in the news
  • Original articles
  • A glossary of statistical formulas
  • Book reviews
  • Links to sabermetrically related blogs and possible inclusion of the RSS feeds
  • Discussion boards
  • A schedule of events where sabermetrics is discussed

The site would be open to contributions from registered users and would be moderated. I’d love to hear what you think, including a name for the site.

Friday, September 10, 2004

All or Nothing

The Cubs are about to split a double-header with the Marlins. They were shut out for the second game in a row in the opener, 7-0 by Carl Pavano. In the second game they're winning 11-2 in the 9th. The Cubs lineup starting the second game was not very encouraging for a team that had just been shut out for 18 innings:

Patterson CF
Perez (yes, Neifi) SS
Alou LF
Ramirez 3B
Lee 1B
Grudzielanek 2B
Macias RF
Bako C

Ouch. Why not Walker? Why not Dubois? Why not Sosa? Why not Garciaparra? I know Sosa and Nomar are beat up but with only 25 games to go? So what happens? With homers by Alou and Ramirez and a 4 for 4 from Perez the Cubs rolled against a rookie making his major league debut for the Marlins. I can't figure it out.

Although behind in the wild card race now, the Cubs are still tied with the Marlins where it counts - the fewest losses at 63. They're looking to take the final two games this weekend, although it appears that Matt Clement will not start on Sunday against A.J. Burnett.

On another note, the Cubs are now carrying Tom Goodwin, Calvin Murray, Neifi Perez, and Paul Bako. A weaker collection of bench hitters I've not seen.

Thursday, September 09, 2004

Is Batting Average Luck?

One of the core sabermetric conclusions of the last 25 years is that batting average, officially adopted in 1876, is overrated. To reinforce that conclusion, today I read Jim Albert’s study, A Batting Average: Does it Represent Ability or Luck? In the study he tries to determine which statistical categories can most be attributed to a hitter’s true ability and which are more the result of luck. He did this using both an empirical analysis, computing the season-to-season correlation in a number of areas, and a random effects model.

The statistics he measured included:

  • SO rate = SO / PA
  • IP HR rate = HR / Balls put in play (BIP)
  • BB rate = BB / PA
  • OBP
  • IP AVG = H / BIP
  • AVG
  • IP 2+3 AVG = (2B+3B) / BIP
  • IP S rate = Singles / BIP

What he found was that when ranked in order from more ability to more luck the statistics fell out into the order shown above. In other words a player’s strikeout, homerun, and walk rates are all more predictable and thus better indications of a hitter’s true ability than is batting average, which has a larger component of luck. It should be noted that his study looked at players who batted more than 100 times in the 2002 and 2003 seasons.
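To make the empirical half of that approach concrete, here is a minimal sketch of a season-to-season correlation for one of the rates listed above (SO rate = SO / PA). The four hitters and their totals are made up purely for illustration, and the helper names are mine, not Albert's; his study used real 2002 and 2003 data and a random effects model that this sketch does not attempt.

```python
from math import sqrt

def rate(numerator, denominator):
    """Return a simple rate such as SO / PA, guarding against zero PA."""
    return numerator / denominator if denominator else 0.0

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical (SO, PA) totals for the same four hitters in two seasons.
season_1 = [(120, 600), (60, 550), (90, 500), (40, 620)]
season_2 = [(110, 580), (70, 560), (95, 520), (45, 600)]

so_rate_1 = [rate(so, pa) for so, pa in season_1]
so_rate_2 = [rate(so, pa) for so, pa in season_2]

# A value near 1 means the rate carries over from year to year (ability);
# a value near 0 means it mostly does not (luck).
r = pearson(so_rate_1, so_rate_2)
print(f"season-to-season correlation of SO rate: {r:.3f}")
```

Running the same calculation for each statistic and sorting by the result is one way to get an ability-to-luck ordering like the one above.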

One of the things that immediately comes to mind is that perhaps the range of batting averages is smaller than the range of walk or strikeout rates because players who reach the majors have been pre-selected to fall within a restricted range. In other words, batting average may only appear as if it is less controlled by ability since the range of averages is smaller. I don't pretend to know enough statistics in order to answer that question.

The other interesting part of his study is the graphs he showed of these rates for Tony Gwynn and Barry Bonds over their careers. What is obvious from the graphs is that Barry Bonds’ career was following a fairly typical trajectory where he peaked around age 27-28 (1994-95) and began to decline until 1999, when suddenly his AVG increased, strikeout rate declined, walk rate increased, HR / BIP rate increased, singles / BIP rate increased, and doubles and triples / BIP increased dramatically, and they continue on those paths. These serve to highlight how extraordinary this last phase of Bonds’ career has been. Here's a quick graph I created that shows the trends.

Run Happy

The strange season for the Royals continued today in Detroit. In the first game of the twin bill they beat the Tigers 26-5.

         1  2  3  4  5  6  7  8  9    R   H  E
Royals   4  2 11  2  0  5  0  1  1   26  26  0
Tigers   0  2  0  1  0  0  1  0  1    5  10  2

In the process the following records were set:

  • Most runs in a game by the Royals
  • Most runs in a doubleheader by the Royals
  • Tied AL record with 13 straight batters reaching base in the 3rd inning
  • Joe Randa ties AL record with 6 hits, raising his season average to .296
  • Joe Randa ties ML record with 6 runs

Of course, they were then shut out in the second game 8-0 by Jeremy Bonderman. To give you a feel for how unusual any kind of offensive outburst is, from June 30th through July 10th the Royals scored a total of 23 runs in a span of 10 games.

Wednesday, September 08, 2004

Neyer and Amazon

Here's an interesting story from Rob Neyer on and the complicated world of Amazon reviews. I think he'd be surprised at how many reviews out there are from relatives, friends, etc. Having written five computer books, I can attest that sometimes the bad reviews are unrelated to the content of the book and instead reflect a bias against the company you work for or the technology itself. Amazon has in the past taken down some of those reviews when asked but has left others up without explanation.

In Neyer's case it sounds like he was trying to do the right thing. Now he knows better.

Cubs Complaints and Royals Ramblings

The Cubs last week made two additions:

  • Mike DiFelice. The quintessential backup catcher who played with the Royals last season hitting .254/.397/.299. I like this move in general since it does give some additional depth in case of injury through September. In particular it would be nice to have gotten a catcher who could actually play.
  • Ben Grieve. I love this pickup. Sammy Sosa is ailing and Grieve provides a much needed left handed bat off the bench. He was hitting decently in Milwaukee and still shows some ability to control the strike zone (39 walks in 273 plate appearances) while hitting for some power. I noticed Jose Macias in right field last night (and batting second, grrrr) so I assume Grieve is still a little banged up from his collision with the right field wall on Monday afternoon.

In other Cubs notes, Matt Clement once again left last night's loss in the early innings, not a good sign. In the game last night, with the Cubs trailing 2-0, Corey Patterson led off the 3rd with a single and promptly stole second base. Baker then had Macias bunt him to third. Nomar Garciaparra walked, Moises Alou struck out, and Aramis Ramirez flew out to end the inning, leaving the Cubs behind by 2. In that situation (before the sacrifice) the run potential was 1.189 with a 63.2% chance of scoring. Even just to increase the odds of scoring a single run, Macias's sacrifice attempt would have had to succeed 93.87% of the time. He was indeed successful, increasing the odds of scoring a single run minutely to 66.2% but cutting the legs out of a big inning by decreasing the run potential to under a run at .983, down 17%. Given that the Cubs are a one-dimensional team built on power, their best strategy is always to wait and see if someone will hit a homerun. Dusty's insistence on playing small-ball seems uniquely ill-suited to this team. The Cubs lost in extra innings 7-6.
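The arithmetic behind those sacrifice numbers can be sketched in a few lines. The run potential and scoring odds are the ones quoted above; the function names are mine, and the odds of scoring after a failed bunt are not given here, so the 0.40 below is a purely hypothetical placeholder and will not reproduce the 93.87% figure.

```python
def percent_change(before, after):
    """Percentage change from before to after (negative means a drop)."""
    return (after - before) / before * 100.0

def break_even_success_rate(p_before, p_success, p_failure):
    """Success rate at which attempting the bunt leaves the odds of
    scoring unchanged: s * p_success + (1 - s) * p_failure = p_before."""
    return (p_before - p_failure) / (p_success - p_failure)

# Figures quoted in the post.
runs_before, runs_after = 1.189, 0.983    # run potential before/after the bunt
score_before, score_after = 0.632, 0.662  # chance of scoring at least one run

drop = percent_change(runs_before, runs_after)
print(f"run potential change: {drop:.1f}%")

# ASSUMPTION: odds of scoring after a failed sacrifice. Not from the post.
HYPOTHETICAL_P_FAILURE = 0.40
needed = break_even_success_rate(score_before, score_after, HYPOTHETICAL_P_FAILURE)
print(f"break-even success rate under that assumption: {needed:.1%}")
```

The closer the failure-state odds fall below the pre-bunt odds, the higher the break-even rate climbs, which is why a gain of only three percentage points on success demands a nearly automatic bunt.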

Starting this weekend the Royals have gone to a six-man rotation for the remainder of the year that will include Zack Greinke, Jimmy Gobble, Brian Anderson, Darrell May, Mike Wood and Denny Bautista. Brian Anderson and Mike Wood have really struggled in their last several starts. I still think that the Royals should go with a four-man rotation next year of (assuming they don't pick up a starter) Greinke, Bautista, May and Wood or Gobble with Anderson as the long reliever/spot starter.

Gobble threw a complete game over the weekend against the Twins, throwing 123 pitches. Although Rany thinks, and I agree, that this is not generally a good idea for a 23-year-old, I think getting the complete game adds a bit to his perceived value, which could be helpful since the Royals should be looking to trade him. He did get 20 ground ball outs in the game versus 3 fly balls using a newly developed cut fastball. He's normally an extreme fly-ball pitcher (.76 GO/AO ratio), so some are speculating that this may signal that he's a different pitcher. I hope so but I'm not holding my breath.

There's also an interesting discussion on Rob Neyer's site between Bill James and Rob about Zack Greinke. The short of it is that Bill doesn't "see greatness anywhere there" while Rob is optimistic and thinks that perhaps there is. After watching Greinke throw a number of times I think he'll definitely get better, but what he doesn't have is a pitch that hitters just flat out can't hit. I think that's what James was getting at. Greinke doesn't have a moving fastball, a disappearing slider, or an unhittable changeup. He has to locate his fairly routine repertoire - something he does very well, especially at 20 - in order to be successful. He tops out at 95 without much movement and is usually around 91-92 with his fastball. His most interesting pitch is his ultra slow curve ball thrown in the 60s and 70s, but even that is more of a change of pace and not of the Bert Blyleven/Barry Zito variety.

Other notes:

  • After the two homeruns by Dee Brown last night I'm really afraid that the Royals will again start to think he'll be a player. He won't.
  • Listening to the rain delay coverage tonight it seems Freddy Patek is certainly down on Abraham Nunez and doesn't think he has what it takes to be the regular right fielder next year. He hasn't shown much yet (.265/.394/.317) but of the outfielders that are on the team he's the one I'd give the at bats to the rest of the way and then see if he can earn the job in spring training.

Saturday, September 04, 2004

Homerun Distribution Amended

I recently posted some 1992 Retrosheet data but unfortunately one of the tables was incorrect. The distribution of homeruns by field was skewed since I forgot that homeruns to left center are scored with a "78" while those to right center are "89". My original numbers then made it seem that right handed batters pulled many more balls than left handed batters. Here are the amended numbers:

        L     Pct   R     Pct
LF      40    4%    1076  56%
LF-CF   41    4%    498   26%
CF      117   10%   191   10%
CF-RF   331   30%   84    4%
RF      592   53%   60    3%
Total   1121        1909
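For anyone who wants to redo the tally, here is a minimal sketch of how the percentages in the table above fall out of the raw counts. The "78" and "89" codes are the ones mentioned above; treating "7", "8", and "9" as LF, CF, and RF is my assumption based on the standard scorekeeping position numbers, and the dictionary layout is just for illustration.

```python
# Location codes mapped to fields. "78" and "89" are from the post;
# "7", "8", "9" are assumed from standard scorekeeping position numbers.
FIELD_BY_CODE = {"7": "LF", "78": "LF-CF", "8": "CF", "89": "CF-RF", "9": "RF"}

# Homerun counts by batter hand and location code, copied from the table.
counts = {
    "L": {"7": 40, "78": 41, "8": 117, "89": 331, "9": 592},
    "R": {"7": 1076, "78": 498, "8": 191, "89": 84, "9": 60},
}

def distribution(code_counts):
    """Return each field's share of the total as a whole-number percentage."""
    total = sum(code_counts.values())
    return {FIELD_BY_CODE[code]: round(100 * n / total)
            for code, n in code_counts.items()}

for hand, code_counts in counts.items():
    print(hand, sum(code_counts.values()), distribution(code_counts))
```

Because each share is rounded separately, a column of percentages will not always add up to exactly 100.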

In addition I found the following percentages of line drives versus fly balls.

Fly balls: 2528 84%
Line Drives: 495 16%
Ground Balls: 2 (wrong codes I assume)
Uncoded: 13

And while we're on corrections...I now understand that researcher David Stephan has discovered that Dale Mitchell actually had 59 hits in 154 AB in August of 1948, eclipsing the mark of 58 by Jeff Heath in August of 1938.

Thursday, September 02, 2004

SABR Comes Through

I found this on the SABR list regarding Ichiro Suzuki and his 56-hit month of August:

A researcher named David Stephan, who is affiliated with Retrosheet, is believed to be the leading authority on 50-hit months. He claims that:

* Ichiro is the first player ever with three 50-hit months in a season.

* 7th time that a player has had consecutive 50-hit months: Joe Medwick, 1936; Lou Gehrig, 1930; Bill Terry, 1929 & 1930; Rogers Hornsby, 1924; Cobb, 1917

* The record for most hits in a month is 67, done twice by Ty Cobb and once by Tris Speaker. 13 players have had 60+ hits in a month.

* The record for most 50-hit months in a career is 10, by George Sisler. 2nd-most is 7 by Heinie Manush.

* There have been 215 50-hit months all-time: never in April, 11 in May, 21 in June, 92 in July, 68 in August, 23 in September.

* Ichiro's 56-hit month was the most in the bigs since Jeff Heath had 58 in August 1938.

Trent McCotter also contributed to this research. You can always count on a SABR member to get to the bottom of an issue.

Wednesday, September 01, 2004

Royals Nightly

Was just alerted to this blog by Ron Hostetter. Some nice daily tracking of the Royals season from...I especially liked his point yesterday on how the poor performance in every aspect of the game is the central feature of this Royals team.