So I was driving to the ballpark yesterday to score the Rockies' 4-3 walk-off victory over the Diamondbacks when I heard an exchange on ESPN between the host of the program and ESPN's Peter Pastorelli. As they were running down the chances of various teams making the playoffs, Pastorelli noted that it would be tough for the Red Sox, not only because their pitching is in disarray but also because in losing Jason Varitek to a cartilage tear in his left knee they probably lost (paraphrasing) "a half run per game" because of his handling of the staff.
The idea that there is some skill in game calling that can depress ERAs has been around for a long time as one of the unproven assumptions of baseball. Several years ago Keith Woolner at BP did some research on the topic and wrote about it in The 1999 Baseball Prospectus. The end result of his research (confirmed by others) was that:
"There is no statistical evidence for a large game-calling ability, but that doesn’t preclude a small ability. For example, a genuine game-calling ability that reduces a pitcher’s ERA by 0.01, resulting in a savings of about 1.6 runs per year for the entire team, could be masked by the statistical variance in the sample size we have to work with. Players would need to play thousands more games than they actually do to have enough data to successfully detect such a skill statistically."
So while there may be some skill involved, natural variation overwhelms whatever signal that skill gives off, which means that for all intents and purposes you may as well make decisions as if there were no skill operating at all.
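As a quick sanity check on the scale involved, here is the arithmetic behind Woolner's 1.6 runs, set next to the "half run per game" claim. This is only a sketch: the 162-game, 9-innings-per-game season is my round-number assumption, the function name is made up for illustration, and stretching the half-run claim over a full season is my framing, not anything said on the air.

```python
# Assumed round numbers: a 162-game season with 9 innings pitched per game.
GAMES_PER_SEASON = 162
INNINGS_PER_GAME = 9

def runs_saved_per_season(era_reduction,
                          games=GAMES_PER_SEASON,
                          innings_per_game=INNINGS_PER_GAME):
    """Earned runs saved over a season by lowering staff ERA by era_reduction."""
    team_innings = games * innings_per_game
    # ERA is earned runs per 9 innings, so scale by innings / 9.
    return era_reduction * team_innings / 9

# Woolner's example: a 0.01 reduction in ERA.
small_skill = runs_saved_per_season(0.01)

# The radio claim: half a run per game, over the same assumed full season.
half_run_claim = 0.5 * GAMES_PER_SEASON

print(round(small_skill, 2))  # 1.62
print(half_run_claim)         # 81.0
```

Under those assumptions the claimed effect is about fifty times the size of the small skill that Woolner says the data could be hiding.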
The reason the myth persists IMHO is that Catcher's ERA (CERA) can easily be found (for example, in each team's game notes, published each day for the media and made available in the press box) and the inherent variation does indeed sometimes show that the staff performed better under catcher A than under catcher B. This difference is misinterpreted by writers and even teams as meaningful when in fact there is no evidence that you would expect the difference to persist in another equally large sample. If you always saw little variation in CERA among a team's backstops, it wouldn't have the allure it does. But that variation is a mirage.
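To get a feel for how large that mirage can be, here is a minimal simulation sketch in which two catchers have exactly the same (zero) game-calling effect and the only difference between them is luck. Everything in it is an assumption for illustration: the per-inning run distribution is invented rather than fitted to real data, the 1000/450 innings split between starter and backup is hypothetical, and the function names are mine.

```python
import random

# Hypothetical per-inning run distribution (an illustrative assumption, not real data).
RUNS = [0, 1, 2, 3, 4]
WEIGHTS = [0.73, 0.15, 0.07, 0.03, 0.02]  # mean 0.46 runs per inning, about a 4.14 ERA

def simulated_cera(innings, rng):
    """ERA over `innings` innings caught by a catcher with no game-calling effect."""
    runs = sum(rng.choices(RUNS, weights=WEIGHTS, k=innings))
    return 9 * runs / innings

def cera_gaps(seasons=1000, starter_innings=1000, backup_innings=450, seed=0):
    """Absolute CERA gap between two identically skilled catchers, one value per season."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    return [
        abs(simulated_cera(starter_innings, rng) - simulated_cera(backup_innings, rng))
        for _ in range(seasons)
    ]

gaps = sorted(cera_gaps())
print("median gap:", round(gaps[len(gaps) // 2], 2))
print("share of seasons with gap > 0.25:", sum(g > 0.25 for g in gaps) / len(gaps))
```

With these made-up inputs the gap between the two catchers is typically a few tenths of a run even though, by construction, neither one has any ability to lower ERA. That is the kind of difference that shows up in the game notes and gets read as skill.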
This dovetails nicely with what Rany Jazayerli had to say about differences in hitting over small sample sizes, quoted yesterday.