Question of the day: Is last season's batting average a good predictor of next year's?
Answer: Intuitively, the answer to this question would seem to be yes. After all, teams tend to pay players based on last year's performance, and what is more indicative than batting average?
Well, to answer the question I took a look at the 262 players who had more than 200 plate appearances in both 2003 and 2004. I then calculated batting average, slugging percentage, on base percentage, and OPS (on base plus slugging) and ran a quick regression using Excel. The correlation coefficients were as follows:
In other words, batting average varies from season to season more than either slugging percentage or on base percentage. As a result, when you consider how a player might perform in 2005, it would be better to look at his 2004 slugging percentage than his 2004 batting average.
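For readers who would rather not use Excel, here is a minimal sketch of the same kind of year-to-year calculation in Python. It computes a plain Pearson correlation coefficient, which is my assumption about what the spreadsheet regression reported. The function name pearson_r and the five sample values are made up for illustration and are not the 2003-2004 data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up toy values for five hypothetical players, NOT real 2003-2004 data.
avg_2003 = [0.250, 0.300, 0.275, 0.310, 0.240]
avg_2004 = [0.265, 0.280, 0.290, 0.295, 0.255]

r = pearson_r(avg_2003, avg_2004)
print(f"Year-to-year correlation for the toy sample: {r:.3f}")
```

Running the same function on each statistic in turn (batting average, slugging percentage, on base percentage, OPS) for the players who qualify in both seasons would give one coefficient per statistic to compare.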
But why is this the case? Why is there more variation in batting average than in slugging percentage or on base percentage? One reasonable interpretation is that there is more variation because there is more luck involved in batting average than in either of the other two.
First, consider batting average versus slugging percentage. If you think about the games that you watch, this actually makes perfect sense. After all, slugging percentage is calculated by using total bases rather than simply hits. Therefore singles comprise one-fourth of the components used to compute batting average, whereas they comprise only one-tenth of the components used to compute slugging percentage. And when you watch a ballgame, which kind of hit is more likely to be the result of a broken-bat flare, a topper, a lucky bounce, a "seeing-eye" grounder, or a Texas-leaguer? A single, of course. Doubles and homeruns, on the other hand, are less likely to occur as the result of a lucky bounce or fortunate placement and more often occur because the batter put a good swing on the ball and hit it solidly on a line or as a deep fly ball.
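To make the one-fourth versus one-tenth point concrete, here is a small sketch of the two formulas. The weights follow the standard definitions (every hit counts once in batting average, while total bases count 1, 2, 3, and 4). The function names and the season line are hypothetical and chosen only for illustration.

```python
# Each hit type counts once toward batting average...
HIT_WEIGHTS = {"single": 1, "double": 1, "triple": 1, "homerun": 1}
# ...but counts by bases toward slugging percentage.
BASE_WEIGHTS = {"single": 1, "double": 2, "triple": 3, "homerun": 4}


def single_weight_share(weights):
    """Fraction of the total weight that belongs to singles."""
    return weights["single"] / sum(weights.values())


def batting_average(hits_by_type, at_bats):
    """Hits divided by at bats."""
    return sum(hits_by_type[k] * HIT_WEIGHTS[k] for k in HIT_WEIGHTS) / at_bats


def slugging_percentage(hits_by_type, at_bats):
    """Total bases divided by at bats."""
    return sum(hits_by_type[k] * BASE_WEIGHTS[k] for k in BASE_WEIGHTS) / at_bats


# Hypothetical season line, NOT a real player.
season = {"single": 100, "double": 30, "triple": 3, "homerun": 20}
at_bats = 500

avg = batting_average(season, at_bats)
slg = slugging_percentage(season, at_bats)

print(f"Single weight in AVG: {single_weight_share(HIT_WEIGHTS):.2f}")
print(f"Single weight in SLG: {single_weight_share(BASE_WEIGHTS):.2f}")
print(f"AVG {avg:.3f}  SLG {slg:.3f}")
```

For this made-up hitter, singles account for 100 of 153 hits but only 100 of 249 total bases, so a few lucky or unlucky singles move a larger fraction of his batting average than of his slugging percentage.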
This is backed up by looking at play-by-play data for 2004 as summarized in the following tables.
Type      BIP      Pct of BIP   Outs     Out Pct
Ground    60234    42%          45433    75%
Line      25654    18%           6712    26%
Fly       47693    33%          37186    78%
Pop       11010     8%          10784    98%
Type      Single   Double   Triple   Homerun
Ground    45%      14%      42%       0%
Line      46%      50%      28%      14%
Fly        8%      35%      30%      86%
Pop        1%       0%       0%       0%
When you summarize this you can see that 69% of the balls put into play were converted into outs.
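The 69% figure and the per-type out rates can be rechecked directly from the counts in the first table. The following is a minimal sketch that does so. The counts are copied from the table, while the names BATTED_BALLS and summarize are my own.

```python
# Counts copied from the 2004 play-by-play table: (balls in play, outs).
BATTED_BALLS = {
    "Ground": (60234, 45433),
    "Line": (25654, 6712),
    "Fly": (47693, 37186),
    "Pop": (11010, 10784),
}


def summarize(table):
    """Return per-type shares and out rates plus the overall out rate."""
    total_bip = sum(bip for bip, _ in table.values())
    total_outs = sum(outs for _, outs in table.values())
    rows = {
        kind: {"share_of_bip": bip / total_bip, "out_rate": outs / bip}
        for kind, (bip, outs) in table.items()
    }
    return rows, total_outs / total_bip


rows, overall_out_rate = summarize(BATTED_BALLS)

for kind, stats in rows.items():
    print(f"{kind:<8}{stats['share_of_bip']:>6.0%}{stats['out_rate']:>6.0%}")
print(f"Overall out rate: {overall_out_rate:.0%}")
```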
You should also notice that only 18% of the balls put in play were line drives and that they were converted into outs only about a quarter of the time. On the other hand, balls hit on the ground were converted into outs fully 75% of the time. When you look at singles as contrasted with doubles, you see that 45% of singles were on the ground while 46% were line drives. For doubles the percentages were 14% and 50% respectively. What this means is that 21.5% of ground balls turn into singles while only 2% of ground balls turn into doubles. Since ground balls are converted into outs so much more often than line drives, it follows that there is a much larger element of luck in determining which ground balls end up getting through the infield for singles and which are turned into outs. In fact, since 45% of the singles are ground balls and 1% are pop-ups, one might conclude that almost half the singles hit are predominantly luck.
When looking at doubles we find the opposite. Only 14% are the result of grounders, and so a much larger percentage can be attributed to solidly hit balls that are more likely to reflect a batter's skill. Of course the same argument can be made for homeruns. The larger amount of luck in the accumulation of singles means larger variation in batting average than in slugging percentage, and thus a weaker correlation from year to year for a specific batter.
It should be noted that with triples the argument doesn't really hold. The largest share of triples (42%) are technically grounders hit down the line, with a smaller number (30%) being fly balls in the gap or in the corners. Still, 28% are line drives. And because triples are also a function of a batter's running speed, it's not really possible to conclude that triples are more a reflection of skill than luck. Since triples make up only 3.2% of all hits while doubles and homeruns make up 32%, they don't play a large role in the conclusion that slugging percentage is more of a reflection of a hitter's skill than batting average is.
Of course, I'm not arguing that there is no ability of some batters to hit grounders that get through the infield. Harder hit grounders will tend to get through more often. However, this is likely offset by the additional time infielders have to throw the runner out when they knock down a grounder and the ability of fast runners to beat out slowly or weakly hit ground balls.
The reasoning for less variability, and therefore more skill, being reflected in on base percentage than in batting average is straightforward. Since the publication of the Baseball Abstracts and The Sinister First Baseman in the early 1980s, baseball analysts have become increasingly aware that walks are as much a function of the hitter as the pitcher. This view has slowly made its way to the front offices of many major league teams, to the point that teams have now begun to consider strike zone judgment in their scouting scheme.
Historically this wasn't the case. Since the batter was the passive actor in the base on balls it was long assumed that walks were purely or at least mostly under the control of the pitcher. To a certain degree this attitude is still prevalent although in a modified form. For example, this quote was taken from an article by a Toronto beat writer in 2003:
"Clearly, the easiest positive statistic for mediocre hitters, one that requires keeping the bat glued to your shoulder, instead of the traditional hand-eye induced ball-whacking (which is far more exciting), is the ability to draw walks."
But throwbacks aside, it is now generally recognized that a batter's ability to control the strike zone is definitely a skill, and it makes sense that it would not be as subject to variability as batting average, a statistic heavily dependent on singles, which as we've shown are themselves heavily dependent on luck.
For more information on this question, check out Jim Albert's paper, "Batting Average: Does it Represent Ability or Luck?" Although Albert did not look at slugging percentage specifically, he did show that strikeout rate, walk rate, homerun rate, on base percentage, and in-play average (batting average on balls put in play) are all more strongly correlated from year to year than batting average is. In his analysis strikeout, walk, and homerun rates all had strong correlations, with coefficients of .7 or greater.