Sunday, July 18, 2004

Relativity and OPS

My contention in a previous post was that OPS (on base + slug) is a useful measure of offensive production because of its simplicity, comparative ability, and correlative value. However, when ranking the greatest single seasons in OPS only three players, Barry Bonds, Babe Ruth, and Ted Williams, made the top 10. Could this be the result of some bias in favor of these particular players? For example, one obvious thought is that since homeruns have been flying out of the park at an increased rate since 1993 (“chicks dig the long ball”), it is easier for a player like Barry Bonds, playing in an expanded run environment, to amass a large OPS by increasing his slugging percentage. The average number of plate appearances per homerun in the period 1960-1992 was about 47; from 1993-2003 it was 35. In other words, a player with 600 plate appearances would hit about 12 and a half homeruns in the 1960-1992 period and almost 17 in the 1993-2003 period. Another thought is that perhaps Ted Williams was inordinately helped by Fenway Park with its Green Monster and short right field line, and Babe Ruth by playing in a park that was, after all, the house that he built.

Correcting for League and Year
To see if there is some hidden bias here we can first correct for the context by calculating the league average OPS for the 10 seasons in question and then normalizing the individual’s OPS against the league average, a concept first introduced in The Hidden Game of Baseball. For example, in 2001 the National League OPS was 756, a very high number historically. By taking Barry Bonds’ OPS of 1379 and dividing it by the league average (1379/756) we can calculate a Normalized OPS (NOPS) of 1.82, or simply 182 for short. By performing the same calculation with Babe Ruth’s 1920 season (the same raw OPS of 1379 in a league where the average was 730) his NOPS comes out to 189, a little ahead of Bonds. Here are the before and after rankings.

Raw OPS
Year Lg Player Team OPS
2002 NL Barry Bonds SFN 1381
2001 NL Barry Bonds SFN 1379
1920 AL Babe Ruth NYA 1379
1921 AL Babe Ruth NYA 1359
1923 AL Babe Ruth NYA 1309
1941 AL Ted Williams BOS 1287
2003 NL Barry Bonds SFN 1278
1927 AL Babe Ruth NYA 1258
1957 AL Ted Williams BOS 1257
1926 AL Babe Ruth NYA 1253

Normalized OPS
Year Lg Player Team Raw LgOPS NOPS
1920 AL Babe Ruth NYA 1379 730 189
2002 NL Barry Bonds SFN 1381 741 186
2001 NL Barry Bonds SFN 1379 756 182
1921 AL Babe Ruth NYA 1359 761 179
1957 AL Ted Williams BOS 1257 707 178
1923 AL Babe Ruth NYA 1309 734 178
1941 AL Ted Williams BOS 1287 728 177
2003 NL Barry Bonds SFN 1278 749 171
1926 AL Babe Ruth NYA 1253 739 170
1927 AL Babe Ruth NYA 1258 747 168
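The normalization behind the table above can be sketched in a few lines of Python. This is only an illustration: the function and variable names are my own, the OPS values are written without the decimal point as in the tables (1379 means 1.379), and breaking ties on the unrounded ratio is an assumption, not something stated in the post.

```python
# Sketch of league normalization: NOPS = 100 * raw OPS / league OPS.
# OPS values are written without the decimal point (1379 means 1.379).

def nops(raw_ops, league_ops):
    """Return normalized OPS as a whole number, where 100 is league average."""
    return round(100 * raw_ops / league_ops)

# (year, league, player, raw OPS, league OPS), copied from the table above.
SEASONS = [
    (2002, "NL", "Barry Bonds", 1381, 741),
    (2001, "NL", "Barry Bonds", 1379, 756),
    (1920, "AL", "Babe Ruth", 1379, 730),
    (1921, "AL", "Babe Ruth", 1359, 761),
    (1923, "AL", "Babe Ruth", 1309, 734),
    (1941, "AL", "Ted Williams", 1287, 728),
    (2003, "NL", "Barry Bonds", 1278, 749),
    (1927, "AL", "Babe Ruth", 1258, 747),
    (1957, "AL", "Ted Williams", 1257, 707),
    (1926, "AL", "Babe Ruth", 1253, 739),
]

def rank_by_nops(seasons):
    """Attach NOPS to each season and sort from best to worst.

    Assumption: ties in the rounded NOPS are broken by the unrounded ratio.
    """
    rows = [(year, lg, player, raw, lg_ops, nops(raw, lg_ops))
            for year, lg, player, raw, lg_ops in seasons]
    return sorted(rows, key=lambda r: r[3] / r[4], reverse=True)

if __name__ == "__main__":
    for year, lg, player, raw, lg_ops, score in rank_by_nops(SEASONS):
        print(year, lg, player, raw, lg_ops, score)
```

Run against the ten raw OPS leaders, this puts Ruth's 1920 season first at 189 and Ruth's 1927 season last at 168.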

Since these ten seasons were first picked because of their raw OPS numbers it’s now appropriate to open up the field and recalculate the single season leaders taking into account their normalized OPS.

Normalized OPS Leaders
Year Lg Player Team Raw LgOPS NOPS
1920 AL Babe Ruth NYA 1379 730 189
2002 NL Barry Bonds SFN 1381 741 186
2001 NL Barry Bonds SFN 1379 756 182
1921 AL Babe Ruth NYA 1359 761 179
1923 AL Babe Ruth NYA 1309 734 178
1957 AL Ted Williams BOS 1257 707 178
1941 AL Ted Williams BOS 1287 728 177
2003 NL Barry Bonds SFN 1278 749 171
1926 AL Babe Ruth NYA 1253 739 170
1946 AL Ted Williams BOS 1164 690 169

As you can see, the same three hitters still dominate the list. However, the distribution has changed somewhat, with Ted Williams garnering another spot for his 1946 season in a league with a low OPS of 690 and Babe Ruth losing his 1927 season, when the league put up a fairly high OPS of 747. Williams’ 1957 season also looks better in this light, moving from 9th place into a tie for 5th. And most obviously Babe Ruth’s 1920 season now tops the list with an NOPS of 189. So in answer to part of our question we can fairly confidently say that these three hitters’ accomplishments were not inordinately helped by playing in leagues that were hitters’ paradises. In fact, the first player not of this ruling triumvirate to make the list is Mickey Mantle with his 1957 season (NOPS of 167). The only other contemporary player to make the top 20 is Mark McGwire, whose famous 1998 season and NOPS of 166 is tied for 15th.

However, hitters who played in extremely low scoring run environments should be greatly helped by normalizing OPS. For example, consider Willie McCovey’s 1969 season and Carl Yastrzemski’s 1967 season.

Year Lg Player Team Raw LgOPS NOPS
1969 NL Willie McCovey SFN 1108 686 162
1967 AL Carl Yastrzemski BOS 1040 651 160

Before normalization McCovey in 1969 ranked as tied for 70th all-time with a raw OPS of 1108. After correcting for a league in which pitchers dominated with an OPS of 686 he jumps to tied for 23rd with an NOPS of 162. Even more dramatically Yaz in 1967 moves from tied for 166th to tied for 27th place.

Correcting for Ballpark
But adjusting for the run environment of the league in which a player plays is only part of the context. The park in which the player plays his home games is another significant aspect. Intuitively this makes sense. It seems obvious that Larry Walker gets a boost from playing in Coors Field while Willie McCovey was hurt by playing in Candlestick Park. To take this into account sabermetricians have devoted themselves to calculating “park factors” or “park effects” for each of the major league parks. Historically this has been done by calculating a Batter Park Effect or BPF and a Pitcher Park Effect or PPF for each team. The calculation of these effects as documented in The Hidden Game of Baseball involves not only comparing the scoring in each park with the scoring at other parks but also taking into account that there is a "home cooking" bias where players naturally hit and pitch better at home (a fact well documented in Curve Ball). In addition, the calculation allows for the fact that a team's hitters do not have to face its own pitchers and vice versa. The BPF and PPF are expressed relative to the league average. In other words, a BPF of 1.06 (106 in the table below) would mean that the batter's home park gives him a 6% advantage over the league and a PPF of .95 (95 in the table) means that the park helps pitchers to the tune of 5%. Although factors can be and are calculated for different offensive events (homeruns, doubles, triples, etc.), the overall BPF is calculated based on runs scored. Here are the BPF and PPF as calculated for 2003, sorted by BPF.

Team Lg BPF PPF
MON NL 118 116
KCA AL 113 112
COL NL 112 111
ARI NL 111 109
TEX AL 110 109
TOR AL 105 104
BOS AL 105 104
HOU NL 104 103
MIL NL 102 102
MIN AL 102 102
TBA AL 100 100
CIN NL 100 100
NYN NL 99 99
PIT NL 99 99
SFN NL 99 100
CHN NL 99 99
CHA AL 99 99
ATL NL 97 97
SEA AL 97 98
NYA AL 96 97
SLN NL 96 97
PHI NL 95 96
BAL AL 95 96
DET AL 95 95
FLO NL 94 94
LAN NL 93 94
ANA AL 93 94
CLE AL 93 94
OAK AL 93 94
SDN NL 91 92

Fortunately, the BPF and PPF have been calculated and are present in the Lahman database, so in order to take into account the home park we simply need to divide the NOPS by the BPF expressed as a fraction (the BPF divided by 100). Here are the single season NOPS leaders shown previously, re-sorted with a new column for NOPS normalized for park effects (NOPS/PF).

Year Lg Player Team Raw NOPS NOPS/PF
2002 NL Barry Bonds SFN 1381 186 204
2001 NL Barry Bonds SFN 1379 182 200
1920 AL Babe Ruth NYA 1379 189 182
1921 AL Babe Ruth NYA 1359 179 175
1923 AL Babe Ruth NYA 1309 178 175
1941 AL Ted Williams BOS 1287 177 174
2003 NL Barry Bonds SFN 1278 171 173
1926 AL Babe Ruth NYA 1253 170 173
1957 AL Ted Williams BOS 1257 178 168
1946 AL Ted Williams BOS 1164 169 159

So Bonds, having played in a relatively poor park for hitters (BPFs of 91 in 2001 and 2002), gets helped, while Williams is hurt by the BPFs of Fenway Park, which are consistently over 100. And so it is once again appropriate to recreate the top 10 list with park effects.

Year Lg Player Team Raw NOPS NOPS/PF
2002 NL Barry Bonds SFN 1381 186 204
2001 NL Barry Bonds SFN 1379 182 200
1920 AL Babe Ruth NYA 1379 189 182
1923 AL Babe Ruth NYA 1309 178 175
1921 AL Babe Ruth NYA 1359 179 175
1941 AL Ted Williams BOS 1287 177 174
2003 NL Barry Bonds SFN 1278 171 173
1927 AL Babe Ruth NYA 1258 168 173
1926 AL Babe Ruth NYA 1253 170 173
1931 AL Babe Ruth NYA 1195 162 172
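As a rough check on the NOPS/PF column, the park adjustment can be sketched as follows. The function names are illustrative only, and the formula (the rounded NOPS divided by BPF/100) is inferred from the figures in this post rather than taken from the Lahman database documentation. The only inputs used are Bonds' NOPS values and the BPF of 91 quoted for 2001 and 2002.

```python
# Sketch of the park adjustment: NOPS/PF = NOPS / (BPF / 100).
# Assumption: the already-rounded NOPS is the input, and the result is rounded again.

def park_adjusted(nops_value, bpf):
    """Return NOPS corrected for the hitter's home park (BPF of 100 is neutral)."""
    return round(nops_value * 100 / bpf)

def park_delta(nops_value, bpf):
    """Return how many points the park correction adds (positive) or removes."""
    return park_adjusted(nops_value, bpf) - nops_value

if __name__ == "__main__":
    # Bonds' NOPS for 2002 and 2001 with the BPF of 91 quoted in the text.
    for year, score in ((2002, 186), (2001, 182)):
        print(year, park_adjusted(score, 91), park_delta(score, 91))
```

A BPF below 100 (a pitcher's park) raises the score and a BPF above 100 lowers it, which is why Bonds moves up while Williams moves down.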

Since Williams is hurt so much by Fenway Park he almost slips off the list entirely with only his 1941 season remaining. Ruth, however, now adds his 1927 and 1931 seasons when Yankee Stadium held a slight advantage for the pitcher.

Incidentally, Willie McCovey in 1969 moves up to tied for 11th with 165 when considering the tough hitting environment of Candlestick Park, while Yaz in 1967 moves down to tied for 102nd at 148. So who is hurt most by taking into account park effects? As you might have guessed, it is those who have played for the Colorado Rockies. In fact, Rockies take the top 35 spots when calculating the difference between NOPS and NOPS/PF, with Todd Helton’s 2000 season taking top honors: his NOPS was 150 and his NOPS/PF was 115. Conversely, Barry Bonds’ 2001 and 2002 seasons are helped most when park is taken into account, raising his score by 18 in each. For Cubs fans like me it’s interesting to note that Sammy Sosa’s 2000 season was tied for 2nd with a 15 point bump up to 149 once park effects were taken into account. Those who follow the Cubs know that weather patterns are the largest variable in whether Wrigley Field is a hitter’s delight or a pitcher’s best friend.

Those who aren’t Bonds fans might take issue with assuming that the Giants’ home park hurts Bonds, since it was built with a short right field porch with Bonds specifically in mind. Certainly Bonds, being a left-handed hitter, is hurt less by the park than are right-handers, and so I have a degree of sympathy for that argument. However, I don’t have any data that supports or contradicts the argument at this point. A similar argument could be made against Ruth.

So does any of this change our perceptions of who had the greatest single seasons in history? Not really. Bonds, Ruth, and Williams still dominate the top spots and by virtue of Ruth taking 6 of the 10 a strong argument can be made that he was indeed the greatest hitter of them all.

On the other end of the spectrum Neifi Perez has somehow managed to grab two of the worst nine seasons in history with NOPS/PFs of 64 in 2002 and 71 in 1999.

Final Thoughts
Three additional thoughts might come to your mind when considering whether these were the greatest seasons in baseball history.

* Where's the defense? This ranking does not include defense and so can only be used as a ranking of the greatest offensive seasons in history. Although sabermetricians have tried for many years to develop defensive measures that quantify how many runs an individual saves for his team, in the end most of these schemes have difficulty. This is primarily because defense is a much more complex concept in baseball (more akin to defensive backs in football) than offense and doesn’t lend itself to quantification very easily. As Branch Rickey once famously said “There is nothing on earth anybody can do with fielding.” That said, there are sabermetric measures such as Defensive Efficiency Rating (DER) and Zone Rating (ZR) that attempt to measure defense more accurately than the traditional counting stats of putouts, assists, and errors. Bill James, in his book Win Shares, also tries to assign value to defense through a more holistic approach that takes into consideration run prevention at the team level.

* What about opportunities? As mentioned in the previous post one of the strengths of OPS is its simplicity. One of the costs of that simplicity is that OPS has nothing to say about the opportunity a player had to garner his OPS. In other words, which player is more valuable, one with an OPS of 850 who had 600 plate appearances or one with the same OPS who had 200 plate appearances? Obviously the former, since an 850 OPS is pretty good and finding enough players who can post an 850 OPS over 200 plate appearances to cover the same playing time will likely be difficult. In the rankings presented in this post this problem is largely sidestepped by selecting only those players with 502 or more plate appearances in a season, in other words by selecting only those players who played every day. To address this problem Runs Created per Game (RC/G) takes opportunities into account by considering how many outs (the most valuable resource a team has) a player consumed while amassing his offensive numbers.

* What about Ty Cobb? Many readers will have noted that Ty Cobb is conspicuously absent from this list even though Cobb is often talked about in the same context as Ruth and Williams. In fact, Cobb first appears on the list tied for 33rd at 160 for his 1917 season, right behind Sosa’s 161 in 2001. There are two reasons why this is the case. First, some of Cobb’s perceived value was his foot speed and base stealing ability, neither of which is particularly visible in OPS. Second, OPS is largely a measure of extra-base hitting and Cobb hit as many as 12 homeruns only twice. In his 1917 season, however, he hit 44 doubles, 24 triples, and 6 homeruns. The fact that OPS is correlated so strongly with run scoring indicates that players like Cobb who focused on hitting for average at the expense of power (assuming they could do either, of course) did and continue to do a disservice to their teams by forsaking power. In short, if the often told story is true of Cobb hitting three homeruns in a game only to prove to writers that power hitting was not that difficult, then Cobb was mistaken in going back to his former style.

Finally, let's recalculate the 2003 leaders by applying both the correction for league and for park.

Player Team Raw LgOPS NOPS/PF
Barry Bonds SFN 1278 749 173
Albert Pujols SLN 1106 749 154
Gary Sheffield ATL 1023 749 141
Jim Edmonds SLN 1002 749 140
Jim Thome PHI 958 749 135
Todd Helton COL 1088 749 129
Jason Giambi NYA 939 761 128
Carlos Delgado TOR 1019 761 128
Chipper Jones ATL 920 749 127
Manny Ramirez BOS 1014 761 127
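Putting the two corrections together, the 2003 column above can be recomputed from the raw OPS, the league OPS, and the 2003 BPF table earlier in this post. The sketch below is mine, not the original query against the Lahman database. The names are illustrative, and the two-step rounding (round the NOPS, then divide by BPF/100 and round again) is an assumption chosen because it matches the published figures.

```python
# Sketch: apply the league correction and then the park correction to the 2003 leaders.

# 2003 Batter Park Factors for the teams involved, copied from the BPF table in this post.
BPF_2003 = {"SFN": 99, "SLN": 96, "ATL": 97, "PHI": 95,
            "COL": 112, "NYA": 96, "TOR": 105, "BOS": 105}

# (player, team, raw OPS, league OPS), copied from the 2003 leaders table.
LEADERS_2003 = [
    ("Barry Bonds", "SFN", 1278, 749),
    ("Albert Pujols", "SLN", 1106, 749),
    ("Gary Sheffield", "ATL", 1023, 749),
    ("Jim Edmonds", "SLN", 1002, 749),
    ("Jim Thome", "PHI", 958, 749),
    ("Todd Helton", "COL", 1088, 749),
    ("Jason Giambi", "NYA", 939, 761),
    ("Carlos Delgado", "TOR", 1019, 761),
    ("Chipper Jones", "ATL", 920, 749),
    ("Manny Ramirez", "BOS", 1014, 761),
]

def nops(raw_ops, league_ops):
    """League-normalized OPS, where 100 is league average."""
    return round(100 * raw_ops / league_ops)

def nops_pf(raw_ops, league_ops, bpf):
    """Park-adjusted NOPS. Assumption: NOPS is rounded before the park step."""
    return round(nops(raw_ops, league_ops) * 100 / bpf)

def adjusted_leaders(leaders, park_factors):
    """Return (player, team, NOPS/PF) rows sorted from best to worst."""
    rows = [(player, team, nops_pf(raw, lg_ops, park_factors[team]))
            for player, team, raw, lg_ops in leaders]
    # sorted() is stable, so tied players keep their input order.
    return sorted(rows, key=lambda r: r[2], reverse=True)

if __name__ == "__main__":
    for player, team, score in adjusted_leaders(LEADERS_2003, BPF_2003):
        print(f"{player:15s} {team} {score}")
```

Note how Helton's raw OPS of 1088 ranks second in this group, but Colorado's BPF of 112 drops him to sixth once the park is taken into account.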


It should be noted that the historical rankings consider seasons since 1900 for players with 502 or more plate appearances, and that HBP and SF were taken into account for the seasons in which they were recorded. In addition, others have calculated similar adjusted values, some much more complicated, including the Adjusted OPS or OPS+ on baseball-reference.com and PRO+ in Total Baseball.

2 comments:

Anonymous said...

You mention that Yankee Stadium in the Ruth era was a slight pitcher's park. This is a case where it seems possible that the same stadium was an extreme pitcher's park for right-handed batters due to the 490ft wall in left center, but at the same time a hitter's park for left-handed batters with the short 210ft right field porch. (It is no accident that the Yankees sought after left-handed pull hitters and that their opponents sought after left-handed pitchers.)

Does or should park factor be nuanced for a stadium like this?

Dan Agonistes said...

That is absolutely the case. Yankee Stadium is an extreme example but your point is correct. You really should have park effects for each outfield position.