Earlier in this series we looked at Runs Created, developed by Bill James in the late 1970s and early 1980s, and Batting Runs, a part of the Linear Weights system devised by Pete Palmer in the late 1970s and published in The Hidden Game of Baseball in 1984. Both of these formulas are what Albert and Bennett in Curve Ball call "intuitive" formulas because they attempt to estimate the number of runs created using a model of how baseball is played. However, the former is a non-linear formula since its underlying premise is that runs are the product of getting on base and advancing runners, while the latter is a linear formula since it assigns weights to the various offensive events. So now we’re ready to explore Paul Johnson’s Estimated Runs Produced, or ERP, in the third installment of the series.
History
The story of ERP starts with Bill James’s 1985 Baseball Abstract. In that book James published an essay by Paul Johnson on ERP after receiving a letter from Johnson explaining his work. Essentially, James published the essay because he found that ERP was simple and on average more accurate than his own Runs Created formula, and because he already knew that his formula overstated runs for teams with both high slugging percentages and high on base percentages (something he subsequently fixed, as discussed in my previous article).
In a quick study that James did to assess the accuracy of ERP he found that ERP had an average difference of 18.4 runs per team while Runs Created was at 19.3 for 100 teams from 1955 to 1975. On the strength of this James, in a move that should be applauded, felt compelled to provide Johnson with a forum to share his ideas with the baseball analyst community.
In introducing Johnson’s formula James also takes the opportunity to criticize Batting Runs, which he does more fully in the first version of his Historical Baseball Abstract, also published in 1985.
"Pete Palmer in The Hidden Game makes a similar claim [for accuracy] for the linear weights method, and Pete is a good friend and an outstanding analyst of the game, but in fact linear weights do not meet any acceptable standard of accuracy in assessing an offense."
So what is the formula? In that article Johnson gave as his complete version:
ERP = (2*(TB+BB+HB)+H+SB-(.605*(AB+CS+GIDP-H)))*.16
As you can see, the strength of ERP is its simplicity. Only addition, subtraction, and multiplication are required, and only eight counting statistics are needed. The formula essentially breaks into two sections, the left-hand side representing positive offensive accomplishment and the right-hand side representing negative (we’ll get to the bit about .16 in a minute).
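As a minimal sketch, the complete formula can be written as a small Python function. The function name, the argument names, and the stat line in the usage example are made up for illustration and do not come from Johnson's essay.

```python
def estimated_runs_produced(tb, bb, hb, h, sb, ab, cs, gidp):
    """Johnson's complete ERP formula, using the eight counting stats."""
    # Left-hand side: positive offensive accomplishment.
    positive = 2 * (tb + bb + hb) + h + sb
    # Right-hand side: outs charged to the batter.
    outs = ab + cs + gidp - h
    return (positive - 0.605 * outs) * 0.16


# Hypothetical season line, invented for illustration only.
season = dict(tb=300, bb=70, hb=5, h=180, sb=10, ab=600, cs=5, gidp=15)
erp = estimated_runs_produced(**season)
print(round(erp, 1))  # 107.8
```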
A second strength of ERP is that, like Batting Runs, it is a linear formula. In other words, when you sum the ERP for all players on a team you get the total ERP calculated for that team. That is not the case with non-linear run estimators like Runs Created and Base Runs, which we’ll look at in our next article in this series. And because ERP is a linear formula, Johnson spends the first part of his essay showing how ERP better estimates runs for teams with the combination of high slugging percentages and high on base percentages. He does this not only by looking at teams with the highest number of homeruns and top slugging percentage but also by aggregating high scoring World Series games and comparing them with individual players with the same basic profile. For example, he compares the aggregate statistics of 14 World Series games with Babe Ruth’s 1929 season and finds that in those games teams scored 124 runs. His ERP formula estimated 129 runs while Runs Created estimated 148. As mentioned previously, this was a weakness in the Runs Created formula that James explains in his afterword to the essay and has subsequently corrected.
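The linearity property can be checked with a short sketch: ERP computed on summed team totals should match the sum of the individual ERP values. The two stat lines below are invented for illustration, and the helper name erp is an assumption of this sketch.

```python
import math


def erp(tb, bb, hb, h, sb, ab, cs, gidp):
    """Johnson's complete ERP formula."""
    return (2 * (tb + bb + hb) + h + sb - 0.605 * (ab + cs + gidp - h)) * 0.16


# Two hypothetical players (stat lines invented for illustration).
players = [
    dict(tb=300, bb=70, hb=5, h=180, sb=10, ab=600, cs=5, gidp=15),
    dict(tb=150, bb=30, hb=2, h=120, sb=25, ab=500, cs=8, gidp=10),
]

# "Team" totals: add up each counting stat across the players.
team = {stat: sum(p[stat] for p in players) for stat in players[0]}

sum_of_players = sum(erp(**p) for p in players)
team_erp = erp(**team)

# For a linear formula these agree, up to floating-point rounding.
print(math.isclose(sum_of_players, team_erp))  # True
```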
So how did Johnson come up with ERP?
To quote Johnson the formula is “based on charts I made of the number of bases advanced by batters and baserunners on various offensive plays”. From that information Johnson realized that homeruns moved batters and baserunners three times as many bases as did the typical single and that walks advanced the batter and baserunners only two-thirds as many bases as did a single. These insights led to the design of the left-hand side of the formula since:
Home Run = 9 = 2*(4+0+0)+1+0
Single = 3 = 2*(1+0+0)+1+0
Walk = 2 = 2*(0+1+0)+0+0
Values for the other offensive events then follow:
Triple = 7 = 2*(3+0+0)+1+0
Double = 5 = 2*(2+0+0)+1+0
Stolen Base = 1 = 2*(0+0+0)+0+1
Hit by Pitch = 2 = 2*(0+0+1)+0+0
As you can see, this formula is indeed intuitive since it attempts to model how runs are scored by looking at the advancement value of each offensive event.
And so the relative weights assigned by Johnson to the events using singles as a baseline were:
Walk = .667
Hit by Pitch = .667
Double = 1.67
Triple = 2.33
Homerun = 3
Stolen Base = .333
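The point values and the relative weights above can be reproduced by feeding one event at a time into the left-hand side of the formula. This is only a sketch; the function name advancement_points and the way each event is encoded as counting stats are assumptions made for illustration.

```python
def advancement_points(tb=0, bb=0, hb=0, h=0, sb=0):
    """Left-hand side of ERP, before the .16 multiplier."""
    return 2 * (tb + bb + hb) + h + sb


# Each event expressed as the counting stats it adds.
events = {
    "Single":       dict(tb=1, h=1),
    "Double":       dict(tb=2, h=1),
    "Triple":       dict(tb=3, h=1),
    "Home Run":     dict(tb=4, h=1),
    "Walk":         dict(bb=1),
    "Hit by Pitch": dict(hb=1),
    "Stolen Base":  dict(sb=1),
}

points = {name: advancement_points(**stats) for name, stats in events.items()}
# Weight of each event relative to a single.
relative = {name: pts / points["Single"] for name, pts in points.items()}

for name in events:
    print(f"{name:13s} points={points[name]}  relative={relative[name]:.3f}")
```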
If this sounds suspiciously like Batting Runs then you’re on the right track. The weights used in the 1989 version of the formula from Total Baseball were:
Single = .47
Double = .78
Triple = 1.09
Homerun = 1.40
Stolen Base = .30
Walk = .33
Hit by Pitch = .33
Which calculate to weights relative to a single of:
Walk = .702
Hit by Pitch = .702
Double = 1.66
Triple = 2.32
Homerun = 2.98
Stolen Base = .638
As you compare the weights in the two lists you’ll notice that, other than the stolen base, the relative weights of the offensive events are nearly the same. What Johnson found out with his table was the same information that George Lindsey found from scoring games in the 1950s and that Pete Palmer found when running his simulations in the 1970s. Johnson’s innovation was in expressing these relative weights in an algebraically simpler formula. What Johnson sacrificed for this simplicity was a small amount of precision.
The difference in the relative weight of stolen bases, from .333 for Johnson to .638 for Palmer, is interesting. As discussed in my previous article, Palmer originally found that the weight for stolen bases actually ranged from .19 to .22 depending on era. He upped the value to .30 on the argument that by and large stolen bases come at strategically more important times and so should be weighted accordingly. While he was no doubt correct in the assessment of the strategic importance of stolen bases, it doesn’t make sense to add it to a formula whose goal is to average out the impacts of all sorts of situation-dependent variables. Anyway, eventually Palmer changed his mind and lowered the weight of the stolen base to .22 in the 2004 Baseball Encyclopedia. Using this value the relative weight of the stolen base for Batting Runs is .468, much more in line with what Johnson used.
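The comparison between the two sets of relative weights is simple division, sketched here using only the numbers quoted above. The dictionary and function names are illustrative.

```python
# Batting Runs weights as listed above (1989 Total Baseball version).
palmer = {
    "Single": 0.47, "Double": 0.78, "Triple": 1.09, "Homerun": 1.40,
    "Stolen Base": 0.30, "Walk": 0.33, "Hit by Pitch": 0.33,
}
# Johnson's point values from the left-hand side of ERP.
johnson = {
    "Single": 3, "Double": 5, "Triple": 7, "Homerun": 9,
    "Stolen Base": 1, "Walk": 2, "Hit by Pitch": 2,
}


def relative_to_single(weights):
    """Express every weight as a multiple of the single's weight."""
    return {event: w / weights["Single"] for event, w in weights.items()}


palmer_rel = relative_to_single(palmer)
johnson_rel = relative_to_single(johnson)

for event in palmer:
    print(f"{event:13s} Johnson={johnson_rel[event]:.3f}  Palmer={palmer_rel[event]:.3f}")

# Stolen base using Palmer's later .22 weight.
sb_2004_rel = 0.22 / palmer["Single"]
print(round(sb_2004_rel, 3))  # 0.468
```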
We now move to the right hand side of the equation.
This side of the formula calculates the negative effect of making outs and therefore represents the context in which the positive weighted events from the left-hand side of the equation occur. This part of the formula counts the number of outs the batter is responsible for by subtracting hits from at bats plus caught stealing and grounded into double plays. The sum of the outs is then multiplied by .605 before being subtracted from the weighted positive offensive events. As a result, the weight of an out relative to a single is -.20 (-.605/3). However, when you look at Batting Runs you notice that the weight of an out is -.25 and so the weight of an out relative to a single is much higher at -.53 (-.25/.47). Why the difference?
The difference here lies in what each formula is attempting to measure. In Batting Runs the end result is the marginal runs or the runs contributed by the batter above what a league average hitter would have supplied whereas ERP, like Runs Created, is attempting to measure the absolute or total number of runs contributed by a batter.
In order for Batting Runs to measure the runs contributed above an average hitter, the formula takes into consideration the value of all outs made and discovers that each out is worth -.25 runs. In the 4.3 runs per game context in which Batting Runs was formulated, that means that each out decreases the run potential by .16 runs in terms of shrinking the opportunity for scoring in each inning (4.3 divided by 27 is .16). However, Batting Runs is also taking into consideration the negative value outs have in terms of failing to move runners along during the inning, and this value is then the difference between -.25 and -.16, or -.09. In other words, the value of outs can be split into two components: the -.16 that represents the effect an out has on moving closer to the end of an inning, and the -.09 that represents the lack of runner advancement. So the weight of outs relative to singles with respect to advancing runners is -.19 (-.09/.47). This turns out to be essentially the same relative weight Johnson used (-.20). If Johnson had used a weight of about 1.5 instead of .605 for his outs (1.5 multiplied by .16 is .24, close to Palmer’s .25) he would have gotten nearly the same results as Batting Runs and measured the marginal runs instead.
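The arithmetic behind the two components of an out can be laid out step by step. This sketch uses only the figures quoted in the text; the variable names are illustrative, and the rounding to two places mirrors the rounding used in the paragraph above.

```python
RUNS_PER_GAME = 4.3      # run environment cited above
OUTS_PER_GAME = 27
PALMER_OUT = -0.25       # Batting Runs weight of an out
PALMER_SINGLE = 0.47     # Batting Runs weight of a single

# Cost of an out from simply using up one of the 27 outs.
inning_cost = -round(RUNS_PER_GAME / OUTS_PER_GAME, 2)
# What remains is the cost of failing to advance runners.
advancement_cost = PALMER_OUT - inning_cost

# Express the advancement cost relative to a single, for both systems.
palmer_relative = advancement_cost / PALMER_SINGLE
johnson_relative = -0.605 / 3

print(inning_cost, round(advancement_cost, 2))                 # -0.16 -0.09
print(round(palmer_relative, 2), round(johnson_relative, 2))   # -0.19 -0.2
```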
So why does using the smaller relative weight equate to absolute runs? By removing the decreased run potential automatically assigned for each out (-.16) you in essence remove the background noise and judge the hitter or team purely on the basis of the interaction of offensive events and that portion of the outs they make that suppress baserunner advancement. In other words, there is nothing a team can do about those 27 outs they’re going to make each game and so removing their non-discretionary cost results in a measure of the total number of runs scored.
I don’t think Johnson used this kind of logic to come up with his value of .605 and instead simply played around with his formula until he found something that worked. In fact, he says in his essay that,
“The numbers exist only to put proper emphasis on the various events. They are essential to making the equation work, but there’s no need for me to go into how they came to be what they are. I’ll just tell you that it took a hell of a lot of experimenting to settle on the darned things.”
In the final step Johnson takes the right side of his equation and subtracts it from the left and then multiplies the whole thing by .16. Again, why the .16?
Once you realize that ERP is a simplified version of Batting Runs you can see that the weights assigned by Johnson to offensive events in the left-hand side of his equation multiplied by .16 approximate the weights found by Palmer.
Single = 3*.16 = .48
Double = 5*.16 = .80
Triple = 7*.16 = 1.12
Homerun = 9*.16 = 1.44
Stolen Base = 1*.16 = .16
Walk = 2*.16 = .32
Hit by Pitch = 2*.16 = .32
And as you might have guessed taking the value of outs as -.605 and multiplying it by .16 yields -.097, which not coincidentally is the weight of an out with respect to advancing baserunners in Batting Runs.
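The effect of the .16 multiplier can be tabulated against Palmer's weights in a few lines. This is a sketch using the values quoted above, with illustrative names.

```python
# Johnson's point values from the left-hand side of ERP.
johnson_points = {
    "Single": 3, "Double": 5, "Triple": 7, "Homerun": 9,
    "Stolen Base": 1, "Walk": 2, "Hit by Pitch": 2,
}
# Palmer's Batting Runs weights (1989 Total Baseball version).
palmer = {
    "Single": 0.47, "Double": 0.78, "Triple": 1.09, "Homerun": 1.40,
    "Stolen Base": 0.30, "Walk": 0.33, "Hit by Pitch": 0.33,
}

# Apply Johnson's final multiplier to each point value.
scaled = {event: pts * 0.16 for event, pts in johnson_points.items()}

for event in johnson_points:
    diff = scaled[event] - palmer[event]
    print(f"{event:13s} ERP={scaled[event]:.2f}  Palmer={palmer[event]:.2f}  diff={diff:+.2f}")

# The out weight after the same scaling.
out_weight = -0.605 * 0.16
print(round(out_weight, 3))  # -0.097
```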
However, by using this smaller value for the weight of outs ERP runs into a conceptual problem that Batting Runs does not. It is possible for hitters to accumulate negative ERP values. This doesn’t make sense in a formula that tries to estimate the absolute number of runs contributed by a player. The lower bounds should logically be zero. In fact, the zero-level, the level at which a player has a 0 ERP, is an OPS of between .320 and .330 (depending on the frequency of walks and total bases). What this means is that in practice ERP does not “work” for very restricted run environments. After all, common sense says that a hitter or team with an OPS as low as .320 will still create occasional runs through homeruns and stringing together a few hits. However, the offensive environment this represents is right around a run per game or slightly less. And since a team that scores less than a run per game does not in fact produce any positive offense, you can reasonably assume that a player that contributes at that level would not either.
Johnson went on to give two additional versions of the formula. The first is a simplified version to use when caught stealing, hit batsmen, and double plays grounded into are not available.
ERP2 = (2*(TB+BB)+H+SB-(.615*(AB-H)))*.16
You can see that he simply increased the weight of the out to compensate. The second version Johnson says works better for players with high stolen base totals. I assume he means when caught stealing is not available.
ERP3 = (2*(TB+BB)+H+SB-(.610*(AB+(SB/4)-H)))*.16
This version simply estimates the number of caught stealing by dividing the stolen bases by 4 and adds that estimate to the number of at bats, therefore making the outs component larger.
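Both reduced versions can be sketched side by side. The function names erp2 and erp3 and the stat line are assumptions made for illustration; the coefficients are the ones given in the formulas above.

```python
def erp2(tb, bb, h, sb, ab):
    """Simplified ERP for when CS, HB, and GIDP are unavailable."""
    return (2 * (tb + bb) + h + sb - 0.615 * (ab - h)) * 0.16


def erp3(tb, bb, h, sb, ab):
    """Variant that estimates caught stealing as SB/4."""
    estimated_outs = ab + sb / 4 - h
    return (2 * (tb + bb) + h + sb - 0.610 * estimated_outs) * 0.16


# Hypothetical base stealer, invented for illustration only.
line = dict(tb=300, bb=70, h=180, sb=40, ab=600)
print(round(erp2(**line), 1))  # 112.3
print(round(erp3(**line), 1))  # 111.6
```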
In the end James apparently did not realize that Palmer’s formula he so roundly criticized in The Historical Baseball Abstract was in fact the same formula as ERP in an admittedly simpler guise and with the twist of using a reduced weight for outs. In an ironic comment James says:
“I was originally suspicious of the system when I saw the ‘.16’ at the end of it. Wouldn’t it seem more likely that the most accurate possible system would require multiplication by .15974 or something? My assumption, as I said, was that if better methods were to be developed, they would have to be more complex, more difficult to figure, and that they would grow out of the existing methods.”
In fact, ERP did grow out of an existing method, it’s just that neither Johnson himself nor James realized it at the time.
Because ERP is equivalent to Batting Runs as we’ve shown here most sabermetricians don’t use it and instead rely on the more precise weightings of Batting Runs or Extrapolated Runs (XR) discussed below.
Derivatives
In his original essay Johnson then goes on to offer two extensions to ERP used to calculate the number of runs produced per 162 games. The first is:
ERP/162 = ERP3/(AB+(SB/4)-H)*458
And the second is:
ERP/162 = ERP/(AB+CS+GIDP-H)*474
Apparently, these formulas are an attempt to pro-rate ERP over 162 games and can be used for comparison purposes. These formulas assume a basis of 458 or 474 outs and simply multiply that by the ERP per out. However, I’m not certain where the 458 and 474 came from and Johnson does not say in his essay.
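Read literally, both rate formulas divide ERP by the outs component and multiply by a fixed out total. The sketch below implements them exactly as written; the function names are illustrative, the input values are invented, and no claim is made about where 458 and 474 come from.

```python
def erp_per_162(erp, ab, h, cs, gidp):
    """ERP/162 from the complete formula: ERP per out times 474."""
    outs = ab + cs + gidp - h
    return erp / outs * 474


def erp3_per_162(erp3, ab, h, sb):
    """ERP/162 from ERP3: ERP3 per estimated out times 458."""
    estimated_outs = ab + sb / 4 - h
    return erp3 / estimated_outs * 458


# Hypothetical inputs, invented for illustration only.
full_rate = erp_per_162(erp=107.808, ab=600, h=180, cs=5, gidp=15)
reduced_rate = erp3_per_162(erp3=111.632, ab=600, h=180, sb=40)
print(round(full_rate, 1), round(reduced_rate, 1))
```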
Johnson went on to refine his formula in the STATS 1991 Baseball Scoreboard and christen it “New Estimated Runs Produced” or NERP. The formula presented was:
NERP=(TB/3.15) + ((BB-IBB+HBP-CS-GIDP)/3) + (H/4) + (SB/5) - (AB/11.75)
Or if you prefer:
NERP=TB*.318 + ((BB-IBB+HBP-CS-GIDP)*.333) + (H*.25) + (SB*.2) - (AB*.085)
Once again, this formula is a linear one that can be broken down into left-hand and right-hand sides. NERP weights homeruns at 1.52, triples at 1.2, doubles at .89, and singles at .57. It also takes intentional walks out of the equation and weights other single bases gained at .33. Note that stolen bases are now weighted at .2, very similar to Batting Runs. However, the most interesting part is simply subtracting at bats multiplied by .085 from the left-hand side of the equation. This seems at first glance to be an arbitrary attempt at estimating the typical number of outs a player makes and at accounting for the slightly higher weights in the formula. A value of around .065 would be more in line if the weights were lower, as in the Batting Runs formula. However, I don’t want to speculate too much without the original essay in which it was explained.
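The hit weights quoted above fall out of the decimal form of NERP when one event at a time is fed in. The sketch below also shows what each hit is worth once the at bat it consumed is charged as well, which is simply a reading of the formula as written and not something taken from Johnson's NERP essay. The function name nerp and the keyword arguments are illustrative.

```python
def nerp(tb=0, bb=0, ibb=0, hbp=0, cs=0, gidp=0, h=0, sb=0, ab=0):
    """NERP in its decimal form, as given above."""
    return (tb * 0.318
            + (bb - ibb + hbp - cs - gidp) * 0.333
            + h * 0.25
            + sb * 0.2
            - ab * 0.085)


hits = {"Single": 1, "Double": 2, "Triple": 3, "Homerun": 4}

# Gross weight of each hit type (TB and H terms only).
gross = {name: nerp(tb=bases, h=1) for name, bases in hits.items()}
# Net weight once the at bat the hit used is also charged.
net = {name: nerp(tb=bases, h=1, ab=1) for name, bases in hits.items()}

for name in hits:
    print(f"{name:8s} gross={gross[name]:.2f}  net={net[name]:.2f}")

# An at bat that ends in an out is charged only the AB term.
print(round(nerp(ab=1), 3))  # -0.085
```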
A few years later Jim Furtado entered the picture. Jim studied ERP and Runs Created and came to the same conclusions about the relationship of ERP and Batting Runs that I’ve talked about here. He then went the next step and tried to develop a more accurate linear formula using a combination of regression analysis, comparison to other methods, peer review, and empirical analysis. His result was the Extrapolated Runs (XR) formulas published in the 1999 Big Bad Baseball Annual. He developed three versions, as shown here.
XR = (.50 * 1B) + (.72 * 2B) + (1.04 * 3B) + (1.44 * HR) + (.34 * (HP+TBB-IBB)) +(.25 * IBB)+ (.18 * SB) + (-.32 * CS) + (-.090 * (AB - H - K)) + (-.098 * K)+ (-.37 * GIDP) + (.37 * SF) + (.04 * SH)
XRR - Extrapolated Runs Reduced = (.50 * 1B) + (.72 * 2B) + (1.04 * 3B) + (1.44 * HR) + (.33 * (HP+TBB)) + (.18 * SB) + (-.32 * CS) + (-.098 * (AB - H))
XRB - Extrapolated Runs Basic = (.50 * 1B) + (.72 * 2B) + (1.04 * 3B) + (1.44 * HR) + (.34 * (TBB)) + (.18 * SB) + (-.32 * CS) + (-.096 * (AB - H))
As you can see each of these formulas takes the same form as the Batting Runs formula with very similar weights. The difference is that strikeouts are weighted slightly more heavily (-.098) than other outs (-.09) while GIDP and caught stealing are weighted even more heavily. Weighting strikeouts in this way makes logical sense since strikeouts have no opportunity to advance runners.
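The full XR formula translates directly into code. This is a sketch with the coefficients as listed above; the function name, the keyword arguments with zero defaults, and the stat line are assumptions made for illustration.

```python
def extrapolated_runs(singles=0, doubles=0, triples=0, hr=0, hp=0, tbb=0,
                      ibb=0, sb=0, cs=0, ab=0, k=0, gidp=0, sf=0, sh=0):
    """Full XR formula with the coefficients listed above."""
    hits = singles + doubles + triples + hr
    return (0.50 * singles + 0.72 * doubles + 1.04 * triples + 1.44 * hr
            + 0.34 * (hp + tbb - ibb) + 0.25 * ibb
            + 0.18 * sb - 0.32 * cs
            - 0.090 * (ab - hits - k)   # outs on balls in play
            - 0.098 * k                 # strikeouts
            - 0.37 * gidp + 0.37 * sf + 0.04 * sh)


# Hypothetical season line, invented for illustration only.
season = dict(singles=120, doubles=35, triples=5, hr=20, hp=5, tbb=70,
              ibb=5, sb=10, cs=5, ab=600, k=100, gidp=15, sf=5, sh=2)
print(round(extrapolated_runs(**season), 2))

# Single-event checks: a strikeout costs slightly more than another out.
print(round(extrapolated_runs(ab=1, k=1), 3))  # -0.098
print(round(extrapolated_runs(ab=1), 3))       # -0.09
```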
The outs value here corresponds with the smaller -.09 value discussed previously. It is also interesting that sacrifice flies (SF) and sacrifice bunts (SH) are both included and given positive values. Albert and Bennett in Curve Ball added sacrifice flies to their least squares regression model (p187) and found that in isolation it correlated strongly with run scoring, but its weight was inordinately high and so they did not use it in their model. My assumption has always been that sacrifice flies are primarily situation dependent, much like RBIs themselves, and so generally should not be included in run estimation formulas. Sacrifice bunts as well are typically seen as a net drain on offensive production, so it is surprising to see them included with even a very small positive coefficient.
Wednesday, October 27, 2004
A Brief History of Run Estimation: Estimated Runs Produced
Posted by Dan Agonistes at 3:29 PM
1 comment:
I have always felt that most of these formulas do not take into account the actual runs and RBIs produced by each player. I have found a linear formula that comes within 3-5% of the actual runs a team scores.
RC = (TB x .2) + (RP x .2) + (SB x .2) + (BB + .1)
RP = RBI + Runs - HR.
It is easy to compute and it comes within 3-5% of the actual runs scored by any team. Of course, it should, since runs is an actual component of the formula.