
Saturday, July 17, 2004

What the heck is OPS?

Baseball Tonight's Harold Reynolds has been known to ask this question in a derisive tone, complete with hand waving, when discussing statistics. You may have noticed that I've periodically used OPS when discussing the relative merits of players or comparing them. For the sabermetrically challenged, OPS is defined as:
 
OPS = On Base Average + Slugging Percentage
 
It couldn't be simpler. Since OPS is just the sum of these two, the long formula is:
 
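OPS = ((H+BB+HBP)/(AB+BB+HBP+SF))+(TB/AB)

where H is hits, BB is walks, HBP is times hit by a pitch, AB is at bats, SF is sacrifice flies, and TB is total bases.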
 
Of course, since the two values being added have different denominators, the sum is not itself a true average, so it's not really mathematically correct to show OPS as .892. In many places you'll see it simply listed as 892.
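As a minimal sketch of the long formula, here is how the calculation might look in Python. The function names and the zero-denominator handling are my own choices for illustration, and the stat line in the usage note is hypothetical rather than any real player's.

```python
def on_base_average(h, bb, hbp, ab, sf):
    """On base average: (H + BB + HBP) / (AB + BB + HBP + SF)."""
    denominator = ab + bb + hbp + sf
    # Assumption: a batter with no qualifying plate appearances gets 0.0.
    return (h + bb + hbp) / denominator if denominator else 0.0


def slugging_percentage(tb, ab):
    """Slugging percentage: TB / AB."""
    return tb / ab if ab else 0.0


def ops(h, bb, hbp, ab, sf, tb):
    """OPS is simply the sum of the two values above."""
    return on_base_average(h, bb, hbp, ab, sf) + slugging_percentage(tb, ab)


def format_ops(value):
    """Render OPS in the whole-number style, e.g. 0.892 becomes '892'."""
    return str(round(value * 1000))
```

For a hypothetical line of 150 hits, 60 walks, 5 hit by pitch, 500 at bats, 5 sacrifice flies, and 250 total bases, `format_ops(ops(150, 60, 5, 500, 5, 250))` works out to an on base average of 215/570 plus a slugging percentage of .500, which is displayed as 877.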
 
So why use OPS when slugging percentage and on base percentage are so readily available? I'll give three reasons:
 

  • Simplicity. People like things simple and, especially where quantitative analysis is concerned, feel the urge to reduce the essence of the thing being analyzed to a single number. In my own field of software development, this reductionism leads people to fix precise dollar amounts on projects when an estimate given as a range of amounts is really as precise as you can be, given the nature of software development. That said, I'm not arguing that OPS by itself is somehow better than the values from which it is derived, or even that it conveys more information than the two taken together. For example, which of the following carries more information: an 800 OPS or a .325 OBA/.475 SLUG? Obviously the latter, since separating OPS into its component parts tells you something the aggregate number does not, namely that the player has pretty good power but does not get on base especially well. I am arguing only that in shorthand venues it's easier to give a single number than to list two or three.


  • Comparative Ability. The strength of reductionism is that the single number can then be used to make comparisons. Given OPS, we can rank players (as all baseball statistics junkies will do) to see who was tops in the majors. Using the Lahman database and SQL Server, I calculated the following leader board for 2003:

    Bonds SFN      1278
    Pujols SLN     1106
    Helton COL     1088
    Sheffield ATL  1023
    Delgado TOR    1019
    Ramirez BOS    1014
    Edmonds SLN    1002
    Rodriguez TEX   995
    Nixon BOS       975
    Ortiz BOS       961

    And the single season all-time leaders since 1900:

    2002   NL   Barry Bonds    SFN    1381
    2001   NL   Barry Bonds    SFN    1379
    1920   AL   Babe Ruth      NYA    1379
    1921   AL   Babe Ruth      NYA    1359
    1923   AL   Babe Ruth      NYA    1309
    1941   AL   Ted Williams   BOS    1287
    2003   NL   Barry Bonds    SFN    1278
    1927   AL   Babe Ruth      NYA    1258
    1957   AL   Ted Williams   BOS    1257
    1926   AL   Babe Ruth      NYA    1253

    The list is dominated by just three players. But could it be that the conditions under which these three played helped them dominate? To find out, in a future post I'll talk about how we can "relativise" OPS to take into account the context in which these hitters performed.

     
  • Correlative Value. Finally, I like OPS because, while it is simple to calculate, it correlates well with run scoring. This means that, unlike other single measures (batting average being the classic example, but also slugging percentage and on base percentage taken by themselves), it is a measure of how valuable a player is in creating runs and therefore wins for his team. So while having both slugging percentage and on base percentage at hand conveys more information about the player's attributes, combining them gives a fairly accurate measure of his value to the team. In Curve Ball, Albert and Bennett note that using OPS "the number of runs scored by a team per game can be predicted within about .15 Runs per Game for two-thirds of the teams." To get a better correlation you have to resort to more complicated formulas such as Runs Created Per Game (RC/G) developed by Bill James, Linear Weights (LWTS) developed by Pete Palmer, Batter's Run Average (BRA, defined as the OBA multiplied by the SLUG) developed by Richard Cramer and Pete Palmer, or Total Average (TA) developed by Thomas Boswell. And the differences between these other techniques and OPS are not nearly as great as the difference between OPS and AVG, SLUG, or OBA. In short, OPS is simple enough to calculate on the fly and yet is a useful indicator of performance.
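For anyone who wants to build leader boards like the ones under Comparative Ability without SQL Server, here is a rough Python sketch. It assumes each row is a dictionary using the Lahman Batting table's column names (playerID, yearID, teamID, AB, H, 2B, 3B, HR, BB, HBP, SF) with missing values already converted to 0. The `min_pa` cutoff is my own assumption, not the qualification rule behind the lists above, and the sketch does not combine a player's stints with different teams.

```python
def total_bases(row):
    """Total bases: one per hit, plus extra bases for doubles, triples, homers."""
    return row["H"] + row["2B"] + 2 * row["3B"] + 3 * row["HR"]


def plate_appearances(row):
    """The denominator of on base average: AB + BB + HBP + SF."""
    return row["AB"] + row["BB"] + row["HBP"] + row["SF"]


def ops_of(row):
    """OPS for one batting row, as on base average plus slugging percentage."""
    if row["AB"] == 0 or plate_appearances(row) == 0:
        return 0.0
    oba = (row["H"] + row["BB"] + row["HBP"]) / plate_appearances(row)
    slug = total_bases(row) / row["AB"]
    return oba + slug


def leader_board(rows, year, min_pa=400, top=10):
    """Return (playerID, teamID, OPS x 1000) tuples, best OPS first.

    min_pa is a hypothetical cutoff that keeps small samples off the list.
    """
    qualified = [
        row for row in rows
        if row["yearID"] == year and plate_appearances(row) >= min_pa
    ]
    ranked = sorted(qualified, key=ops_of, reverse=True)
    return [
        (row["playerID"], row["teamID"], round(ops_of(row) * 1000))
        for row in ranked[:top]
    ]
```

Calling `leader_board(rows, 2003)` on rows loaded from the Batting table would give a 2003 list in the same whole-number format used here, though the exact membership depends on the cutoff chosen.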

For these reasons you'll see me continue to use OPS. To provide a baseline, here are the average OPS numbers for 2003 (for non-pitchers):

    AL 762
    NL 771

Against that baseline, an OPS below 700 is bad, above 775 is pretty good, and above 875 is very good. These numbers have been pretty consistent since 1993.
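Those rules of thumb are easy to turn into a quick lookup. This is only a sketch: the function name is hypothetical, and the "about average" label for the 700 to 775 band is my own filler, since the cutoffs above leave that range unnamed.

```python
def rate_ops(ops_points):
    """Label an OPS given in whole-number form (e.g. 892) using the cutoffs above."""
    if ops_points > 875:
        return "very good"
    if ops_points > 775:
        return "pretty good"
    if ops_points < 700:
        return "bad"
    # Assumption: the unnamed 700-775 band, which contains both 2003
    # league averages (762 and 771), is treated as roughly average.
    return "about average"
```

By this scale every player on the 2003 leader board rates as "very good", while a hitter right at the AL average of 762 lands in the middle band.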



