Thursday, August 30, 2007

Leveling Off

Today on BP my column is a follow-up on The Myth of the Golden Age published back in January. Essentially, I use pitcher hitting relative to position player hitting to create a "Level Index" relative to the 2006 American League and then adjust player performances for single seasons, careers, and career rate. I purposefully did not use the same technique used by Clay Davenport and David Gassko since in part I wanted to see how well this independent measure would agree.

To give you a little preview, here are the top 250 career adjusted leaders in WX1, a generalized version of Keith Woolner's Win Expectancy (WX) framework that I wrote about last year. The table also includes the original WX1, the difference between it and the version adjusted for year and league difficulty, and WX1 per 600 plate appearances in both cases.


Name Start End PA AdjWX1 WX1 Diff WX1Rate AdjWX1Rate
Barry Bonds 1986 2006 12026 104.6 108.6 -4.0 5.4 5.2
Babe Ruth 1914 1935 10573 80.4 116.2 -35.7 6.6 4.6
Hank Aaron 1954 1976 13908 78.9 89.3 -10.3 3.9 3.4
Ted Williams 1939 1960 9752 73.7 97.3 -23.6 6.0 4.5
Willie Mays 1951 1973 12449 73.4 83.8 -10.3 4.0 3.5
Stan Musial 1941 1963 12659 69.3 86.0 -16.8 4.1 3.3
Ty Cobb 1905 1928 12978 68.4 102.1 -33.7 4.7 3.2
Mickey Mantle 1951 1968 9896 66.2 81.8 -15.6 5.0 4.0
Frank Robinson 1956 1976 11545 59.2 67.1 -7.8 3.5 3.1
Lou Gehrig 1923 1939 9615 55.9 78.1 -22.2 4.9 3.5
Rogers Hornsby 1915 1937 9427 53.6 76.8 -23.2 4.9 3.4
Rickey Henderson 1979 2003 13248 52.1 55.7 -3.6 2.5 2.4
Mel Ott 1926 1947 11273 51.9 70.2 -18.3 3.7 2.8
Frank Thomas 1990 2006 9084 51.8 53.7 -1.9 3.5 3.4
Jeff Bagwell 1991 2005 9303 50.8 52.9 -2.1 3.4 3.3
Tris Speaker 1907 1928 11885 49.4 73.9 -24.5 3.7 2.5
Joe Morgan 1963 1984 11289 47.8 54.1 -6.3 2.9 2.5
Gary Sheffield 1988 2006 9441 47.1 48.9 -1.8 3.1 3.0
Mike Schmidt 1972 1989 9983 45.0 50.4 -5.4 3.0 2.7
Mark McGwire 1986 2001 7585 44.7 46.3 -1.6 3.7 3.5
Eddie Mathews 1952 1968 10075 44.2 51.8 -7.5 3.1 2.6
Willie McCovey 1959 1980 9617 42.9 48.0 -5.0 3.0 2.7
Eddie Collins 1906 1930 11960 42.0 62.4 -20.4 3.1 2.1
Honus Wagner 1897 1917 11614 41.4 61.3 -19.9 3.2 2.1
Jimmie Foxx 1925 1945 9657 41.3 57.2 -15.8 3.6 2.6
Dick Allen 1963 1977 7298 40.5 45.1 -4.5 3.7 3.3
Manny Ramirez 1993 2006 7705 39.0 39.8 -0.9 3.1 3.0
Chipper Jones 1993 2006 7528 38.9 39.9 -1.0 3.2 3.1
Harmon Killebrew 1954 1975 9783 38.6 45.4 -6.8 2.8 2.4
Mike Piazza 1992 2006 7386 38.1 39.6 -1.5 3.2 3.1
Tony Gwynn 1982 2001 10208 38.0 40.0 -2.0 2.4 2.2
Willie Stargell 1962 1982 8948 38.0 42.6 -4.6 2.9 2.5
Edgar Martinez 1987 2004 8583 37.6 38.8 -1.3 2.7 2.6
Al Kaline 1953 1974 11542 37.3 44.9 -7.7 2.3 1.9
Carl Yastrzemski 1961 1983 13951 37.2 42.8 -5.6 1.8 1.6
Johnny Mize 1936 1953 7319 35.4 46.5 -11.0 3.8 2.9
Tim Raines 1980 2002 10317 35.0 37.6 -2.5 2.2 2.0
Jim Thome 1991 2006 7835 34.8 35.9 -1.0 2.7 2.7
Ken Griffey 1989 2006 9468 34.4 35.6 -1.2 2.3 2.2
Joe DiMaggio 1936 1951 7625 34.1 45.4 -11.3 3.6 2.7
Reggie Jackson 1967 1987 11320 34.1 40.1 -6.0 2.1 1.8
Fred McGriff 1986 2004 10135 33.6 35.0 -1.4 2.1 2.0
Albert Pujols 2001 2006 4014 32.9 33.7 -0.8 5.0 4.9
Alex Rodriguez 1994 2006 7668 32.6 33.3 -0.7 2.6 2.6
Eddie Murray 1977 1997 12799 32.4 35.9 -3.4 1.7 1.5
Billy Williams 1959 1976 10476 32.3 35.8 -3.5 2.1 1.8
Dave Winfield 1973 1995 12333 32.2 35.8 -3.6 1.7 1.6
Jason Giambi 1995 2006 6781 31.6 32.1 -0.5 2.8 2.8
George Brett 1973 1993 11591 31.5 35.5 -4.0 1.8 1.6
Brian Giles 1995 2006 6333 31.0 31.8 -0.8 3.0 2.9
Rafael Palmeiro 1986 2005 11959 30.8 31.8 -1.1 1.6 1.5
Duke Snider 1947 1964 8216 30.4 37.1 -6.7 2.7 2.2
Jack Clark 1975 1992 8201 30.3 32.9 -2.7 2.4 2.2
Paul Waner 1926 1945 10724 30.2 41.6 -11.4 2.3 1.7
Bobby Abreu 1996 2006 6342 30.1 30.8 -0.8 2.9 2.8
Harry Heilmann 1914 1932 8920 30.0 44.3 -14.4 3.0 2.0
Will Clark 1986 2000 8224 29.7 30.9 -1.3 2.3 2.2
Roberto Clemente 1955 1972 10177 29.6 33.8 -4.2 2.0 1.7
Ralph Kiner 1946 1955 6232 29.4 36.2 -6.8 3.5 2.8
Wade Boggs 1982 1999 10717 29.0 30.9 -1.9 1.7 1.6
Larry Walker 1989 2005 7892 28.5 29.6 -1.2 2.3 2.2
Frank Howard 1958 1973 7320 28.4 32.0 -3.6 2.6 2.3
Rod Carew 1967 1985 10525 28.0 32.5 -4.4 1.9 1.6
Reggie Smith 1966 1982 8017 27.9 32.1 -4.2 2.4 2.1
Norm Cash 1958 1974 7820 27.3 31.7 -4.4 2.4 2.1
Sammy Sosa 1989 2005 9386 27.2 28.4 -1.2 1.8 1.7
Joe Jackson 1908 1920 5631 26.6 40.0 -13.4 4.3 2.8
Hank Greenberg 1930 1947 6080 26.5 35.2 -8.7 3.5 2.6
Pete Rose 1963 1986 15754 26.5 31.2 -4.8 1.2 1.0
Vladimir Guerrero 1996 2006 5503 25.7 26.4 -0.6 2.9 2.8
Orlando Cepeda 1958 1974 8593 25.6 28.8 -3.1 2.0 1.8
Sam Crawford 1899 1917 10571 25.5 41.1 -15.5 2.3 1.5
Keith Hernandez 1974 1990 8521 25.2 28.2 -3.0 2.0 1.8
Jim Edmonds 1993 2006 6850 25.2 26.0 -0.9 2.3 2.2
Paul Molitor 1978 1998 12113 25.0 27.0 -1.9 1.3 1.2
Moises Alou 1990 2006 7455 25.0 25.8 -0.8 2.1 2.0
Arky Vaughan 1932 1948 7675 24.5 32.7 -8.2 2.6 1.9
Lance Berkman 1999 2006 4414 24.4 25.0 -0.6 3.4 3.3
John Olerud 1989 2005 8975 24.3 25.2 -0.8 1.7 1.6
Jeff Kent 1992 2006 8388 24.1 24.8 -0.6 1.8 1.7
Todd Helton 1997 2006 6027 24.1 24.7 -0.6 2.5 2.4
Nap Lajoie 1896 1916 10326 24.0 42.2 -18.2 2.5 1.4
Albert Belle 1989 2000 6618 24.0 25.0 -1.1 2.3 2.2
Jimmy Wynn 1963 1977 7983 23.8 27.2 -3.4 2.0 1.8
Carlos Delgado 1993 2006 7103 23.7 24.2 -0.5 2.0 2.0
Darryl Strawberry 1983 1999 6288 23.5 25.0 -1.5 2.4 2.2
Pedro Guerrero 1978 1992 6083 23.1 25.2 -2.1 2.5 2.3
Boog Powell 1961 1977 7781 23.1 26.8 -3.7 2.1 1.8
Rusty Staub 1963 1985 11150 23.1 26.5 -3.4 1.4 1.2
Ken Singleton 1970 1984 8541 23.0 26.9 -3.9 1.9 1.6
Bobby Bonds 1968 1981 8037 23.0 26.3 -3.4 2.0 1.7
Sherry Magee 1904 1919 8437 22.9 33.1 -10.1 2.4 1.6
Joe Torre 1960 1977 8716 22.9 25.3 -2.5 1.7 1.6
Bob Johnson 1933 1945 8023 22.7 30.1 -7.4 2.2 1.7
Ron Santo 1960 1974 9358 22.6 25.6 -3.0 1.6 1.5
Rocky Colavito 1955 1968 7530 22.3 27.0 -4.8 2.2 1.8
Ryan Klesko 1992 2006 6077 21.9 22.7 -0.7 2.2 2.2
Bill Terry 1923 1936 7102 21.5 29.3 -7.8 2.5 1.8
Johnny Bench 1967 1983 8650 21.4 24.5 -3.0 1.7 1.5
Dwight Evans 1972 1991 10516 21.3 23.3 -2.0 1.3 1.2
Barry Larkin 1986 2004 9002 21.3 22.6 -1.3 1.5 1.4
Cesar Cedeno 1970 1986 8077 21.3 24.8 -3.5 1.8 1.6
Babe Herman 1926 1945 6215 20.9 28.6 -7.7 2.8 2.0
Jackie Robinson 1947 1956 5730 20.6 25.1 -4.6 2.6 2.2
Tony Perez 1964 1986 10818 20.5 23.5 -3.1 1.3 1.1
Charlie Keller 1939 1952 4594 20.4 26.9 -6.5 3.5 2.7
Joe Medwick 1932 1948 8116 20.3 27.8 -7.6 2.1 1.5
Mark Grace 1988 2003 9256 20.0 21.0 -1.0 1.4 1.3
Zack Wheat 1909 1927 9919 19.9 29.1 -9.2 1.8 1.2
Jose Canseco 1985 2001 8045 19.9 20.7 -0.8 1.5 1.5
Eric Davis 1984 2001 6114 19.8 20.9 -1.0 2.0 1.9
Enos Slaughter 1938 1959 9047 19.7 25.5 -5.8 1.7 1.3
Scott Rolen 1996 2006 5849 19.6 20.3 -0.6 2.1 2.0
Jose Cruz 1970 1988 8924 19.3 22.1 -2.8 1.5 1.3
Hack Wilson 1923 1934 5536 19.1 26.8 -7.7 2.9 2.1
Darrell Evans 1969 1989 10702 19.0 21.7 -2.7 1.2 1.1
Luis Gonzalez 1990 2006 9511 18.8 19.5 -0.6 1.2 1.2
Bob Elliott 1939 1953 8174 18.7 24.0 -5.3 1.8 1.4
Kevin Mitchell 1984 1998 4669 18.7 19.7 -1.1 2.5 2.4
Chuck Klein 1928 1944 7156 18.6 27.7 -9.0 2.3 1.6
Ernie Banks 1953 1971 10325 18.4 22.6 -4.2 1.3 1.1
Bernie Williams 1991 2006 9014 18.4 19.0 -0.6 1.3 1.2
Greg Luzinski 1970 1984 7430 18.4 21.1 -2.7 1.7 1.5
Craig Biggio 1988 2006 11666 18.4 19.5 -1.2 1.0 0.9
Andre Dawson 1976 1996 10658 18.4 20.7 -2.3 1.2 1.0
Bobby Bonilla 1986 2001 8227 18.3 19.5 -1.2 1.4 1.3
Roberto Alomar 1988 2004 10350 18.1 18.8 -0.7 1.1 1.1
Larry Doby 1947 1959 6264 18.0 25.0 -7.0 2.4 1.7
Rico Carty 1963 1979 6305 17.8 20.4 -2.6 1.9 1.7
George Foster 1969 1986 7760 17.6 20.9 -3.3 1.6 1.4
Bob Watson 1966 1984 6914 17.4 20.2 -2.8 1.8 1.5
Dolph Camilli 1933 1945 6324 17.3 24.1 -6.8 2.3 1.6
Elmer Flick 1898 1910 6315 17.3 27.8 -10.5 2.6 1.6
Tony Oliva 1962 1976 6820 17.3 20.1 -2.9 1.8 1.5
Stan Hack 1932 1947 8485 17.2 23.1 -5.9 1.6 1.2
Jim Rice 1974 1989 8994 17.1 19.5 -2.4 1.3 1.1
Juan Gonzalez 1989 2005 7093 17.1 17.8 -0.7 1.5 1.4
John Kruk 1986 1995 4601 17.1 17.9 -0.8 2.3 2.2
Yogi Berra 1946 1965 8312 17.1 22.8 -5.7 1.6 1.2
Bobby Murcer 1965 1983 7691 17.1 20.0 -2.9 1.6 1.3
Wally Berger 1930 1940 5625 17.0 23.0 -6.0 2.4 1.8
David Justice 1989 2002 6583 16.9 17.7 -0.8 1.6 1.5
Dale Murphy 1976 1993 9012 16.9 19.3 -2.3 1.3 1.1
Joe Adcock 1950 1966 7287 16.8 20.3 -3.5 1.7 1.4
Sid Gordon 1941 1955 5789 16.8 21.1 -4.3 2.2 1.7
Fred Lynn 1974 1990 7893 16.7 19.3 -2.6 1.5 1.3
Minnie Minoso 1949 1980 7518 16.5 22.5 -6.0 1.8 1.3
Goose Goslin 1921 1938 9767 16.5 24.8 -8.2 1.5 1.0
Ellis Burks 1987 2004 8116 16.4 17.1 -0.7 1.3 1.2
Ray Lankford 1990 2004 6638 16.3 17.2 -0.8 1.6 1.5
Jack Fournier 1912 1927 5944 16.3 24.0 -7.7 2.4 1.6
Don Mattingly 1982 1995 7700 16.3 17.5 -1.2 1.4 1.3
Gene Tenace 1969 1983 5434 16.2 18.9 -2.7 2.1 1.8
Gavvy Cravath 1908 1920 4617 16.2 23.2 -7.0 3.0 2.1
Bill Nicholson 1936 1953 6366 16.2 21.6 -5.4 2.0 1.5
Derek Jeter 1995 2006 7596 16.1 16.4 -0.3 1.3 1.3
Gary Matthews 1972 1987 8168 16.1 18.3 -2.2 1.3 1.2
Jeff Heath 1936 1949 5550 16.1 22.3 -6.2 2.4 1.7
Kiki Cuyler 1921 1938 8013 16.1 23.7 -7.6 1.8 1.2
Ron Cey 1971 1987 8282 16.0 18.3 -2.3 1.3 1.2
Shawn Green 1993 2006 7397 15.8 16.4 -0.5 1.3 1.3
Frank Baker 1908 1922 6610 15.8 24.1 -8.3 2.2 1.4
Al Oliver 1968 1985 9696 15.7 18.3 -2.6 1.1 1.0
Danny Tartabull 1984 1997 5825 15.7 16.4 -0.7 1.7 1.6
Dave Parker 1973 1991 10128 15.5 18.0 -2.5 1.1 0.9
Andy Van Slyke 1983 1995 6451 15.5 16.5 -1.1 1.5 1.4
Gil Hodges 1943 1963 8079 15.4 20.1 -4.6 1.5 1.1
Tim Salmon 1992 2006 6972 15.4 16.1 -0.7 1.4 1.3
Ryne Sandberg 1981 1997 9248 15.4 16.9 -1.5 1.1 1.0
Al Simmons 1924 1944 9485 15.2 25.8 -10.6 1.6 1.0
J.D. Drew 1998 2006 3733 15.2 15.6 -0.4 2.5 2.4
Bill Madlock 1973 1987 7304 15.1 17.6 -2.4 1.4 1.2
Kent Hrbek 1981 1994 7111 14.9 15.9 -1.0 1.3 1.3
Bob Allison 1958 1970 5887 14.9 17.4 -2.6 1.8 1.5
Chili Davis 1981 1999 9981 14.8 16.0 -1.1 1.0 0.9
Roger Maris 1957 1968 5808 14.7 17.6 -2.9 1.8 1.5
Rick Monday 1966 1984 7138 14.5 16.9 -2.4 1.4 1.2
Ken Griffey 1973 1991 8034 14.5 16.7 -2.2 1.2 1.1
Richie Ashburn 1948 1962 9693 14.5 18.6 -4.1 1.1 0.9
Roy White 1965 1979 7706 14.5 17.4 -2.9 1.4 1.1
Charlie Gehringer 1924 1942 10187 14.5 22.4 -7.9 1.3 0.9
Nomar Garciaparra 1996 2006 5242 14.4 14.7 -0.3 1.7 1.7
Harold Baines 1980 2001 11078 14.4 15.6 -1.2 0.8 0.8
Augie Galan 1934 1949 6978 14.3 20.3 -5.9 1.7 1.2
Lenny Dykstra 1985 1996 5251 14.3 15.1 -0.9 1.7 1.6
Paul O'Neill 1985 2001 8307 14.2 14.9 -0.7 1.1 1.0
Larry Doyle 1907 1920 7329 14.0 20.5 -6.5 1.7 1.1
Brian Downing 1973 1992 9180 14.0 15.5 -1.5 1.0 0.9
Lou Brock 1961 1979 11186 13.9 16.6 -2.8 0.9 0.7
Bobby Grich 1970 1986 8134 13.8 16.3 -2.4 1.2 1.0
Gabby Hartnett 1922 1941 7262 13.5 19.0 -5.5 1.6 1.1
Tommy Henrich 1937 1950 5375 13.5 18.0 -4.6 2.0 1.5
Derrek Lee 1997 2006 4847 13.4 13.8 -0.4 1.7 1.7
Roy Sievers 1949 1965 7298 13.4 18.9 -5.6 1.6 1.1
Dixie Walker 1931 1949 7650 13.3 18.4 -5.2 1.4 1.0
Kirk Gibson 1979 1995 6595 13.2 14.3 -1.1 1.3 1.2
George Sisler 1915 1930 8965 13.2 24.7 -11.5 1.7 0.9
Earl Torgeson 1947 1961 6029 13.2 16.7 -3.5 1.7 1.3
Mo Vaughn 1991 2003 6302 13.2 13.8 -0.6 1.3 1.3
Steve Garvey 1969 1987 9437 13.1 15.6 -2.5 1.0 0.8
Edd Roush 1913 1931 7305 13.1 21.2 -8.1 1.7 1.1
Frank Chance 1898 1913 4962 13.1 20.2 -7.1 2.4 1.6
Roy Cullenbine 1938 1947 4776 13.0 18.3 -5.2 2.3 1.6
Al Rosen 1947 1956 4347 13.0 18.3 -5.2 2.5 1.8
Ted Kluszewski 1947 1961 6447 13.0 17.0 -4.0 1.6 1.2
Andruw Jones 1996 2006 6542 12.9 13.3 -0.3 1.2 1.2
Bill Dickey 1928 1946 7029 12.7 17.9 -5.1 1.5 1.1
Gene Woodling 1943 1962 6585 12.7 17.5 -4.8 1.6 1.2
David Ortiz 1997 2006 4251 12.7 12.9 -0.2 1.8 1.8
Earl Averill 1929 1941 7182 12.6 18.3 -5.6 1.5 1.1
Bobby Veach 1912 1925 7498 12.6 18.9 -6.3 1.5 1.0
Lefty O'Doul 1919 1934 3636 12.5 17.5 -5.0 2.9 2.1
Lou Whitaker 1977 1995 9947 12.5 13.7 -1.2 0.8 0.8
Ken Williams 1915 1929 5588 12.5 19.4 -6.9 2.1 1.3
Cliff Floyd 1993 2006 5359 12.4 12.9 -0.5 1.4 1.4
Brett Butler 1981 1997 9507 12.4 13.5 -1.1 0.9 0.8
Del Ennis 1946 1959 7909 12.4 16.3 -3.9 1.2 0.9
Dusty Baker 1968 1986 7991 12.3 14.7 -2.4 1.1 0.9
Richie Zisk 1971 1983 5725 12.3 14.3 -2.0 1.5 1.3
Riggs Stephenson 1921 1934 5093 12.3 17.5 -5.2 2.1 1.4
Ernie Lombardi 1931 1947 6303 12.1 16.7 -4.6 1.6 1.2
Fred Clarke 1894 1915 9666 12.1 23.1 -11.0 1.4 0.8
Ted Simmons 1968 1988 9646 12.1 15.2 -3.1 0.9 0.8
Davey Lopes 1972 1987 7309 12.0 13.9 -1.9 1.1 1.0
Jim Bottomley 1922 1937 8312 12.0 19.8 -7.8 1.4 0.9
Chick Hafey 1924 1937 5080 11.9 17.6 -5.7 2.1 1.4
Mickey Cochrane 1925 1937 6177 11.9 16.6 -4.7 1.6 1.2
Kal Daniels 1986 1992 2725 11.8 12.4 -0.6 2.7 2.6
Ron Fairly 1958 1978 8397 11.8 14.2 -2.4 1.0 0.8
Ken Boyer 1955 1969 8248 11.7 14.2 -2.5 1.0 0.8
Billy Hamilton 1890 1901 6741 11.6 19.7 -8.1 1.8 1.0
Mike Donlin 1899 1914 4262 11.6 17.3 -5.7 2.4 1.6
Earle Combs 1924 1935 6490 11.6 16.9 -5.3 1.6 1.1
Miguel Cabrera 2003 2006 2372 11.6 11.8 -0.2 3.0 2.9
Hank Sauer 1941 1959 5380 11.4 14.7 -3.2 1.6 1.3
Reggie Sanders 1991 2006 6888 11.4 12.1 -0.6 1.1 1.0
Kirby Puckett 1984 1995 7775 11.4 12.3 -0.9 0.9 0.9
Lonnie Smith 1978 1994 5860 11.3 12.6 -1.2 1.3 1.2
Howard Johnson 1982 1995 5698 11.3 12.1 -0.8 1.3 1.2
Gary Carter 1974 1992 8951 11.2 13.5 -2.3 0.9 0.7
Ross Youngs 1917 1926 5296 11.2 17.3 -6.1 2.0 1.3
Travis Hafner 2002 2006 2065 11.2 11.3 -0.1 3.3 3.2
Jim Ray Hart 1963 1974 4208 11.1 12.2 -1.0 1.7 1.6
Jason Thompson 1976 1986 5677 11.1 12.9 -1.8 1.4 1.2
Bob Horner 1978 1988 4197 11.0 12.3 -1.2 1.8 1.6
Vic Wertz 1947 1963 6994 11.0 16.1 -5.1 1.4 0.9
Jackie Jensen 1950 1961 6056 10.9 14.9 -3.9 1.5 1.1
Sal Bando 1966 1981 8213 10.8 13.8 -2.9 1.0 0.8
Ferris Fain 1947 1955 4886 10.8 14.8 -4.0 1.8 1.3
Don Mincher 1960 1972 4698 10.8 12.5 -1.7 1.6 1.4
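
As a rough check on how the rate columns relate to the totals, here is a minimal sketch. It assumes WX1Rate and AdjWX1Rate are simply the career total divided by plate appearances and scaled to 600 PA; the formula is not stated in this post, but it reproduces the first few rows. The function name `per_600_pa` is made up for illustration.

```python
def per_600_pa(total: float, pa: int) -> float:
    """Scale a career total to a rate per 600 plate appearances (assumed formula)."""
    return total / pa * 600


# Rows copied from the table above: (name, PA, AdjWX1, WX1)
rows = [
    ("Barry Bonds", 12026, 104.6, 108.6),
    ("Babe Ruth", 10573, 80.4, 116.2),
    ("Hank Aaron", 13908, 78.9, 89.3),
]

for name, pa, adj_wx1, wx1 in rows:
    wx1_rate = round(per_600_pa(wx1, pa), 1)
    adj_rate = round(per_600_pa(adj_wx1, pa), 1)
    # Diff recomputed from the rounded totals; it can differ from the
    # published Diff column by 0.1 because that column was presumably
    # computed before rounding.
    diff = round(adj_wx1 - wx1, 1)
    print(name, wx1_rate, adj_rate, diff)
```

Recomputing Diff from the rounded totals will occasionally be off by a tenth from the table (Ruth, for example), which is most likely just rounding.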

Wednesday, August 29, 2007

Wednesday Links

Here are a couple of things in case you missed them:

  • The 2007 Scouting Report by the Fans for the Fans - Tango is once again collecting data on fan perception of fielding prowess. It's always great to harness the wisdom of crowds.


  • Billy Wagner Breakdown - Ok, so he didn't really break down but Will Carroll's recent comments on his "dead arm" inspired me to sift through some PITCHf/x data for Wagner on BP Unfiltered.

    Saturday, August 25, 2007

    Jimenez Looking Good

    Tonight Rockies rookie Ubaldo Jimenez turned in a good start for the third consecutive outing, beating the Nationals here at Coors Field. I chronicled his arsenal over on the Rocky Mountain SABR site using PITCHf/x data.

    Update: Mike Fast and Sky Kalkman point out that the data used to plot the fastball was incorrect. I inadvertently used a positive rather than a negative vertical acceleration, which caused the pitch to appear to level out. I've since corrected the graphs in the article at RMSABR. My apologies.

    Thursday, August 23, 2007

    Visualization

    My column today on Baseball Prospectus deals with using PITCHf/x data to visualize the trajectory of pitches in much the same way as the actual Gameday application shows each pitch during the game. After discussing how this can be done and plotting a few individual pitches, I then aggregate pitch types for a few individuals, including Rich Hill, Barry Zito, Roy Halladay, and Derek Lowe, to form a "visual pitch profile" that can be used for comparison. Finally, I look at the complete repertoire of Daisuke Matsuzaka.
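
For readers who want to try this at home, here is a minimal sketch of the idea, not the code behind the column. It assumes the usual constant-acceleration model, in which each pitch is described by an initial position, an initial velocity, and an acceleration in each of three dimensions. The numbers are hypothetical values in feet and seconds chosen for illustration, not data for any real pitch.

```python
def pitch_position(t, p0, v0, a):
    """Return (x, y, z) at time t, assuming constant acceleration on each axis."""
    return tuple(p + v * t + 0.5 * acc * t * t for p, v, acc in zip(p0, v0, a))


def sample_trajectory(p0, v0, a, t_end, steps=20):
    """Sample evenly spaced points along the flight for plotting."""
    return [pitch_position(t_end * i / steps, p0, v0, a) for i in range(steps + 1)]


# Hypothetical pitch: position in ft, velocity in ft/s, acceleration in ft/s^2.
p0 = (-2.0, 50.0, 6.0)
v0 = (5.0, -130.0, -5.0)
a = (-10.0, 25.0, -15.0)  # vertical acceleration is negative (downward)

for x, y, z in sample_trajectory(p0, v0, a, t_end=0.4, steps=4):
    print(round(x, 2), round(y, 2), round(z, 2))

# Flipping the sign of the vertical acceleration makes the pitch finish
# much higher, which is why a sign error makes a fastball look like it
# levels out.
flipped = (a[0], a[1], -a[2])
print(round(pitch_position(0.4, p0, v0, a)[2], 2))
print(round(pitch_position(0.4, p0, v0, flipped)[2], 2))
```

The sampled points can be handed to any plotting tool; aggregating a pitcher's pitch types would then amount to averaging the parameters within each type before sampling, which is one reasonable approach among several.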

    After the article was submitted for publication I learned that SABR member Mat Kovach has also been doing this kind of thing.

    Update: Just saw that Joe P. Sheehan had done something very similar last week at Baseball Analysts. You know what they say about great minds... :)

    Wednesday, August 22, 2007

    Record Breaker!

    I'm sure someone has beaten me to the punch, but in light of the record-breaking performance of the Texas Rangers, who scored 30 runs in the first game of a doubleheader against the Orioles tonight, here are the next 20 highest-scoring games for one team.


    6/8/1950 Thu AL SLA (4) at BOS (29)
    4/23/1955 Sat AL CHA (29) at KC1 (6)
    7/6/1929 Sat NL SLN (28) at PHI (6)
    7/7/1923 Sat AL BOS (3) at CLE (27)
    6/11/1985 Tue NL NYN (7) at PHI (26)
    8/18/1995 Fri NL CHN (26) at COL (7)
    4/19/1996 Fri AL BAL (7) at TEX (26)
    9/9/2004 Thu AL KCA (26) at DET (5)
    6/4/1911 Sun NL BSN (3) at CIN (26)
    4/30/1944 Sun NL BRO (8) at NY1 (26)
    8/12/1948 Thu AL CLE (26) at SLA (3)
    8/25/1922 Fri NL PHI (23) at CHN (26)
    6/27/2003 Fri AL FLO (8) at BOS (25)
    6/9/1901 Sun NL NY1 (25) at CIN (13)
    9/23/1901 Mon NL BRO (25) at CIN (6)
    5/24/1936 Sun AL NYA (25) at PHA (2)
    5/11/1930 Sun AL PHA (7) at CLE (25)
    5/18/1912 Sat AL DET (2) at PHA (24)
    6/26/1978 Mon AL BAL (10) at TOR (24)
    8/25/1979 Sat AL CAL (24) at TOR (2)


    Two other notes about the game: the Rangers scored those 30 runs in just four separate innings, and they have 9 runs in the second game of the doubleheader, which breaks that record as well.

    Tuesday, August 21, 2007

    Picking on Pierre

    I just couldn't let this article on the Dodgers' Juan Pierre pass by. It starts off like so:

    What is it about the on-base percentage that a player like Juan Pierre -- who leads the Dodgers in at-bats, runs scored, hits, stolen bases, triples and games played -- gets knocked for not having his higher than .350?

    Pierre has been one of the most consistent players in the Dodgers lineup this season. He plays every day (395 consecutive games, which is the longest active streak in the Majors), makes diving catches in center field on a regular basis and steals second just about every time he gets on base, yet his OBP evidently isn't cutting it.


    Essentially, the author is arguing that acquiring playing time, and thus the opportunity to rack up those counting stats, automatically means you're a good hitter. Omar Moreno, playing in all 162 games for the Pirates in 1980, also led his team in all those categories, including walks. In the end, though, he was 9 runs below average offensively because, in addition to accumulating 87 runs scored, 13 triples, and 96 stolen bases, he made 551 outs. Yes, 551. Last year for the Cubs, Pierre was also 9 runs below average while playing in all 162 games, and he made 526 outs. This season he's 8 runs below average and has made almost 400 outs.

    While there is certainly a strong link between playing time and offensive performance and being able to stay on the field is in itself valuable, in Pierre's case the perception of performance is apparently what counts.

    This season, Pierre leads the Dodgers with 147 hits. He is fifth in the NL with 45 multi-hit games, he leads the Majors with 14 sacrifice bunts and he's second in the Majors only to Jose Reyes with 50 stolen bases, and yet his OBP supposedly isn't cutting it.

    Well. Multi-hit games are heavily dependent on playing time, sacrifice bunts are nothing to brag about, and while his 50 stolen bases against only 9 caught stealing is very good, historically he's a break-even base stealer at best.

    In 2003 and 2004 with the Marlins Pierre was an above average offensive player to the tune of 10 and 14 runs respectively. In those seasons his OBP was a healthy .361 and .374 (he also had a .378 OBP for the Rockies in 2001 but was still 3 runs below average in the pre-humidor era). The reason of course is that as Pierre himself explains:

    When I'm hitting good, my on-base percentage is high and that's just the way it is. The Dodgers knew that before I came here. It is what it is. I just go out there and play the game, and I don't get caught up in all of this.

    Indeed, in those three seasons his batting averages were .305, .326, and .327. The problem is that what the Dodgers should have paid attention to is that Pierre hadn't cracked .300 since 2004, and going into his age 29 season it wasn't exactly likely he would revert to his form as a 23 through 26 year-old.

    In order to justify his low OBP the author makes much of his ability to disrupt the pitcher and comes up with this quote from Grady Little.

    He's a disruptive force when he's on base. The other team has to be concerned with him regularly and it disrupts the pitcher.

    Unfortunately for the Dodgers, there is little evidence for this, and in fact there is some evidence to the contrary, as documented in The Book, that "disruptive" baserunners tend to disrupt the batter more than the defense.

    Where the author should have focused, perhaps, was on Pierre's other contributions on the bases. Since 2000, in my four baserunning metrics, he's a positive 27.9 runs, making his biggest contribution in advancing on hits to the tune of 18.6 runs. When you add those 27.9 runs to his total runs above average, he comes out 1 run to the good. In other words, offensively over the past almost eight seasons he's been average. Unfortunately, his ledger was heavily stacked in 2003 and 2004, and so in the other six seasons he's been below average.

    On the other side of the coin, he's also been a below average defender since 2004 and his lack of arm strength is well known. Contrasted with Omar Moreno, who had a monster year with the glove to the tune of saving 17 runs above average in 1980 and who was an above average defender until his latter days with the Yankees, Pierre doesn't stack up very well.

    Don't get me wrong. When Pierre was with the Cubs I enjoyed watching him play and was a little sad to see him go (but not enough to wish the Cubs had signed him at that price tag of course).

    Finally, the author sums up his point by saying...

    Whether his OBP is at .324 or .350, Pierre will continue to do the small things for the Dodgers. He bunts, he steals bases, he legs out triples and robs balls in the outfield, yet he'll constantly be scrutinized because he doesn't get on base enough -- that's just the way it's going to be.

    And that's just the problem. The things he can do are indeed small things, and when he doesn't get on base those small things simply aren't enough to compensate for the big things, like power, which he does not possess.

    He's an exciting player to watch no doubt about it. Just don't pretend that he's a plus offensively when at this point in his career he's clearly not.

    Stealing Pays?

    Carl Bialik at the Wall Street Journal, who authors a column titled "The Numbers Guy", ran an interesting piece on his blog today related to the trend in stolen base success rates.

    Until he called my attention to it, I hadn't realized that the success rate in 2007 was over 74%, the highest it has ever been. In looking into the historical trend I found that three of the four previous highs occurred in 2004-2006, and I produced the graph he shows in the post. Because of its steady increase since the Second World War, I then ventured that it reflected a systemic change in the game, perhaps related to increasing athleticism by baserunners coupled to a smaller degree with refinements in technique and strategy.

    For those familiar with the strategic analysis of baserunning, you'll notice that the current rate is dangerously close to the oft-repeated claim that 75% is the break-even success rate for stolen bases in the big picture (it varies by base, out, and score of course). As a result, using the tenets of game theory, we would expect that as the actual success rate catches up with the break-even rate, defenses would be a little more vigilant in protecting against the running game in order to keep the rate close to the break-even. Whether they'll be able to do so, however, is dependent to a large degree on the constraints that are inherent in the game. For example, pitching out more often may indeed keep the rate down, but at the cost of increasing the productivity of the current batter, perhaps making it a wash. Balk rules also constrain attempts by the pitcher to keep the runner close. As a result, if the increase is primarily a result of increasing athleticism on the part of runners outstripping the combined throwing velocity of pitchers and catchers (and their technique), then we might indeed see the rate continue to climb.
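
To make the break-even idea concrete, here is a minimal sketch. It assumes a single average run value gained per stolen base and a single average run cost per caught stealing; the values 0.15 and 0.45 are hypothetical, picked only because their 1-to-3 ratio yields the 75% figure cited above. Real values vary by base, out, and score.

```python
def break_even_rate(gain_sb: float, cost_cs: float) -> float:
    """Success rate p at which p * gain_sb equals (1 - p) * cost_cs."""
    return cost_cs / (gain_sb + cost_cs)


def success_rate(sb: int, cs: int) -> float:
    """Fraction of steal attempts that succeeded."""
    return sb / (sb + cs)


def net_runs(sb: int, cs: int, gain_sb: float, cost_cs: float) -> float:
    """Runs gained from steals minus runs lost to caught stealing."""
    return sb * gain_sb - cs * cost_cs


# Hypothetical run values (assumptions, not measured figures).
GAIN_SB = 0.15
COST_CS = 0.45

print(round(break_even_rate(GAIN_SB, COST_CS), 3))
# Juan Pierre's 2007 line quoted in the post above: 50 steals, 9 caught.
print(round(success_rate(50, 9), 3))
print(round(net_runs(50, 9, GAIN_SB, COST_CS), 2))
```

Under these assumed values, a runner below the break-even rate costs his team runs no matter how many bases he steals, which is why the league-wide rate creeping toward 75% is the interesting part.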

    Sunday, August 19, 2007

    Einstein: His Life and Universe

    "I think physicists are the Peter Pans of the human race. They never grow up and they keep their curiosity." - Nobel Prize winner Isidor Isaac Rabi

    Throughout my entire elementary, middle school, high school, and college education I never had occasion to read a real biography of, write a paper on, let alone actually study the technical work of Albert Einstein. Thinking about it now it sounds hard to believe, but my only exposure to the famous equation E=mc² came from a book on science we had in our home (related to our antiquated encyclopedia set I think, but I can't recall anymore) that I picked up around the time I was twelve years old or so. I do recall thinking how simple and elegant that formulation is and, once I learned how fast the speed of light really was, trying to wrap my mind around the idea that such a small amount of matter could contain such a large amount of energy (a kilogram of mass if converted completely into energy would yield 25 billion kilowatt hours of electricity).
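
The 25 billion figure is easy to verify. The sketch below uses the defined speed of light and the standard conversion of 3.6 million joules per kilowatt hour, and assumes complete conversion of mass to energy.

```python
C = 299_792_458.0        # speed of light in meters per second
JOULES_PER_KWH = 3.6e6   # 1 kWh = 1000 W * 3600 s


def rest_energy_joules(mass_kg: float) -> float:
    """E = m * c^2, in joules."""
    return mass_kg * C ** 2


def joules_to_kwh(joules: float) -> float:
    """Convert joules to kilowatt hours."""
    return joules / JOULES_PER_KWH


energy = rest_energy_joules(1.0)
print(f"{energy:.3e} J")
print(f"{joules_to_kwh(energy) / 1e9:.1f} billion kWh")
```

One kilogram works out to roughly 9 x 10^16 joules, or just under 25 billion kilowatt hours.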


    For Einstein the formula that equated mass and energy was merely a coda to his "miracle year" of 1905; he published it in September in a three-page paper titled "Does the Inertia of a Body Depend on Its Energy Content?" That was the last of his papers from that year, which also saw him devise the quantum theory of light, help prove the existence of atoms, explain Brownian motion, and of course produce the special theory of relativity in addition to what proved to be the most famous equation in science. As with much of Einstein's work, Walter Isaacson does a masterful job of taking the general reader through not only the chronology but the big ideas behind these discoveries and their historical context in his new biography Einstein: His Life and Universe. I had the pleasure of making Einstein and his world a more or less constant companion over the last two months and can say I thoroughly enjoyed the company.

    The book is the first of what will likely be several new biographies of Einstein that draw on a set of letters newly published in 2006 by the Einstein Papers Project at Caltech. Most of these are personal letters that provide additional insight into Einstein's relationships with family and friends, and Isaacson seems to quote from them liberally as he provides a straightforward chronology. That chronology is much more detailed from the time of his birth in 1879 through the completion of the general theory of relativity in 1915. It picks up speed considerably after that, especially from the time of his immigration to the United States in the early 1930s, to the point where only scattered events from his final decade of life are mentioned.

    But even if the book doesn't detail all the chronology, Isaacson does a wonderful job, as he did with Benjamin Franklin: An American Life, in chronicling the key intellectual developments, struggles, and triumphs of his subject. This is particularly evident in the chapters on the miracle year and the march leading up to the paper on general relativity. In the latter case what we learn in the book, and what I found especially fascinating, was that Einstein simultaneously pursued parallel physical and mathematical strategies in his quest to generalize relativity. The physical approach had to comport with Newton's laws and Einstein's own intuition about the physical world (on which he heavily relied in his earlier work) while the mathematical strategy was based on work previously done by Bernhard Riemann among others. As the story is told in the book, the mathematical approach was the winner, much to Einstein's surprise since his physical intuition had served him so well in his earlier successes and he sometimes derided a reliance on pure mathematics. The physical approach ultimately cost him several years as he attempted to refine what is known as his Entwurf theory (German for "outline"). This interesting detail has an appeal since it highlights the effectiveness of taking multiple approaches to solving a problem. The trick of course is in being able to work both sides simultaneously, and this episode shows how even a genius had some difficulty in doing so.

    As a side note to this narrative, Isaacson chronicles what turned out to be a race with mathematician David Hilbert in the fall of 1915 to publish the final equations after the two had corresponded earlier in the summer. In fact Hilbert published a version of the final equations on November 20, while Einstein didn't deliver the fourth in a series of lectures, where he finalized his equations, until November 25. Although there has been some controversy over the priority of general relativity as a result, it was discovered about a decade ago that Hilbert changed his paper in December based on Einstein's version. In any case Hilbert always gave priority to Einstein.

    In addition to the key episodes and papers, the book paints a portrait of Einstein's beliefs about the physical world and how those subtly changed over time. From his early career Einstein was heavily influenced by his reading of Ernst Mach, who insisted that concepts only had meaning if one could create an operational definition of them and who derided Newton's views of "absolute space" and "absolute time" as a "monstrosity". This skepticism of received wisdom and adherence to what is observable helped Einstein break through the earlier framework when his contemporaries, who had all the same data, simply could not. But while Einstein would later abandon Mach after his general theory and instead rely more on his intuition about reality, Baruch Spinoza was also a huge influence and his belief in a deterministic universe would stay with Einstein until the end. When confronted with the implications of the revolution he started in quantum physics and its reliance on chance, Einstein famously couldn't believe that at the heart of all things determinism didn't reign, despite the growing evidence that he was wrong. As a result he spent the remaining 30 years of his life in a quest to find a unified field theory that would marry electricity and magnetism with gravity and quantum mechanics, a quest that culminated with twelve pages of equations he scribbled in his last days in the hospital.

    While Isaacson does a nice job of describing Einstein's sparring with the "young turks" of quantum physics including Niels Bohr and Werner Heisenberg, a second book titled Uncertainty: Einstein, Heisenberg, Bohr, and the Struggle for the Soul of Science by David Lindley documents that struggle in more detail. In particular Lindley includes in-depth portraits of Heisenberg, Bohr, and Erwin Schrödinger among others, as well as the debates that raged in the famous Solvay conferences that occurred primarily between the first and second world wars.

    As the years went on, of course, Einstein became known almost as much for his political views as for his scientific accomplishments, which were increasingly seen as ancient history. His revulsion at the strict militaristic school he briefly attended as a youth instilled in him a deep disdain both for conformity and the military. It wasn't surprising that he embraced both socialism and pacifism through a refusal of individuals to bear arms, although the latter was markedly softened in the face of German militarism and anti-Semitism in the 1930s. Later he would advocate a world government with its own military, a position reinforced by his personal lack of nationalistic feeling. He never felt at home in the country of his birth, although he lived in Berlin for many years primarily because his second wife Elsa was from there, as he preferred first Switzerland and later America where he became a citizen in June of 1940. At the same time events in Europe increased his awareness of and solidarity with his Jewish heritage, and throughout the book his support of and sometimes entanglement with the Zionist movement (including his dealings with Hebrew University and being offered the presidency of Israel much to David Ben-Gurion's displeasure) are discussed.

    Isaacson also spends a good deal of the last few chapters recounting Einstein's peripheral role in the development of the atomic bomb after having helped draft the letter to President Roosevelt outlining the possibility. More entertainingly he describes the somewhat slapstick efforts by the FBI to obtain information and build a dossier on him while completely missing the one true Soviet spy he consorted with and with whom he had an affair after the death of his second wife Elsa. Finally, he also talks a good bit about Einstein's firm resistance to McCarthyism. For Isaacson, Einstein valued political systems that cultivated personal freedom and especially freedom of thought. In America he saw the First Amendment as primary and continued to speak out on a variety of issues despite the strictures that the Red Scare induced.

    In the end Isaacson concludes that curiosity was the primary driver for Einstein. And when combined with his imagination and ability to visualize the physical reality behind his equations, it created a genius. Although I haven't commented much on the personal portrait the book paints, as with Franklin, Isaacson seems to show multiple sides of his personality fairly. Both his warmth towards acquaintances and strangers and his self-induced emotional distance, accompanying his often uncaring attitude towards his wives (Isaacson touches on but doesn't dwell on several of Einstein's mistresses through the years) and children, are on display. In that regard, which incidentally is a weakness of Doris Kearns Goodwin's Team of Rivals: The Political Genius of Abraham Lincoln, the book humanizes a figure that for me, as for many others I'm sure, had always been little more than the caricature of the absent-minded professor.

    Friday, August 17, 2007

    Umpires and QuesTec

    Several readers have been asking about the recent study that was reported to show umpire bias by race known as the Hamermesh study. Phil Birnbaum and Mitchel Lichtman have been doing great work in that regard already so I have little to add other than providing a few links for those interested:

  • The original study


  • The Time Magazine piece


  • Phil's first take - he questions the authors' findings of statistical significance by examining the core table (table 2) from the original study


  • Phil's follow-up - where he concludes that perhaps, and at most, 1 in 700 pitches is biased


  • And even more by Phil - here he uses several different tests of significance and it appears there is no racial bias


  • MGL's own study - here he uses a much simpler approach and comes to the tentative conclusion that there are not racial differences that are statistically significant. Update on 8/19: MGL posted some updates to his study here and here and comes to the opposite conclusion. He also notes there is a good discussion of the study at The Sports Economist.


    One of the side topics that has arisen here is the effect of QuesTec on called strikes. The authors of the Hamermesh study found that for both white and minority pitchers, in non-QuesTec parks pitchers received a higher percentage of strikes when the race of the pitcher and umpire matched than they did in QuesTec parks. White pitchers did not experience this difference when the umpire was non-white although minority pitchers still did.

    This provides an opportunity to look at the PITCHf/x data from this season in QuesTec and non-QuesTec parks to get a more granular feel for what the overall difference might be. While we have data for only 9 of the 11 parks where QuesTec is installed, we still end up with almost 35,000 pitches in QuesTec parks and 63,000 in non-QuesTec parks to analyze. When we do so by comparing the location of the pitch to the strike zone (defined by the PITCHf/x operator for each plate appearance) and give the umpires a 1 inch buffer zone to correspond with the limits of the system, we find the following:


    Park Pitches CS% CB% Agree%
    QuesTec 34427 .8252 .9433 .8790
    Non-QuesTec 62862 .8052 .9488 .8772


    By way of explanation CS% is the called strike percentage defined as the percentage of actual pitches in the strike zone that were actually called strikes. CB% is the called ball percentage defined as the percentage of pitches that were actually out of the strike zone that were called balls and Agree% is the overall percentage of pitches on which PITCHf/x (given the buffer zone) and the umpire agreed.
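
    As a rough illustration of those three definitions, here is a minimal Python sketch. The `CalledPitch` record and its field names are hypothetical, not part of the PITCHf/x data, and the sketch assumes the 1-inch buffer has already been applied when deciding whether a pitch counts as in the zone. The sample pitches are invented purely to exercise the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class CalledPitch:
    # Hypothetical record: one taken pitch that the umpire had to call.
    in_zone: bool        # location judged inside the zone (buffer already applied)
    called_strike: bool  # True if the umpire called a strike, False for a ball


def agreement_rates(pitches):
    """Return (CS%, CB%, Agree%) as fractions for a list of CalledPitch."""
    in_zone = [p for p in pitches if p.in_zone]
    out_zone = [p for p in pitches if not p.in_zone]
    # CS%: share of in-zone pitches that were called strikes.
    cs = sum(p.called_strike for p in in_zone) / len(in_zone)
    # CB%: share of out-of-zone pitches that were called balls.
    cb = sum(not p.called_strike for p in out_zone) / len(out_zone)
    # Agree%: share of all pitches where the call matched the location.
    agree = sum(p.in_zone == p.called_strike for p in pitches) / len(pitches)
    return cs, cb, agree


# Invented sample: 4 in-zone pitches (3 called strikes),
# 6 out-of-zone pitches (5 called balls).
sample = (
    [CalledPitch(True, True)] * 3
    + [CalledPitch(True, False)]
    + [CalledPitch(False, False)] * 5
    + [CalledPitch(False, True)]
)
cs_pct, cb_pct, agree_pct = agreement_rates(sample)
```

    Note that Agree% is simply the pitch-weighted average of CS% and CB%, which is why it can stay nearly flat even when the two component rates move in opposite directions.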

    By simply examining the confidence intervals it appears that umpires do indeed call more pitches in the zone strikes at QuesTec parks than at non-QuesTec parks. The difference is statistically significant at .05 and amounts to 1 pitch in 50. However, at QuesTec parks umpires don't do as well at identifying balls and end up calling more of them strikes to the tune of 1 in 180 pitches. This result too is statistically significant at .05, indicating that perhaps the biggest effect of QuesTec is simply to call more strikes.

    Because the factors are working in opposite directions when we add them up the Agree% fails to meet the .05 test. Overall then, if we attribute the entire difference to whether the umpire is in a QuesTec park or not we're talking about a difference of 1 pitch in 550. Of course there may be other factors at work here including the calibration of the system at particular parks that may play a role which I haven't examined.
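
    For readers who want to check the Agree% result, here is a small Python sketch using only the figures in the table above. It substitutes a pooled two-proportion z-test for the confidence-interval comparison described in the text, so treat it as a stand-in rather than the exact procedure used. The same test can't be run on CS% or CB% from the table alone, because the number of pitches inside and outside the zone isn't listed.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Agree% and pitch counts from the table: QuesTec vs. non-QuesTec parks.
z, p_value = two_proportion_z(0.8790, 34427, 0.8772, 62862)

# The "1 pitch in N" figure is the reciprocal of the rate difference.
pitches_per_difference = 1 / (0.8790 - 0.8772)
```

    Here z comes out well under the 1.96 needed for significance at .05, and the reciprocal lands in the neighborhood of 550 pitches, both in line with the text.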

    Thursday, August 16, 2007

    756

    I wasn't old enough nor interested enough to appreciate Hank Aaron's blast in April of 1974 - a fact for which I have a partial excuse intruding as it did on my sixth birthday (although I do faintly remember the television on and comments made). Later, however, I had always imagined that when 755 was finally reached and breached by someone it would be among the highlights of my baseball fandom.

    As for many others that simply wasn't the case and the event passed with little joy and more than a tinge of disappointment. This was not unlike, I suppose, the feelings experienced by many who had waited a lifetime for the 1986 approach of Halley's comet only to be disappointed. There were no external or internal cheers from my desk as I watched the replays of 755 and 756 on my laptop.

    In understanding why the disappointment lingers, it comes down to a preponderance of the evidence suggesting that Bonds didn't accomplish the feat legitimately in the spirit of fair competition. Who can say what would have happened if he had chosen an alternate path? Perhaps he would have broken the record anyway since he certainly is the greatest offensive player of his generation with or without help. But therein lies the problem. He didn't let us find out and so now the game's greatest number, at least for a while (whatever that turns out to be), and hence its greatest story will forever be accompanied by a shadow.

    The litany of popular defenses, including no positive PED test (until recently when Bonds tested positive for amphetamines), the actions of Major League Baseball and the players' union during the entire era, the lack of rock solid performance data quantifying the impact of PEDs, and the fact that many of his peers engaged in the same behavior, are indeed mitigating factors but ones that don't remove the ultimate responsibility of all competitors to play the game "the right way".

    Believe me, I would rather have cheered.

    Wednesday, August 15, 2007

    A Sabermetric Cambrian Explosion

    Several folks have alerted me to this article by Nate DiMeo on Slate.com that talks a bit about PITCHf/x and its promise. What I like about it is that it does a nice job of showing the range of analysis that has already been done (and I like the quote he used as well from this column) and linking to some of those articles.

    I've written 10 articles on the subject with a couple more already in the works which include:

  • Schrodinger's Bat: Putting the Pedal to the Metal - August 16


  • Schrodinger's Bat: Calling the Balls and Strikes - July 26


  • Schrodinger's Bat: Searching for the Gyroball - July 5


  • Schrodinger's Bat: Playing Favorites - June 28


  • Schrodinger's Bat: Gameday Meets the Knuckleball - June 21


  • Schrodinger's Bat: The Science and Art of Building a Better Pitcher Profile - June 14


  • Schrodinger's Bat: Gameday Triple Play - June 7


  • Schrodinger's Bat: Physics on Display - May 31


  • Schrodinger's Bat: Batter Versus Pitcher, Gameday Style - May 4


  • Schrodinger's Bat: Phil Hughes, Pitch by Pitch - May 10


  • Schrodinger's Bat: The Information Revolution - October 26, 2006


    In addition Dr. Nathan has created a wonderful page that not only has the most complete data dictionary for the PITCHf/x data but also includes a paper he wrote detailing his own analysis of pitch classification using derived parameters of axis of rotation and spin using a sophisticated model.

    Although it may be difficult to detect, one of my goals in researching and writing about the PITCHf/x data this season has been to explore as many avenues of analysis as possible in this early stage when the system is still being tweaked and the data is incomplete. By doing so we can begin to see which of those ideas for analysis are useful and should be developed further as well as to help spur new ideas by other researchers. This is analogous to one of my favorite intellectual ideas, that of the inverted cone of diversity that I also used to help illuminate the evolving way in which players have been used throughout baseball history, and that Stephen Jay Gould was also fond of. From that earlier column the idea is briefly this:

    In 1909 Charles Doolittle Walcott discovered a treasure trove of wonderfully unique fossils preserved in a layer of shale near the town of Field in British Columbia, specimens that would become known simply as the Burgess Shale. While Walcott placed his specimens in familiar phyla that were known to exist during the period (Middle Cambrian, 505 million years ago), it was a reinvestigation by Harry Blackmore Whittington, Derek Briggs, and Simon Conway Morris of the University of Cambridge in the 1980s that upended that traditional interpretation of the fossils' place in the evolution of life. By inverting the familiar iconography of the cone of increasing diversity in life forms, Whittington, Briggs, and Morris reinterpreted the Burgess Shale as replete with creatures in phyla that are now extinct. In other words, rather than life becoming increasingly more diverse in terms of its basic body plans over successive geologic periods, the Burgess Shale records an initial flowering of experimentation in structures just after the dawn of life before a later decimation or winnowing into the few surviving phyla we see today. Stephen Jay Gould devoted an entire book to this theme as an illustration akin to his theory of punctuated equilibrium in his 1989 book Wonderful Life: The Burgess Shale and the Nature of History.

    So with the introduction of PITCHf/x we're in our own kind of sabermetric Cambrian explosion where ideas are flowering and we're looking for those that survive the selection pressures that prevail.

    Where the analogy breaks down, however, is that unlike body plans, which are almost fully constrained by what went before, ideas never are. So while many of the paths that we'll subsequently travel will come out in the near future, there will always be some, in decreasing number, that are novel and could therefore fundamentally change the way we look at this data.


    Updated 8/16/2007: Added new article on pitch speeds with runners on base.

    Friday, August 10, 2007

    Ankiel and Bressler

    Back in September of 2005 I wrote a column titled "Rube Bressler Redux?" for The Hardball Times that chronicled the first season of Rick Ankiel's transition from pitcher to hitter. At that time Ankiel had completed a 2005 season that I summarized this way:

    This was the second [his demotion to low-A Quad Cities in late May] stop for Ankiel as he started the year at Double-A Springfield, but did not fare well in his first 60 at-bats, getting off to a 1-for-20 start and hitting around .160 before getting sent down. With the Swing he continued to improve and wound up hitting .270/.368/.514 with 10 doubles and 11 homeruns in 212 plate appearances. His strikeouts were a bit high (37), though he showed a little patience at the plate, collecting 27 walks.

    That good showing in the Midwest League earned him a trip back to Springfield on August 3, and this time it appears he took advantage of it. In the remainder of the season he would hit .300 with 10 homeruns and drive in 28 runs in the 28 games he played, including a 3-for-4 performance with two homeruns and three RBI on the final day of the season. His late season surge even prompted some talk of a September call-up.

    His combined line at Springfield was .243/.295/.515 while overall for the season he hit .259 with 17 doubles, 21 homeruns, and 75 RBIs in 321 at-bats and 85 games. Although he still has a long way to go, I'm sure he and probably the Cardinals viewed this season as a success.

    Late in June Ankiel was asked about making the transition from pitching to the outfield.

    "Not very many people have been successful at it," he replied. "To conquer that quest would be very self-fulfilling."

    Well, after a 2006 season in which he was shelved all year with patellar tendonitis, it appeared that perhaps his window was slipping away. This season, however, he came back at the AAA level and belted 32 homeruns in just over 400 plate appearances before being called up on Thursday and hitting a three-run homer in last night's 5-0 Cardinals victory.

    When I wrote the original article I was interested in how many players had successfully made the transition from full-time pitcher to full-time position player in the history of baseball. It turns out that only five others have ever totaled more than 50 games pitched and 50 games played at other positions in the major leagues. Click on the link above to read about their stories, including that of Rube Bressler for whom the article is titled, but suffice it to say that Ankiel appears as if he'll become the sixth. Whether he goes on to be as successful as any of the others remains to be seen and given his age and plate discipline still seems somewhat remote.

    The interesting thing is that all five of the other players completed their transition before 1940. The final section of the Bressler article details an explanation of just why it is more difficult today to make such transitions than it was in the days of Rube Bressler. Simply put, the argument, first detailed by Stephen J. Gould in an essay discussing the disappearance of the .400 hitter, is that these transitions essentially ended after the war because of the increasing level of play that comes closer to the "right wall" of human ability, coupled with the stabilization of the game itself. In other words, over time baseball players, like other athletes, including sprinters and swimmers, have become better and as the level of play has increased, it has had the side effect of decreasing the variation among players. For players like Bressler and company there was therefore more opportunity to make the transition because good athletes of their ilk could more easily excel beyond the more numerous lesser athletes that populated baseball in the early part of the century.

    The evidence for an increasing level of play was the topic of a column I wrote earlier this year on Baseball Prospectus titled "The Myth of the Golden Age" and in particular one line of evidence fits nicely with Ankiel and Bressler. As described in that column:

    Pitchers are increasingly selected from the amateur ranks based on their extreme right-hand-tail-of-the-distribution excellence in pitching. While there is certainly some athletic and experiential crossover that allows them to hit better than the general population (as evidenced by the best players at early ages being both the best hitters and pitchers), their hitting skill is not selected for in the evolutionary sense and so should remain relatively constant over time. In other words, pitchers simply don't hit as well in the modern game, not because they are not just as skilled (or slightly more so) with the bat as their predecessors, but because the selected skills of all players have increased over time.

    What this all boils down to is that by measuring the relative success of pitchers at the plate we can at the same time, at least indirectly, measure the increasing level of play. The following graph documents that relation using OPS normalized by park and breaks it up by league.



    What I find fascinating about this graph is that it not only shows the increasing difficulty that pitchers have when competing against their peers from the batter's box, it simultaneously gives some information on the relative level of play amongst competing leagues. You'll notice that the American Association and the Union Association of the 1880s and 1890s record higher relative values, ostensibly because the leagues were not as difficult. The same applies to the Federal League (1914-1915) and the American League relative to the National League from 1901-1920 and again after integration through the early 1970s. Obviously, after the introduction of the designated hitter in 1973 the league differences can't be measured and so the graph doesn't reflect the subsequent time period. However, if one were to plot those just in the NL, the decline would continue to the point where today the relative production is well under .500.

    All of this has conspired to make Rick Ankiel's story even more compelling and so I for one am going to enjoy it while it lasts.

    Friday, August 03, 2007

    End of the Week

    A busy week after being on vacation and so just a couple links:

  • Chat Transcript - here's the chat transcript from today. Thanks to all those who asked questions and sorry for cutting things a little short.


  • A Physicist Speaks - my column this week was an interview with Dr. Alan Nathan on the intersection of baseball and physics. I followed it up with an addendum on BP's Unfiltered blog.


  • Rocky Mountain SABR - our local SABR chapter has a new website and I made a small contribution following up on the bunting exploits of Willy Taveras. Hopefully it will become a community site for baseball fans on the Front Range and beyond.


  • More PITCHf/x - a wonderful summary of all the articles written on the system by Mike Fast.

    Wednesday, August 01, 2007

    The Young and the Motionless

    Just wanted to make a quick comment on Joe Sheehan's column on Sunday regarding the Nationals and their signing of Dmitri Young to a two-year $10M contract extension. Joe's analysis is spot-on and this marks the second consecutive year where the Nationals chose to hold on to a veteran having a career year instead of turning him over for prospects - even middling ones - or mediocre major leaguers. This time, however, they made the mistake of continuing to pay the veteran who also happens to play a position that will be occupied with the return of Nick Johnson.

    In any case Joe mentioned Young's negative contributions in areas outside of his direct offense and so I thought I'd quickly document his baserunning "contributions" over the years.


    Year Opp EqGAR Opp EqSBR Opp EqAAR Opp EqHAR Total
    2007 18 -1.11 0 0.00 20 1.28 30 -0.45 -0.29
    2006 11 0.76 9 -0.54 2 -0.37 8 -1.15 -1.30
    2005 20 -0.89 1 0.09 15 0.18 25 0.22 -0.40
    2004 18 -0.64 1 -0.44 20 0.67 36 -2.94 -3.34
    2003 24 -1.09 2 -0.58 31 0.97 38 -2.39 -3.08
    2002 11 -0.26 1 0.15 8 0.47 17 0.07 0.43
    2001 37 -0.25 13 -2.07 26 0.58 34 -2.75 -4.49
    2000 22 -0.10 4 -2.10 19 -0.29 31 -3.26 -5.76

    Total 161 -3.58 51 -4.21 121 2.22 219 -12.66 -18.23


    Overall, he's at a little more than -18 runs contributed since 2000 which puts him towards the bottom and among the "leaders" in 2000, 2001, 2003, and 2004.
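
    The arithmetic behind that figure can be sketched in a few lines of Python. The numbers are copied from the table above. The reading that each year's Total is the sum of its four run components is my assumption from the layout, and the small tolerance allows for the components having been rounded to two decimals.

```python
# Year: (EqGAR, EqSBR, EqAAR, EqHAR, Total), copied from the table above.
young = {
    2007: (-1.11, 0.00, 1.28, -0.45, -0.29),
    2006: (0.76, -0.54, -0.37, -1.15, -1.30),
    2005: (-0.89, 0.09, 0.18, 0.22, -0.40),
    2004: (-0.64, -0.44, 0.67, -2.94, -3.34),
    2003: (-1.09, -0.58, 0.97, -2.39, -3.08),
    2002: (-0.26, 0.15, 0.47, 0.07, 0.43),
    2001: (-0.25, -2.07, 0.58, -2.75, -4.49),
    2000: (-0.10, -2.10, -0.29, -3.26, -5.76),
}

# Career figure: the sum of the yearly Total column.
career_total = sum(row[4] for row in young.values())

# Assumed relationship: each yearly Total is the sum of its four components,
# up to rounding of the published two-decimal values.
rows_consistent = all(
    abs(sum(row[:4]) - row[4]) < 0.02 for row in young.values()
)

# Seasons ordered from most to least costly on the bases.
worst_first = sorted(young, key=lambda year: young[year][4])
```

    Sorting by the Total column puts 2000, 2001, 2004, and 2003 at the top of the list, the same four seasons singled out above.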

    Chat This Friday

    I'll be chatting on the Baseball Prospectus site on Friday at 12:30 PM eastern, 10:30 AM mountain time. You can submit your questions ahead of time. Your thoughts on bunting, baserunning, PITCHf/x, the trade deadline and anything else you want to talk about are always welcome.