Over at “Statistical Modelling,” Sam discusses “Sabermetricians vs. Gut-metricians:”
There’s a little debate going on in baseball right now about whether decisions should be made using statistics (a sabermetrician is a person who studies baseball statistics) or instincts. Two books are widely considered illustrative of the two sides of the debate. Moneyball, by Michael Lewis, is about the Oakland A’s and their general manager Billy Beane. Beane, with the second-lowest payroll in baseball in 2002, set out to put together an affordable team of undervalued players, using a lot of scouting and statistics. Three nights in August, by Buzz Bissinger, is about St. Louis Cardinals’ manager Tony La Russa, and is seen by some as a counter to Moneyball, with La Russa relying much more on guts when making decisions.
There are two problems with this. One is that the distinction just isn’t that sharp: Billy Beane also makes some gut-based decisions, and Tony La Russa looks at statistics.
At this point, it’s not clear to me whether Sam has read the book (which I enjoyed thoroughly). One of Lewis’ main points in Moneyball is that baseball statistics measure a set of things which were of interest when people first started recording stats. They measured things that were “obviously” important, like how many times a player hit the ball, and how many times he made mistakes.
There is great danger in measuring things because they’re easy to measure, and not validating that they are either causative of, or correlated with, what you actually want to measure.
Sabermetrics is an attempt to find things which correlate with teams winning games, and to measure those things.
For example, “saves” by pitchers are highly overrated. Beane noticed this, and used it to improve the sale value of pitchers he didn’t want. Batting average is overrated, because there are two ways to get on base: To get a hit, and to walk. A player who walks often is worth more than one who walks back to the dugout. Errors are a completely subjective measure of “Did that player make a play that I think he could have made?” And they don’t correlate with wins or losses.
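To make the batting average point concrete, here is a small sketch comparing batting average with on-base percentage, which also credits walks. The two stat lines are made up for illustration and don't belong to any real player; the function and variable names are mine, not anything from the book.

```python
def batting_average(hits, at_bats):
    """Hits per official at-bat. Walks are ignored entirely."""
    return hits / at_bats if at_bats else 0.0


def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """Times on base (hit, walk, hit by pitch) per qualifying plate appearance."""
    chances = at_bats + walks + hit_by_pitch + sac_flies
    return (hits + walks + hit_by_pitch) / chances if chances else 0.0


# Hypothetical stat lines, invented for this example.
free_swinger = dict(hits=150, walks=20, hit_by_pitch=0, at_bats=500, sac_flies=0)
patient_hitter = dict(hits=135, walks=90, hit_by_pitch=0, at_bats=500, sac_flies=0)

for name, line in [("free swinger", free_swinger), ("patient hitter", patient_hitter)]:
    avg = batting_average(line["hits"], line["at_bats"])
    obp = on_base_percentage(**line)
    print(f"{name}: AVG {avg:.3f}, OBP {obp:.3f}")
```

The free swinger has the better batting average, but the patient hitter reaches base more often, and batting average alone never shows it.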
Another important point is that, in contrast to Sam’s assertion, there is time to regress baseball games. You can’t do it in real time, but there are very, very few unique situations in baseball. You can run all the stats you want on a compute farm, and then see if the numbers change over the course of a season. “Go for the steal?” Well, no. Steals produce too many outs, and not enough runs. It may be exciting, but it won’t win you the game.
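The steal question comes down to simple arithmetic once you have numbers for what a successful steal gains and what getting caught costs. The sketch below computes the break-even success rate; the two run values are placeholders I picked to show the calculation, not measured figures, and the function name is my own.

```python
def break_even_steal_rate(runs_gained_if_safe, runs_lost_if_caught):
    """Success rate at which attempting a steal has zero expected value.

    Solves p * gain - (1 - p) * loss = 0 for p.
    """
    return runs_lost_if_caught / (runs_gained_if_safe + runs_lost_if_caught)


# Placeholder values (assumptions, not real run-expectancy data):
# a successful steal adds 0.25 expected runs, a caught stealing costs 0.5.
rate = break_even_steal_rate(runs_gained_if_safe=0.25, runs_lost_if_caught=0.5)
print(f"A runner must succeed more than {rate:.0%} of the time to help the team")
```

When an out costs more than the extra base gains, the break-even rate rises above one half, which is the sense in which steals can produce too many outs for the runs they add.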
(Thanks to Nat Gertler for reverse engineering Amazon’s image engine, and for sharing his knowledge in “Abusing Amazon Images.”)