I had a quick trigger finger...saw your post and felt the need to chime in...LOL
What I originally said:
I think you're also saying that the model itself needs to be re-evaluated. Am I correct about that?
I certainly believe that. The specific example of Mike Morse vs. John Jaso is not an isolated incident. We had the same fight over Raul Ibanez vs. Endy Chavez (looking back on that one now...which of those guys do you suppose would have made the Mariners better in 2009/2010?) and Adam Dunn vs. the glove men and several other similar situations in the past. WAR calls Kendrys Morales an average player...does he feel average to you from the stands when he's carrying the offense?
Here's what I believe:
The problem with WAR is not JUST that it doesn't respect the complexity of the problem, but that it has systematic biases against or for certain types of players that defy basic common sense. When a fleet of Endy Chavezes gets deployed, teams go 61-101 and set records for offensive futility and somehow...their WARs all go down that year. The Mariners are not the only team to have this happen.
I think that the positional adjustments, defensive methodologies, and presentations of the results need to be adjusted or rethought entirely from the basis of "what are our underlying assumptions?"
For myself, here's what I think is going on:
The position adjustment is based on the assumption that if you take the non-starters at each position and average their performance, you get an estimate of the minimum production each team can expect to get back at that position if it goes trawling the waiver wire. That assumption is manifestly false. It also assumes that the wire average for defense is dead average and that the variation in the value of the positions is driven entirely by the offense. That assumption is also manifestly false.
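To make the assumption concrete, here's a toy sketch of the averaging logic I'm describing. This is not any site's actual replacement-level formula, and the players and numbers are invented; it just shows the "average the bench and call it the wire" reasoning.

```python
# Toy sketch of the assumption under criticism: average every
# non-starter's production at a position and call that the
# "replacement level" any team can pull off the waiver wire.
# All rows below are hypothetical.
from statistics import mean

# (position, is_starter, runs_created) -- made-up numbers
players = [
    ("CF", True, 85), ("CF", False, 40), ("CF", False, 55),
    ("1B", True, 95), ("1B", False, 60), ("1B", False, 70),
]

def replacement_level(rows, pos):
    """Average production of non-starters at a position --
    the (questionable) estimate of what the wire gives you."""
    bench = [rc for (p, starter, rc) in rows if p == pos and not starter]
    return mean(bench)

print(replacement_level(players, "CF"))
print(replacement_level(players, "1B"))
```

The obvious problem, per the argument above: nothing guarantees the bench average at a position matches what's actually available on the wire, offensively or defensively.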
WAR, in total, assumes that value is linearly additive. I think, though I have not thus far proven it, that value is NOT linear...that there are significant non-linearities that explain a lot of the variability in player performance, especially on defense, where each play is a cooperative effort, and at the extremes of performance, where you start to produce negative value for either your own team or the opposition.
UZR assumes that defense is essentially eight men competing against the ball all by themselves. That assumption is clearly not right.
All value metrics that exist today assume that parks and strength of schedule have minor impacts on performance and that those impacts are multiplicative ratios...parks adjust scoring by X percent, and so on. In fact, those impacts are additive, non-linear, and net cumulative...they compound on each other in unpredictable ways.
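Here's some quick toy arithmetic (all numbers invented) showing why the distinction matters. A ratio model scales everyone by the same percentage; an additive model charges every hitter the same fixed number of runs, which hits low- and high-output players very differently once adjustments start stacking.

```python
# Made-up numbers comparing the two adjustment philosophies.
base_runs = 80.0    # hypothetical raw runs for a player
park_pct = 0.95     # ratio view: park suppresses scoring 5%
sos_pct = 0.97      # ratio view: schedule suppresses it 3%

# Ratio model: adjustments multiply, every player scaled the same way
ratio_adjusted = base_runs / (park_pct * sos_pct)

# Additive model: the park and schedule cost a fixed number of runs,
# regardless of how many runs the player produced to begin with
park_runs = 4.0
sos_runs = 2.5
additive_adjusted = base_runs + park_runs + sos_runs

print(round(ratio_adjusted, 1))   # 86.8
print(additive_adjusted)          # 86.5
```

At 80 runs the two views nearly agree; rerun it with `base_runs = 40.0` and they diverge, which is the compounding problem in miniature.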
And finally, all statistics in baseball are, as you say, backwards-looking and ignorant of contexts that cannot be or at least have not been directly measured. Is it easier for a guy to hit when he feels less pressured to carry a club? You might not find that in a study of all players...but it may be true for some subset of players.
I also believe that value metrics need to be presented the way scientists present measurements: with the uncertainties attached, if you're going to think of yourself as a scientist. I think UZR has a very high uncertainty. All UZR values should be +/- a LOT until you start building a large sample. That may be why Chone Figgins' 4 WAR only happened every once in a while...his type of skills may carry high uncertainty and be heavily context-driven at the same time.
In fact, I'm fairly "certain" that soft-skills WAR heroes like Gutierrez, Figgins and E. Chavez, when collected together on one roster, fail as often as they do at least in part because those WAR values are highly uncertain: you can't bank on a high stoploss, and the failure rate year to year is higher.
When you look at PCA defensive wins - which are more stable than UZR year to year - you see lots and lots of players with a peppering of a few great fielding seasons that could match something Ozzie Smith did...but few players who had a low stoploss for defensive wins the way that Smith did.
For an example of that problem...see Cameron, Mike. I have him as producing, literally, 5.5 defensive wins above margin (3 above average) in 2003 (!!), but his stoploss was closer to 0.5 wins above average (!).
So I think what we have here is a combination of (a) the problem being more complex than WAR makes it and (b) the WAR formula resting on a series of assumptions that clearly are not true all of the time, producing intrinsic biases and errors. Both call for further refinement and a more scientific presentation style.
If local bloggers were more apt to think of WAR as having an error bar...they would be better able to see that some trades that seem horrible to them might work out differently than they expect.
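To put a rough number on that: suppose one player is 3.0 +/- 1.5 WAR and the other is 2.0 +/- 0.5. The gaps and error bars here are invented, and treating a season as a draw from a normal distribution is an assumption, but the point survives the caveats.

```python
# Hedged simulation: how often does the "worse" player (2.0 +/- 0.5)
# out-produce the "better" one (3.0 +/- 1.5) in a single season?
# All numbers and the normality assumption are illustrative only.
import random

random.seed(1)
trials = 100_000
flips = sum(
    random.gauss(2.0, 0.5) > random.gauss(3.0, 1.5)
    for _ in range(trials)
)
print(flips / trials)  # roughly 0.26 -- hardly a sure thing
```

A full win of headline WAR, and the "wrong" side of the trade still wins about a quarter of the time. That's the error bar a blogger never sees.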