
Going to War with WAR

James decides, after 20 years, to speak up about it

.

James has an article up today - in front of the paywall, I'm pretty sure - in which he uses Judge vs Altuve to "ask the questions of a child" about sabermetrics.  You know what the truth is?  I understand micro sabermetrics a lot better than I understand the macro questions.  I might be able to tell you whether a -1 inch horizontal movement is better than a -3 inch movement, but I probably can't tell you how many more or fewer games you'd win because Ryon Healy played 3B instead of 1B.  In other words, I don't consider any sabe observation too basic to discuss.

In this morning's piece, James reduces all of sabermetrics to 2 'moral' issues:

(1) No stat means anything outside of its connection to W's and L's, and

(2) You have to "normalize" everything in its context, like HR's in Fenway vs Safeco.

Do those two things and you've forwarded your understanding of baseball.

...

For example, on Twitter today he asks, addressing WPA and WAR, 

.

You would not use situational stats to measure FUTURE value. But suppose as hypothetical extreme a 21-year-old hit .700 with a 4.000 OPS in 100 Games at AAA. Assessing his FUTURE, that would make him the most prized property in baseball. But would you vote for him for MVP?

.

In his more developed BJOL piece he says, regarding Altuve and Judge,

.

Baseball-Reference WAR shows the little guy at 8.3, and the big guy at 8.1.   But in reality, they are nowhere near that close.   I am not saying that WAR is a bad statistic or a useless statistic, but it is not a perfect statistic, and in this particular case it is just dead wrong.   It is dead wrong because the creators of that statistic have severed the connection between performance statistics and wins, thus undermining their analysis.

.

James moves on to an interesting concept: the difference between the "general" (or "normal") relationship between runs and wins and the "specific" relationship for one player on one team:

.

Look, there is a general relationship between runs and wins, a normal relationship, and there is a specific relationship, based on this specific player and this specific team.   If you evaluate Altuve and Judge by the general and normal relationship of runs to wins, then it appears that Judge is almost even with Altuve.  But if you evaluate them by the specific relationship of Altuve’s runs to the Astros wins and Judge’s runs to the Yankees wins, then Altuve moves up and Judge moves down, and a significant gap opens up between—large enough, in fact, that Judge drops out of the #2 spot, dropping behind Eric Hosmer of Kansas City.

.

He goes on to point out that the Yankee$ "pythag'ed" 102 wins but only won 91.  He gets passionate here:  "IT IS NOT RIGHT TO GIVE THE YANKEES PLAYERS CREDIT FOR WINNING 102 GAMES WHEN IN FACT THEY ONLY WON 91.  This is not a choice.  It is not an option.  It is an error."

This has always been the difference between James and, say, Dave Cameron.  "Pure" algebra guys think in terms of crediting players for theoretical skills they showed.  James (and I) think ALSO in terms of crediting players for what occurred on the field.  Part of the reason for this is --- > many "invisible" things happen between the theoretical and the real.
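For anybody who hasn't met the "pythag" number before, here is a minimal sketch of the classic Pythagorean expectation, which estimates wins from runs scored and runs allowed. The function name and the run totals below are hypothetical, picked only to show the arithmetic; they are not any real team's numbers. The exponent of 2 is the original version of the formula, and some sites use a slightly different exponent.

```python
def pythag_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Expected wins from the Pythagorean formula:
    win% = RS^x / (RS^x + RA^x), with x = 2 in the classic version."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)


# ASSUMPTION: hypothetical run totals, for illustration only.
expected = pythag_wins(800, 700)
actual = 85  # hypothetical actual win total

print(f"pythag wins: {expected:.1f}")
print(f"gap between theoretical and real: {expected - actual:.1f}")
```

The gap on the last line is the thing James is shouting about: the formula says what the runs "should" have bought, and the standings say what they did buy.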

.

And so he continues,

.

When you express Judge’s RUNS. . .his run contributions. . . when you express his runs as a number of wins, you have to adjust for the fact that there are only 91 wins there, when there should be 102.  (The Astros should have won 101 games and did win 101 games, so that’s not an issue with Altuve.)  But back to the Yankees, one way to do that is to say that the Yankee win contributions, rather than being allowed to add up to 102, must add up to 91.  That’s a good way to do it, and, of course, if you do that, it reduces Judge’s win contribution by 11%.  Using WAR, it reduces his win contribution by MORE THAN 11%, because the replacement level remains the same while his win contribution diminishes, so the wins ABOVE THE REPLACEMENT LEVEL are decreased by more like 16%.   Judge drops from 8.1 WAR to 6.8.
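To see how 11% off the top turns into 16% off WAR, here is a back-of-the-envelope sketch. The replacement-level share of 4.0 wins is NOT a number James gives; it is a hypothetical value chosen so that the arithmetic lands near his 6.8. The function name is made up for illustration as well.

```python
def rescale_war(war, replacement_wins, actual_wins, expected_wins):
    """Shrink a player's total win contribution so that team credits sum to
    actual wins, while holding the replacement-level share fixed."""
    total_wins = war + replacement_wins               # wins above zero, not above replacement
    scaled_total = total_wins * actual_wins / expected_wins
    return scaled_total - replacement_wins            # back to wins ABOVE replacement


judge_war = 8.1           # Baseball-Reference figure quoted in the post
replacement_share = 4.0   # ASSUMPTION: hypothetical, not from James
new_war = rescale_war(judge_war, replacement_share, actual_wins=91, expected_wins=102)
pct_drop = 100 * (judge_war - new_war) / judge_war

print(f"adjusted WAR: {new_war:.1f}")
print(f"drop in WAR:  {pct_drop:.0f}%")
```

Under that assumed replacement share, the total contribution shrinks by about 11% (91/102), but because the 4.0 replacement wins come off afterward, the part above replacement shrinks by roughly 16%, which matches the 8.1 to 6.8 drop in the quote.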

.

James also considers it important that Judge hit poorly in high-leverage situations, while Altuve hit well.  Look, if you want to ignore that because you think it puts you in a better PREDICTIVE position, great.  Often - not always - it WILL make your predictions more accurate.  But let's not pretend the Astros didn't win the World Series this year, what?

But the fun starts here:

.

I have been silent on this issue for more than 20 years, and let me explain why.  In the 1990s I developed Win Shares, while younger analysts developed WAR.   At that time it was my policy not to argue with younger analysts.  I was much more well-known, at that time, than they were, and it’s a one-way street.   When you are at the top of a profession, you don’t speak ill of those who are coming along behind you.   It’s petty, and it’s just not done.   Some of those people did take pot shots at me and some didn’t, but. . .well, it’s a one-way street.  I’ve got mine; I’m not pulling up the ladder behind me.

But that was a long time ago.  We’re not there anymore. WAR is not an upstart statistic; it is the dominant statistic.   We can debate its merits on an equal footing.  

The logic for applying the normal and usual relationship is that deviations from the normal and usual relationship should be attributed to luck.  There is no such thing as an "ability" to hit better when the game is on the line, goes the argument; it is just luck.   It’s not a real ability.   

But. . . I have held my peace on this for 20-some years. . .that argument is just dead wrong.   There are five reasons why it is wrong.

.

Perhaps in Part II we'll look at those 5 things.  But for the moment:  let Dr. Detecto cast his vote, this day, with James:  that theoretical skills are not more important than outcomes on the field.

.

JAMES' 4 1/2 REASONS THAT WAR ASSUMPTIONS ARE DEAD WRONG

1.  We do not "know" that there is no such thing as clutch hitting.  We haven't yet measured it, true.  (False:  provide such a player, and they'll tell you it's a fluke - Dr D)  But I (James) was wrong to join this "consensus."  We can't prove that deviations in clutch hitting are 100% due to chance.  We simply assume they are, because we can't prove that they're not.  We should be agnostic about clutch skill.

2.  It doesn't matter whether it's luck or skill.

3.  There is "luck" everywhere, and sabes don't account for most of it.  A guy draws 90 walks instead of 60, because of the different umpires behind the plate.  Are you going to "normalize" out that luck? But what about BIG luck?  A player is in a car accident and it wrecks his season.  Are you going to normalize that luck?  How do you normalize Mitch Haniger's shoulder issues?

4.  The connection between W's and stats is the only reason you do stats analysis.  When you amputate that connection, you're doing 1962 beat writer analysis - saying that Johnny Bench isn't great because he doesn't hit .300.  Sabes will say a 700-run team in 1965 is a better offense than one that scores 700 runs in 1975.  Why?  Win impact and NO other reason.  But!  If a team wins 80 when they pythag'ed 90 ... we'll pretend they won 90.  It makes no sense.

5.  James uses the example of a 1.400 OPS kid in AAA who played only 80, 100 games.  He's the most valuable property in baseball - but where is he on a WAR chart?

...

At which point Dr. D might innocently bring up James Paxton's 4.6 WAR in 136 innings.

.

UND TAKE ZIS MIT YOU, DEPT.

In this video, Jim Callis (BA's #1 guy) opines that the Vieira money doesn't matter much -- because Ohtani won't care about the money much.  Worth a listen.

Congrats to the Champs,

Dr D

