OK...that's a good way to simplify the proof of concept. Let me put it to you this way:
Derek Jeter in 2004 had a UZR of -19 defensive runs at short, the second worst at the position...while contributing 46 runs above replacement. The Cameron approach would say Jeter was worth 27 runs above the replacement-level shortstop. That's 2.7 wins. PCA on the other hand...which is a dual-replacement metric, says Jeter was worth 7.6 offensive wins (about 4.5 offensive wins above the .350 replacement level) and 1.6 defensive wins (if you're keeping score, the average shortstop in Jeter's playing time would be worth 2.7 defensive wins, so PCA says Jeter was -11 runs to average, a tad more optimistic)...but even if we took the UZR version as more accurate...that would be 0.8 defensive wins. For a total value of 3.4 + 0.6 or 5.1 WAR (rather than 2.7 WAR).
Let's compare that to a defensive genius like....(drumroll...) ADAM EVERETT.
UZR gave Everett +25 runs/150 games in 2004. Yes...that's a freakin' lot. Unfortunately, Everett is a TERRIBLE hitter. If Everett were a full time hitter, he'd have been worth about 13 runs above the .350 replacement level and 25 runs above average for defense...making Everett worth (um...this is not good guys) 3.8 WAR (!) the Cameron way...whereas PCA would conclude that Everett in full time play was worth something like 1.3 offensive wins and 5.2 defensive wins (6.5 WAR)
OK...now what makes more sense...Jeter being 70% of the player that Everett was in 2004 or 85% of the player that Everett was?
I can hear Doc saying "nay, verily...the right answer is...NEITHER"...but I was granting the .350 replacement level. PCA doesn't use the .350 replacement level because that's not how you define real value. The correct way is at the .250 replacement level or lower for offense.
Watch what happens:
Jeter is worth 8.4 wins accepting UZR's evaluation and PCA's offensive margin
Everett is worth 7.6 wins using the same assumptions.
Now what makes more sense?
This is the rational approach, not only because the answers look more logical, but because the defensive margin is DEFINITELY NOT .500 in the real world...and that's REAL VALUE that we need to consider...the REAL DIFFERENCE between a player and the ACTUAL MARGIN...that's what matters...not what some amalgam replacement player would do. The defensive margin should be higher than the batting margin (because fielding is only 40% of team defense, which means the range of performance will be less than the range of team runs allowed...and team runs allowed is the stat that has the .250 margin)...but they BOTH need to be WAAAYYY lower than is commonly applied to roster decision making.
When you do the math in a way that is consistent with the real value players produce...you get answers that make a world more sense.
Here are SABRMatt's evaluations of Wins Above Replacement for 2004 Center Fielders, compared to those typically found on a site like Hardball Times':
Player | EqG | PA | Off-M | Def-M | Win-M | Off-A | Def-A | Win-A
Carlos Beltran | 162 | 708 | 12.4 | 1.5 | 13.9 | 10.6 | -1.4 | 9.2
Jim Edmonds | 143 | 612 | 11.8 | 2.1 | 13.9 | 10.2 | -0.3 | 9.9
Andruw Jones | 150 | 646 | 4.3 | 6.5 | 10.8 | 2.6 | 3.9 | 6.5
Aaron Rowand | 121 | 534 | 8.3 | 1.9 | 10.2 | 6.9 | -0.2 | 6.7
Mark Kotsay | 137 | 673 | 6.2 | 3.4 | 9.6 | 4.5 | 1.2 | 5.7
Mike Cameron | 134 | 562 | 4.7 | 4.2 | 8.9 | 3.3 | 2.0 | 5.3
Johnny Damon | 140 | 702 | 7.3 | 1.4 | 8.7 | 5.5 | -1.0 | 4.5
Juan Pierre | 162 | 748 | 7.2 | 1.3 | 8.5 | 5.3 | -1.4 | 3.9
Steve Finley | 155 | 706 | 6.1 | 2.0 | 8.1 | 4.3 | -0.6 | 3.7
Vernon Wells | 129 | 590 | 4.7 | 3.0 | 7.7 | 3.2 | 0.8 | 4.0
Randy Winn | 117 | 703 | 5.3 | 2.3 | 7.6 | 3.5 | 0.3 | 3.8
Torii Hunter | 122 | 569 | 3.1 | 4.5 | 7.6 | 1.6 | 2.3 | 3.9
Corey Patterson | 153 | 687 | 4.2 | 2.7 | 6.9 | 2.4 | 0.1 | 2.5
Scott Podsednik | 149 | 712 | 4.0 | 2.5 | 6.5 | 2.2 | -0.1 | 2.1
Rocco Baldelli | 122 | 565 | 4.1 | 2.1 | 6.2 | 2.6 | 0.1 | 2.7
Endy Chavez | 120 | 547 | 3.4 | 2.3 | 5.7 | 2.0 | -0.2 | 1.8
Jay Payton | 120 | 511 | 2.7 | 2.7 | 5.4 | 1.4 | 0.7 | 2.1
Marquis Grissom | 133 | 606 | 2.6 | 1.9 | 4.5 | 1.0 | -0.3 | 0.7
Tike Redman | 140 | 581 | 3.2 | 1.3 | 4.5 | 1.7 | -1.0 | 0.7
Laynce Nix | 91 | 400 | 0.3 | 2.6 | 2.9 | -0.7 | 1.0 | 0.3
MARGIN | ** | ** | 0.250 | 0.350 | ** | 0.357 | 0.500 | **
|
D-O-V readers are familiar with the PCA evaluations that he uses to come up with offensive numbers. The defensive evaluations involve his recommended re-set of what "Replacement Level Player" means with respect to defense.
Matty sez,
Hey Doc…over at Bleeding Blue and Teal I’m up to my usual arguments re: defensive metrics and their use…
See this thread with the good buddies at BBT. Matt buries some fascinating stuff in the comments.
Now, for those just joining us, my "Baseball for Idiots" 8th-grade translations :- ) are not offered because I think anybody's dumb. We don't mean to be condescending. It's that *I myself* like for teachers -- such as Matt -- to speak simply, so that I can follow *quickly*.
I think that most college profs make the subjects wayyyyyyyy harder to understand than they have to be. The TV commercials speak to 6th-grade level not because viewers are dumb, but because that is what is effective.
/birdwalk
Basically…I think you’re right about some of the illogical conclusions that get drawn by [cyber-Seattle's] brand of defensive analysis (E5 Hinske = Adam Dunn? Raul Ibanez = defensively gifted 4th outfielder in part time play?)…but I think you’re wrong about the reason those conclusions get made.
It’s not that saberdweebs are overvaluing defense…it’s that they’re using bad logic in combining offense and defense. The argument goes that a replacement level bat will be, on average, an average fielder. And while this may appear to make sense on the surface…it’s a TERRIBLE way to scale offense and defense together to define the value of a player. You’re going to systematically overrate part time glove men, systematically underrate full time poor fielding sluggers, and that’s why the lists look wonky ...
In other words, Matt thinks that (1) I'm smelling a rat here, because there actually is one ... but that (2) my first guess at where the rat is might be off.
Fair enough. That's usually how science progresses (not that we're making any claims for D-O-V as science). You sense an issue, take a first cut, and then after you get real-world feedback, you refine your guesses.
Matt thinks I'm Ptolemy :- ) sensing the nature of the solar system, but having an Earth and Sun out of place here and there. LOL. He wants to make a Copernican adjustment that makes it all clear. BRAVO!
...........
Could be that I'm Ptolemy here -- I often am, and so is everybody else in baseball.
But I doubt it. The same system that had Bobby Abreu at -25 runs last year had him at -1 runs the year before, and that volatility is WHY I'm not buying the -25. But that's another subject.
The reason this happens…the baseline for offensive production is roughly a .350 winning percentage…but the baseline for defensive performance is .500…that’s illogical…no matter whether the true replacement player is a .500 fielder or not. So you’ve been crusading for a very good reason…but with a poor understanding of the cause. (LOL, gotta love it - Dr D)
I value defense more than [USSM] does. I value defense more than 99% of saberdweebs will. But my list of left fielders ranked by value will not place a 4th outfielder ahead of Raul Ibanez or E5 Hinske ahead of Adam Dunn because I’m not combining two different scales as though they were linearly related.
Matt's suggestion is an interesting one, and maybe it really IS the earth-and-sun swap that will bring all of the saberdudes into alignment. I hope so.
In nice, oversimplified terms, Matt asks: what happens to all of the relative overall values if you add (say) +1 defensive win to *everybody* -- +1 win pro-rated to full-time play, of course? If you do that, don't the overall values make a lot more sense then?
.
We have two basic questions, in response to Matt's theory:
Q1. Was there ever enough justification for assuming a defensive player to start out at "50" -- major-league average? Was this RLP Defense = .500 belief TRUE?
Q2. Whether the belief (that RLP Defense = .500) is True or NOT, what DOES happen if we adopt Matt's "pay structure" and re-grade all the ballplayers by "adding" defensive value to every player? If we "re-calibrate" the defense, as though adding SAT points to every student, what do the relative values look like in this case?
I expect that most sabermetricians will spend their time on Q1. But the real progress will be made if we focus on question 2.
We'll wind up arguing a whole bunch, whether Wlad Balentien actually comes into the major leagues (at a $425k salary) playing "50" defense. And that is a thick, sharp blackberry bush to hack through. (I personally think Matt may be right here: if a player *is* playing quality defense, you're probably not getting it for $425k.)
We'll argue whether [RLP Defense = .500] a lot. But I would be much more interested in simply nailing down what the values look like, in both Conventional W/$ Schematic A, and in Matt's new SABRMatt W/$ Schematic B.
After all, Jack Zduriencik looks at two values for Adam Dunn now: the one if -20 runs is correct, and the one if -7 runs is correct, and then he uses his intuition. There's no reason he can't do the same on a more global scale.
..................
By the way, you can join in the discussion right here, whether you're an expert or not. Do YOU think that it's ACCURATE to assume that a $425,000 player typically gives you ML-quality defense?
Whether it is or not, Matt's system could still be the right way to scale players to one another. But it's an interesting question that anybody can kick around.
Good stuff Matty,
Dr D
Comments
Lovin' it Matt. Gracias!
When you get time, could you do, let's say, the 10 players with the most playing time at any given AL position, valued by schematic A and schematic M? That is what would make your concept accessible, IMHO.
Good stuff Matt.
With a lower replacement level how much would 1 WAR be worth on the FA market by your system?
Any chance we get a PCA offensive/defensive database sometime?
I'll have to use PCA to estimate schematic A...I don't have the full UZR record at my fingertips...I found Jeter and Everett in an old article of mine from 2004 about how sucky Jeter was defensively and how that was compromising his HOF value...LOL
Let's have some fun with center fielders...that's a position that's prone to BIG swings in defensive value...bigger swings even than the average value of the position.
The 29 center fielders with the most playing time in baseball in 2004:
Player EqG PA Off-M Def-M Win-M Off-A Def-A Win-A
Carlos Beltran 162 708 12.4 1.5 13.9 10.6 -1.4 9.2
Jim Edmonds 143 612 11.8 2.1 13.9 10.2 -0.3 9.9
Andruw Jones 150 646 4.3 6.5 10.8 2.6 3.9 6.5
Aaron Rowand 121 534 8.3 1.9 10.2 6.9 -0.2 6.7
Mark Kotsay 137 673 6.2 3.4 9.6 4.5 1.2 5.7
Mike Cameron 134 562 4.7 4.2 8.9 3.3 2.0 5.3
Johnny Damon 140 702 7.3 1.4 8.7 5.5 -1.0 4.5
Juan Pierre 162 748 7.2 1.3 8.5 5.3 -1.4 3.9
Steve Finley 155 706 6.1 2.0 8.1 4.3 -0.6 3.7
Vernon Wells 129 590 4.7 3.0 7.7 3.2 0.8 4.0
Randy Winn 117 703 5.3 2.3 7.6 3.5 0.3 3.8
Torii Hunter 122 569 3.1 4.5 7.6 1.6 2.3 3.9
Corey Patterson 153 687 4.2 2.7 6.9 2.4 0.1 2.5
Scott Podsednik 149 712 4.0 2.5 6.5 2.2 -0.1 2.1
Rocco Baldelli 122 565 4.1 2.1 6.2 2.6 0.1 2.7
Endy Chavez 120 547 3.4 2.3 5.7 2.0 -0.2 1.8
Jay Payton 120 511 2.7 2.7 5.4 1.4 0.7 2.1
Marquis Grissom 133 606 2.6 1.9 4.5 1.0 -0.3 0.7
Tike Redman 140 581 3.2 1.3 4.5 1.7 -1.0 0.7
Laynce Nix 91 400 0.3 2.6 2.9 -0.7 1.0 0.3
MARGIN ** ** 0.250 0.350 ** 0.357 0.500 **
Now I'll grant you that this doesn't look like a gigantic adjustment...but it can be VERY important in some cases. The ordinal rankings change slightly, especially in the middle of the list...but the key is really the relative evaluations. Compare each player to the top player in the pool. Which list makes more sense to the group?
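For anyone who wants to run the "compare each player to the top player in the pool" exercise rather than eyeball it, here is a minimal sketch. It uses only a handful of rows copied from the table above; the function name `share_of_top` and the choice of rows are illustrative assumptions, not part of PCA.

```python
# (Win-M, Win-A) pairs copied from the 2004 center fielder table above.
# Only a few rows are included to keep the sketch short.
players = {
    "Carlos Beltran": (13.9, 9.2),
    "Jim Edmonds": (13.9, 9.9),
    "Andruw Jones": (10.8, 6.5),
    "Marquis Grissom": (4.5, 0.7),
    "Laynce Nix": (2.9, 0.3),
}


def share_of_top(values):
    """Express every value as a fraction of the largest value in the pool."""
    top = max(values.values())
    return {name: value / top for name, value in values.items()}


win_m = share_of_top({name: pair[0] for name, pair in players.items()})
win_a = share_of_top({name: pair[1] for name, pair in players.items()})

for name in players:
    print(f"{name:16s} Win-M {win_m[name]:5.0%}   Win-A {win_a[name]:5.0%}")
```

The two columns of percentages are the "relative evaluations" in question: the same players, scaled against the best player under each margin.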
Should we be rating wins relative to the zero value margin (all of the values go up, which means dollars per win go down of course)...or should we be mixing scales, reducing the changes required to significantly alter the value ratio calculations?
I should point out as well that it really shows up with guys like Laynce Nix (and the part timers below him). Is Nix, in 2004, worth 60% of what Grissom is worth? Or less than 40%? The part time players show up in weird places on the second list because defense being rated relative to average means you'll get weirder and weirder value judgments the less playing time the player has (is a fielder who saved one win relative to average in 400 innings the same as one who saved that 1 win in 700 innings?).
Well, I've never bothered to John Benson the player pool, by dividing the total benefit pool against available dollars, but ....
1. Usually you see 900-1000 total Wins Above Replacement in the majors ...
2. Given what people assume about Replacement Level (a 50-win team) ... and
3. About $4.8-5.5M per WAR in the FA market, given total dollars available.
So if you add another WAR to the pool for *every defensive position, for every team* (162 games x 9 innings), you add about 240 WAR leaguewide.
Meaning you have deflated the $ per WAR by about 20%.
WAR in Matt's system might go to about $4M as opposed to almost $5M, with every position player gaining +1 WAR or so. Sound about right, Matt?
Taro...my web programming skills suck. It's not the kind of programming I customarily do. Otherwise I'd have already made some kind of browsable PCA website.
I am also weak on automation of the database update cycle (each year, new data is added and you have to somehow get the new year into the existing database)
I have a plan to account for that...it's going to require me to make a three-layered database when I do my rebuild after the 18th of January (that's when I get back to Stony Brook and get access to my main computer)...
First layer: The database tables in their exact unaltered raw formats as taken from the web sources.
Second layer: variable name x-ref tables to link the raw formats to each other and to the sabermetric layer which is...
Third Layer: Sabermetrics...these are the tables where I can change formats and the order of variables...I can drop variables I don't need, make calculations etc.
That way when I import new data, all I'll have to do is run the sabermetric scripts on the new data (which will be flagged as new when it gets imported into the raw DB) and bingo front row...the DB will have been updated.
This is obviously going to take some time to code though. Bear with me. LOL
Once I have the updated database, I'll re-calculate PCA with all of my more modern ideas and then I'll be in the market for help getting it posted to the web.
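The three-layer layout described above can be sketched in a few lines. This is only an illustration under loud assumptions: it uses Python's built-in `sqlite3`, every table and column name (`raw_batting`, `xref`, `saber_batting`, `is_new`) is hypothetical, and plain batting average stands in for the real sabermetric scripts, which are not shown in this thread.

```python
import sqlite3


def build(conn):
    # Layer 1: raw table in the source's own format, plus a flag for new rows.
    conn.execute(
        "CREATE TABLE raw_batting (playerID TEXT, yearID INTEGER, "
        "AB INTEGER, H INTEGER, is_new INTEGER DEFAULT 1)"
    )
    # Layer 2: cross-reference from raw column names to sabermetric names.
    conn.execute(
        "CREATE TABLE xref (raw_table TEXT, raw_column TEXT, saber_column TEXT)"
    )
    conn.executemany(
        "INSERT INTO xref VALUES ('raw_batting', ?, ?)",
        [("playerID", "player"), ("yearID", "season"),
         ("AB", "at_bats"), ("H", "hits")],
    )
    # Layer 3: sabermetric table, free to use its own names and layout.
    conn.execute(
        "CREATE TABLE saber_batting (player TEXT, season INTEGER, batting_avg REAL)"
    )


def refresh(conn):
    """Run the (stand-in) sabermetric script on rows flagged as new."""
    cols = dict(conn.execute(
        "SELECT saber_column, raw_column FROM xref WHERE raw_table = 'raw_batting'"
    ))
    query = (
        f"SELECT {cols['player']}, {cols['season']}, {cols['at_bats']}, "
        f"{cols['hits']} FROM raw_batting WHERE is_new = 1"
    )
    rows = conn.execute(query).fetchall()
    for player, season, at_bats, hits in rows:
        avg = hits / at_bats if at_bats else None
        conn.execute("INSERT INTO saber_batting VALUES (?, ?, ?)",
                     (player, season, avg))
    # Clear the flag so the next import only processes fresh data.
    conn.execute("UPDATE raw_batting SET is_new = 0 WHERE is_new = 1")
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
build(conn)
conn.execute(
    "INSERT INTO raw_batting (playerID, yearID, AB, H) "
    "VALUES ('example01', 2004, 500, 150)"
)
first_pass = refresh(conn)   # processes the one new row
second_pass = refresh(conn)  # nothing new left to process
```

The point of the middle layer is that a change in a web source's column names only touches the `xref` rows, not the sabermetric scripts.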
Not quite, Doc.
I'm not just adding defensive wins...I'm adding offensive wins too.
There are 1000 WAR in column A worth 5 mil per WAR...but there are as many WAR in PCA as there are wins in a season...that's 2400 or so. So you're looking at the value of a WAR being cut by 60%...to something more like 2 mil per WAR
EDIT TO ADD:
there are 2430 wins in a typical modern season, 900 WAR available the Cameron way...that would mean if the typical WAR the Cameron way was worth 4.4 mil on the free agent market, 2.1 mil in arbitration and 1.2 mil club controlled (roughly averaging 2.3 mil/WAR for the total player pool), then the new $/WAR average NET would be 0.9 mil per WAR. For free agents it would be about 1.6 mil per WAR.
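Both $/WAR estimates in this exchange come from the same idea: the total dollars stay fixed while the number of wins they are spread over changes. A small sketch of that arithmetic follows; the helper name `rescale` is made up for illustration, and the 30 teams x 8 positions breakdown behind Doc's 240 extra WAR is an assumption about how that figure was reached.

```python
def rescale(rate, old_pool, new_pool):
    """Same total dollars, spread over a different number of wins."""
    return rate * old_pool / new_pool


# Doc's estimate: ~1000 WAR at ~$5M each, plus one extra defensive win
# per position per team (assumed here to be 30 teams x 8 positions = 240).
doc_rate = rescale(5.0, 1000, 1000 + 30 * 8)

# Matt's estimate: 900 WAR the Cameron way versus 2430 wins in a season.
cameron_rates = {
    "free agent": 4.4,
    "arbitration": 2.1,
    "club controlled": 1.2,
    "pool average": 2.3,
}
pca_rates = {tier: rescale(rate, 900, 2430) for tier, rate in cameron_rates.items()}

print(f"Doc's version:  {doc_rate:.1f} mil per WAR")
for tier, rate in pca_rates.items():
    print(f"PCA {tier:16s} {rate:.1f} mil per WAR")
```

Doc's version lands near $4M per WAR, and Matt's lands near 1.6 mil for free agents and 0.9 mil for the pool average, matching the figures quoted above.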
Uh Doc...those are 2004 Center Fielders...can you fix that in the post? Thanks...just don't want people confused.
Think we got it formatted for yer, amigo.
What would have really made it scannable, would have been $$ values... any chance of getting those? :- )
Some of your most interesting stuff, and that's sayin' a lot.
Now keep in mind I'm using 2008 money figures for a 2004 list, which is inaccurate.
But I'll treat each player like he's the status he actually was (arbitration, club controlled or free agent) and give dollar values they should earn the following season, first the PCA value...second the HBT-style value
Beltran (FA): 22.4 mil / 40.5 mil
Edmonds (FA): 22.4 / 43.6
Jones (FA): 17.3 / 28.6
Rowand (FA): 16.3 / 29.5
Kotsay (FA): 15.4 / 25.1
Cameron (FA): 14.2 / 23.3
Damon (FA): 13.9 / 19.8
Pierre (AR): 6.8 / 8.2
Finley (FA): 13.0 / 16.3
Wells (AR): 6.2 / 8.4
Winn (FA): 12.2 / 16.7
Hunter (FA): 12.2 / 17.2
Patterson (AR): 5.5 / 5.3
Podsednik (CC): 3.3 / 2.5
Baldelli (CC): 3.1 / 3.2
Chavez (AR): 4.6 / 3.8
Payton (AR): 4.3 / 4.4
Grissom (FA): 7.2 / 3.1
Redman (CC): 2.3 / 0.9
Nix (CC): 1.5 / 1.2
I'm concerned that my offensive .357 margin estimates based on PCA don't line up with the runs above replacement values you'll find at like baseballprospectus or THT. I'm not sure why.
PCA, for example, gives Carlos Beltran 12.4 offensive wins in 2004...BPro gives Beltran 60 runs above margin. That don't add up. That's why the dollar values are way too high for the top players. The math is internally correct within PCA (i.e. the win values line up correctly at the league and team level). It's not a system flaw with PCA...there must be some other difference that's throwing things off...
Let's try to figure out why the difference in scales exists here...
The average batter is worth 4.5 offensive wins per 700 PA by PCA and gets to that value by creating runs at the league average rate (modern era would be about 4.8 RC/27 Outs, or about 480 Outs * 4.8/27 = 85 runs).
PCA's offensive margin is a strict 0.250 W%, which is found on the run scale by using PythagenPat: Margin^X / (Margin^X + 4.8^X) = .250, solve for Margin. X = 9.6^0.285 or 1.91
The algebra reduces to:
Margin = 4.8 * (.25/.75)^(1/X) which is 2.7 RC/27 Outs which is 48 RC
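The margin calculation above is easy to check numerically. This is a minimal sketch of the same algebra, using the 4.8 RC/27 league rate, the 480-out season and the 9.6^0.285 exponent from this comment; the function and variable names are illustrative only.

```python
def margin_rate(league_rate, target_wpct, exponent):
    """Solve m**x / (m**x + league_rate**x) = target_wpct for m."""
    return league_rate * (target_wpct / (1.0 - target_wpct)) ** (1.0 / exponent)


LEAGUE_RATE = 4.8   # RC per 27 outs, as quoted above
SEASON_OUTS = 480   # outs for a full-time batter, as quoted above

# PythagenPat exponent as used above: (total runs per game) ** 0.285
exponent = (2 * LEAGUE_RATE) ** 0.285

rate_250 = margin_rate(LEAGUE_RATE, 0.250, exponent)
runs_250 = rate_250 * SEASON_OUTS / 27

# Substitute back in to confirm the margin really plays at a .250 W%.
check = rate_250 ** exponent / (rate_250 ** exponent + LEAGUE_RATE ** exponent)

print(f"exponent X      = {exponent:.2f}")
print(f".250 margin     = {rate_250:.1f} RC/27 outs, about {runs_250:.0f} RC")
print(f"W% at margin    = {check:.3f}")

# The same function gives the floor for any other margin, e.g. .350.
rate_350 = margin_rate(LEAGUE_RATE, 0.350, exponent)
print(f".350 margin     = {rate_350:.2f} RC/27 outs")
```

Changing `target_wpct` is all it takes to move between the .250 margin and a higher replacement level on the run scale.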
This, BTW, is what I mean about a win NOT being equal to 10 runs in a real team framework. Notice that an average player is worth 4.5 wins but only 36 runs (8 runs per win). I just showed you the math to justify average being 84 runs and the margin being 48...the math to justify 4.5 wins being the average for a line-up slot is simple...4.5 * 9 = 40.5...half the possible value an offense can create in a linear system.
Now...PCA is claiming Beltran was worth 12.4 offensive wins or 99 marginal runs.
If we raise the margin to .350, the new floor becomes 61 runs...meaning an average player would be worth 23 runs...down 13 from previous estimates. That translates to 1.6 lost wins.
Drop 13 runs and PCA is claiming Beltran was worth 86 runs...not 60 as in BPro. 86 marginal runs would indeed be the 10.8 wins or so that I have indicated in my chart above. So the problem is not with my estimate of the new PCA margin that supposedly matches the BPro margin...the problem is that PCA sees the top players as decidedly more valuable than BPro (and I suspect Hardball Times) do.
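The runs-to-wins bookkeeping in the last few paragraphs can be laid out step by step. This is just the arithmetic quoted above (36 marginal runs for 4.5 wins, a floor moving from 48 to 61 runs, Beltran at 12.4 offensive wins); the variable names are made up for the sketch.

```python
# Average batter: 4.5 wins for 36 runs above the .250 margin (84 - 48).
avg_wins = 4.5
avg_marginal_runs = 36.0
runs_per_win = avg_marginal_runs / avg_wins

# Beltran's 12.4 offensive wins expressed as marginal runs.
beltran_wins_250 = 12.4
beltran_runs_250 = beltran_wins_250 * runs_per_win

# Raising the margin from .250 to .350 lifts the floor from 48 to 61 runs.
floor_shift = 61 - 48
lost_wins = floor_shift / runs_per_win

beltran_runs_350 = beltran_runs_250 - floor_shift
beltran_wins_350 = beltran_runs_350 / runs_per_win

print(f"runs per win          = {runs_per_win:.0f}")
print(f"Beltran at .250       = {beltran_runs_250:.0f} marginal runs")
print(f"wins lost to new floor = {lost_wins:.1f}")
print(f"Beltran at .350       = {beltran_runs_350:.0f} runs, "
      f"{beltran_wins_350:.1f} wins")
```

Either way you slice it, the result stays well above the 60 runs BPro credits, which is the gap being discussed.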
PCA has a larger spread than most current metrics despite the lower baseline for the margin. That's the difference. Which means the 4.4 mil/WAR figure can't be used with PCA estimates of .357 marginal wins (because that 4.4 mil figure was based on lower estimates of win value for the best players).
Oooh...I just realized something else...
The HBT "wins above bench" thing that created the 4.4 mil/WAR I've been using in money values...the bench margin is even higher than the replacement level margin at least traditionally. And those are the kinds of numbers typically seen on the BOOK blog when people start comparing value of free agents (or over at Cameron's site).
BTW...Baseball Prospectus numbers are way off. Hardball Times credits Beltran with 123 runs created, Prospectus credits him with 102 RC in 2004. PCA credits him with an estimated 137 RC (PCA puts more weight on the base stealers attempting to more fairly distribute baserunning bases that don't get reported to the fast guys). So PCA's scale is not off...BPro's scale is off.
If you use THT runs created, then Beltran would be worth about 12.1 net wins with my method and 7.6 WAR with the THT/fangraphs/Cameron/BOOK method. Which would make him worth something like 20 million by PCA and (roughly 6 wins above bench so...) 26.5 million by the accepted method.
Sorry to be picky here but
Import the text data into a spreadsheet
save as .csv
close
open up with a space delimiter
merge the player name cells
save as html
copy the html table code
and voila a more readable table.....
Player | EqG | PA | Off-M | Def-M | Win-M | Off-A | Def-A | Win-A
Carlos Beltran | 162 | 708 | 12.4 | 1.5 | 13.9 | 10.6 | -1.4 | 9.2
Jim Edmonds | 143 | 612 | 11.8 | 2.1 | 13.9 | 10.2 | -0.3 | 9.9
Andruw Jones | 150 | 646 | 4.3 | 6.5 | 10.8 | 2.6 | 3.9 | 6.5
Aaron Rowand | 121 | 534 | 8.3 | 1.9 | 10.2 | 6.9 | -0.2 | 6.7
Mark Kotsay | 137 | 673 | 6.2 | 3.4 | 9.6 | 4.5 | 1.2 | 5.7
Mike Cameron | 134 | 562 | 4.7 | 4.2 | 8.9 | 3.3 | 2 | 5.3
Johnny Damon | 140 | 702 | 7.3 | 1.4 | 8.7 | 5.5 | -1 | 4.5
Juan Pierre | 162 | 748 | 7.2 | 1.3 | 8.5 | 5.3 | -1.4 | 3.9
Steve Finley | 155 | 706 | 6.1 | 2 | 8.1 | 4.3 | -0.6 | 3.7
Vernon Wells | 129 | 590 | 4.7 | 3 | 7.7 | 3.2 | 0.8 | 4
Randy Winn | 117 | 703 | 5.3 | 2.3 | 7.6 | 3.5 | 0.3 | 3.8
Torii Hunter | 122 | 569 | 3.1 | 4.5 | 7.6 | 1.6 | 2.3 | 3.9
Corey Patterson | 153 | 687 | 4.2 | 2.7 | 6.9 | 2.4 | 0.1 | 2.5
Scott Podsednik | 149 | 712 | 4 | 2.5 | 6.5 | 2.2 | -0.1 | 2.1
Rocco Baldelli | 122 | 565 | 4.1 | 2.1 | 6.2 | 2.6 | 0.1 | 2.7
Endy Chavez | 120 | 547 | 3.4 | 2.3 | 5.7 | 2 | -0.2 | 1.8
Jay Payton | 120 | 511 | 2.7 | 2.7 | 5.4 | 1.4 | 0.7 | 2.1
Marquis Grissom | 133 | 606 | 2.6 | 1.9 | 4.5 | 1 | -0.3 | 0.7
Tike Redman | 140 | 581 | 3.2 | 1.3 | 4.5 | 1.7 | -1 | 0.7
Laynce Nix | 91 | 400 | 0.3 | 2.6 | 2.9 | -0.7 | 1 | 0.3
MARGIN | ** | ** | 0.25 | 0.35 | ** | 0.36 | 0.5 | **
ok nm something wrong with the forum code.. opens fine in my browser...
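For what it's worth, the spreadsheet round trip described above can also be done with a short script. This is only a sketch: it assumes the pasted text has exactly eight stat columns after the player name (true of the table in this thread), and the function names `parse_rows` and `to_html` are made up for illustration.

```python
from html import escape

STAT_COLUMNS = 8  # EqG, PA, Off-M, Def-M, Win-M, Off-A, Def-A, Win-A


def parse_rows(text):
    """Split space-delimited lines, merging the leading tokens into one name."""
    rows = []
    for line in text.strip().splitlines():
        tokens = line.split()
        if len(tokens) <= STAT_COLUMNS:
            continue  # not a table row
        name = " ".join(tokens[:-STAT_COLUMNS])
        rows.append([name] + tokens[-STAT_COLUMNS:])
    return rows


def to_html(rows):
    """Render the parsed rows as a bare HTML table."""
    body = "\n".join(
        "<tr>" + "".join(f"<td>{escape(cell)}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return f"<table>\n{body}\n</table>"


sample = """Player EqG PA Off-M Def-M Win-M Off-A Def-A Win-A
Carlos Beltran 162 708 12.4 1.5 13.9 10.6 -1.4 9.2
MARGIN ** ** 0.250 0.350 ** 0.357 0.500 **"""

rows = parse_rows(sample)
html_table = to_html(rows)
print(html_table)
```

Counting columns from the right is what lets two-word names like "Carlos Beltran" and one-word rows like "MARGIN" go through the same code path. Whether the forum software accepts the resulting HTML is a separate matter.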