Kell-o-Vision Translation
That's what he said, Dept.

.

Don't take this as condescending.  Actually it's the opposite of that.  The antonym here is ... hm.  Deferential?  Respectful?  Friendly?

Dr. D made a penny or three, back in the day, by translating the works of Subject Matter Experts (SME's) for the benefit of executives.  These guys did not differ from the SME's in their IQ's -- most of them were former SME's themselves.  But they came from different fields.

SME's use a fluent language when they talk to each other.  It wouldn't be smart for Jeff Sullivan to use the same words when talking to Logan Davis about baseball, as he uses when he talks to his mom about it.

Dr. D's job, when doing the Exec Sums, was to take an idea exchange BETWEEN these experts and to put it onto a sheet that could be easily and pleasantly scanned, in 2-5 minutes, by a busy executive.

It was fun, one of my favorite things to do.  So any excuse is enough for us ...  

At SSI we do not assume that every reader is interested in advanced sabermetrics.  You're not a second-class citizen here if you do not care about the difference between FIP and xFIP.

Dr. K's article was written to the layman, of which Dr. D is one.  But I'm into bullet points tonight, so .... he can correct me where I mangle the lyrics on the cover version of the song.

By the way, if I were Jack Zduriencik, I'd have a whole intern, 6' plus, slotted to precisely this sort of thing.  Gimme those 2-minute summaries of everything on Hardball Times, Baseball Prospectus, Fangraphs, etc, scouring those sources for key ideas.  (No, Dr. D is not applying for such a job; four jobs are plenty, he'd say, but SABRMatt might want to volunteer the idea .... perhaps one of you amigos wants to create a "sample" -- heh -- of a day's work and mail it around to front offices.)

So here's your Exec Sum, three times the length of the original paper, of course.....

Gaffney's article makes a game-changing point.  I'm not sure everybody took the time to grok it.  Pardon us a second flyover?

.

Exec Sum, Dept.

Quoth Dr. K:

1) His background is verrrrry heavy in math and science.  He's not biased against it.  He'd love it, if it worked right.

.

2) But the math and science he sees put out there is --- > "busted."  It is not valid.  It just flat doesn't work, and never will work.  Not as it applies to Peguero-vs-Thames type questions.

.

3) Baseball Prospectus* thinks that any (so-called) shortcomings, in its world view, will be resolved when --- > they get larger "samples," get more numbers on the situation.  

Think that Hardball Times' conclusion on Nick Franklin's callup is a little dubious?  "Well, hey, we just need a few more seasons' worth of MLE studies, and we'll be able to close out the debate."

.

4) Actually, says Dr. K, wider samples (usually) do NOT fix the problems.

They just capture targets that have moved, more and more, while we took the "sample".  Like if you measured a cheetah slowing down from 60, to 50, to 40 ... let's just measure for three more seconds and we'll really have his average speed, right?
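A minimal sketch of the cheetah point, with made-up numbers: the speed profile (60 mph, losing 10 mph every second) and the function names are assumptions for illustration only, not anything from Dr. K's article.

```python
# Hypothetical numbers: a cheetah decelerating steadily from 60 mph.
def speed_at(t_seconds: float) -> float:
    """Assumed speed profile: starts at 60 mph, loses 10 mph per second, floors at 0."""
    return max(0.0, 60.0 - 10.0 * t_seconds)


def running_average(n_seconds: int) -> float:
    """Average of one reading per second over the first n_seconds (t = 0 .. n-1)."""
    readings = [speed_at(t) for t in range(n_seconds)]
    return sum(readings) / len(readings)


for window in (1, 3, 6):
    avg = running_average(window)
    now = speed_at(window - 1)
    print(f"{window}s sample: average {avg:.1f} mph, latest reading {now:.1f} mph")
```

Under these assumptions the longer window does not close in on the cheetah's speed; the average and the latest reading drift further apart the longer you measure.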

.

5) For example, the assumptions about a "replacement level player's" production are "painfully coarse-grained" when seen through the eyes of a research chemist.  

As are the position adjustments -- how much do you lose, moving Montero to DH or Franklin to 2B?  A real scientist needs to know, but Baseball Prospectus* just plugs in a wild number, and moves on.  Selling its formulas as precise, despite these "painful" fudge factors.

Sabermetrics' use of fudge factors wouldn't be such a problem if sabermetrics didn't sell itself as a hard science.  It is not one.

.

6) Many, many key parts of a baseball game can never be measured.  It matters whether Jason Bay hits a home run on Tuesday!  But we have no way to use stats to tell us how likely Bay is to hit 10, or 15, or 20 homers the rest of the year.

.

7) You can't really expect Jack Zduriencik to use stats as his main resource when he decides (whether to bring Nick Franklin up, invite Jason Bay to camp, etc). 

.

Dr's R/X

Now would follow the part where you give the sr. mgm't your own 1-paragraph take on how accurate/important the info is, and another 1-paragraph take on what the mgr's options are.

.............

Sabermetricians would (AND DO) respond to Dr. K by saying, "well, they're the best we have, so until you got better, shut the deuce up."

But this is not a discussion about HOW Jack Zduriencik should use stats to decide between Carlos Peguero and Eric Thames.  This is a discussion about WHETHER he should do so!

Sabermetricians studiously avoid this debate.  The reason is that once it starts, they can't win it.

"We do not have near-perfect measurements of baseball players.  It is foolish to assume that we do." - Bill James

(No offense to the sterling debate on Nick Franklin this week.  This particular version of that debate was even-handed and reasonable.)

..............

Jack Zduriencik has always nodded politely at the local blogs.  He's told them smilingly that he "blends" tools scouting and stats analysis.  He's made sure that he is aware of what the stats guys say, and then he has solved his problems using judgment and intuition.

Which is, by the way, what a chess grandmaster does.

Dr's recommendation would be:  keep doin' what yer doin', GM's.

.............

What would be really, Really cool would be if somebody at Baseball Prospectus, or Hardball Times, or whatever, read Dr. Gaffney's article and reacted appropriately.  

Imagine reading a HBT website on which the authors realized the limitations of their work.  Imagine a saber site on which the authors knew they couldn't predict pennant races, knew they didn't usually have the right answer to a roster question, and then the things they did claim were right 98% of the time.

Oh yeah.  That's BJOL.  Bill almost never claims there is a right answer to a Peguero-vs-Thames situation.  The Pro's and Con's of each option are far, far too complex to capture with mathematics.  You've got 2014 pros and cons.  You've got 2015 pros and cons.  You've got effects-on-other-players pros and cons.  You've got player development pros and cons, and future trade option pros and cons, and ...

Nor does he ever say much at all about whether a team should punt a season, other than to say "If a team does decide to do that, then..."  

Which brings us to another topical question :- )

.

Comments

1

Which was most assuredly NOT "shut the deuce up"...it was "yep...we see that problem...everyone working in the big leagues knows that most of the stats on the baseball card have a rare-events problem...so we've moved beyond all of that uber-stat stuff and have started looking for large-sample relationships between the stats we're all familiar with and events in baseball that are not rare."
In other words...
We can't count on XBH rate because XBH are rare...and this week's sample may not be relevant to next week's games. That's fine...but we CAN measure the REAL skills using statistics that are not rare. Batted balls are not rare...so measuring how often a player puts the ball in play in fair territory is reliable and a real indicator of skill. We can't count on the HR part of DIPS because HR are rare...but we can measure the average velocity of batted balls off a pitcher...since...again...batted balls are not rare...and get a real-world fast-twitch look at how a pitcher is performing THIS WEEK that has predictive value. We can't judge defense by UZR...that's OK...we can look at a player's jumps on each chance he gets (again...not rare) and measure the difficulty of each play he makes and make better predictions that way.
We shouldn't use OPS because it's heavily biased by rare events like HR? Cool...measure the hitter's BB rate, K rate, batted ball velocity, swing and miss rate, and flyball rate...none of which are rare events.
Do you see what I'm getting at?
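A rough sketch of the sampling arithmetic behind the rare-vs-common argument above. The two rates are illustrative assumptions, not measured figures, and the formula treats every plate appearance as an independent trial with a fixed probability, which is precisely the assumption the "moving target" objection disputes.

```python
import math


def relative_standard_error(rate: float, trials: int) -> float:
    """Standard error of an observed binomial rate, as a fraction of the rate itself."""
    return math.sqrt(rate * (1.0 - rate) / trials) / rate


# Illustrative rates only (assumptions, not measured figures).
events = {"home run per PA": 0.03, "ball in play per PA": 0.70}
for name, rate in events.items():
    for trials in (100, 600):
        rse = relative_standard_error(rate, trials)
        print(f"{name:>20}, {trials} PA: +/- {rse:.0%} of the true rate")
```

With these numbers, a 100-PA sample pins down the common event to within roughly 7% of its rate, while the rare event is off by more than half of its rate. It says nothing about whether the common event actually predicts the rare one.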

2

For your prescription to work, the common event -- say fly ball speed off bat -- would need to be very highly correlated with the rare event -- homeruns. My guess is that to have the correlation be strong, you will need to account for speed and trajectory (or truly the velocity off the bat, which is a vector quantity). I predict that once you have a database full of fly ball velocities to analyze, you will need to significantly down-select the types of flyball velocities to get a strong correlation to the homerun rate, and you'll be back to the same problem. It's a prediction, not a fact, so I could very well be wrong.
Let's consider a fielding F/X for outfielders where I could see it working. We need to know how long it takes the fielder to start moving once contact is made, we need to know how long it takes the fielder to intersect the flyball, and I guess we would need to know how direct the trajectory of the player to the ball is. But most flyballs are routine, so how fast the fielder moves depends on how hard the ball is to catch. Consequently, the routine plays don't provide useful information. It's only the hard, but potentially makable plays that will provide useful data.
My intuition tells me that common events tell us little about the rare events that drive baseball outcomes. I agree that the number of useful events strongly correlated with the important rare events is larger than the rare events themselves, but not as much larger as you seem to be indicating. It will be fun to find out and I'd be happy to be wrong.

3

Fielders don't start from the same place all the time.  There are defensive shifts put in place, or outfielders play deep or shallow depending on the situation.  So a hit to one area of the field in one inning might be a hit where in another inning it would be an out.  Hard batted balls right at fielders are outs, but soft bloopers that are placed juuust right are hits.
Defense is a nightmare to predict or quantify statistically.
~G

4

I knew my post would only appeal to a narrow subset of SSI readers, but I put it out there anyway. I'm also glad that Cool Papa Bell and SABR Matt don't see it as a long term problem. Diversity of opinion makes for more interesting debate.
Your summary was spot on, but I think you knew that ;)
Now to comment on your prescription. Trained as a scientist, I don't think it is the 'complexity' of the problem that defeats quantitative modeling and prediction, it is the nearly complete lack of control of the critical variables in the problem. The lack of control is so great that we aren't even sure which variables are critical. In short, we never do baseball experiments with control groups, so we don't get to test and refine our models. Do you really need sluggers to complement on-base guys? Would fielding nine Luis Castillos break the correlation between runs scored and on-base percentage? Would surrounding Luis Castillo with Luis Castillos negatively or positively affect his performance? It would be an easy and insightful experiment, but it will never happen. As my wife said when our first daughter was born, "Baby, not experiment."
In closing, sabermetrics is natural history, not science. I can count how many chickadees eat from my backyard bird feeder and plot the year-by-year trend, but I won't be able to determine if the trend is from global warming or my neighbors installing the new and improved Bird-a-Lux100 and subtly changing the migratory path of chickadees by one backyard.

5

That's a key point that I hadn't been appreciating nearly enough.  I need to think about it.  A lot.
..........
I don't know whether it outweighs the question of the problem's complexity -- perhaps we're talking 80% about the same thing.
For example, if you're trying to predict the number of home runs that Jason Bay is likely to hit this year - the 50th percentile, assuming a total of 450 AB's, suppose I use PECOTA's model and come up with 11 homers.
How do I know that PECOTA is NOTICING all the variables that could affect the outcome?  There's a certain chance that playing in New York deflected his career arc (as Safeco did Beltre's).  Where is that in PECOTA, or in your or my quantitative model?
There's a certain chance that hyper-intelligent, soft-spoken people (like Bay is) see this effect amplified.  Maybe he's a sensitive personality (I think he actually is) and if you normalize for "sensitive people who spiral down in NY" then his chances to hit 20+ homers change.
There's a certain chance that Bay is an AL player, whatever that is.  Did we notice that variable?
The way the variables INTERACT is incredibly complex.  For example, maybe an Omniscient Computer would compare the SINE WAVE of Bay's career EYE to his career ISO and find that, when the sine waves intersect at moments A, B, and C, his chances for an age-34 bounceback are deflected.
............
The reason I mention it is that I am familiar with just how tough it is to capture these variables from a chess-algorithm standpoint.  In fact you cannot teach a computer to play chess well, using abstract or general principles; it has to use brute force calculation of all possible outcomes (not available here, obviously).
I'd be very interested in your comments, Kelly.  How do we ever become confident that we have noticed and captured the real-world variables that impact Bay's results? 
..................
I'm taking psychology from one of the country's most accomplished PTSD counselors right now.  Am learning a lot, but am becoming more and more convinced that you can't predict people.... :- )
Obviously, we know some things.  We know that one hundred 34-year-olds, as a group, are a worse bet than a hundred 27-year-olds.  Sigghhh...

6

Make me 'King of America' and give me the option of taking the thousand best baseball players and play out the season as an experiment, and I guess we wouldn't learn all that much of predictive value, but we would learn about the limits of our models and it would be absolutely awesome.
Imagine if you could construct teams to test your or others' endless hypotheses: do you need balanced line-ups (power hitters and table setters), does Safeco really suck the will to live out of all RH pull hitters, would playing three gold glove CF in the outfield suppress each one's value, and so much more.
The point is not that the performance of people can likely be predicted from a complex numerical model, rather that if you wanted to have a robust numerical model you would insist on testing the validity of the model with highly controlled experiments. Without the experiments, there is no predictive 'roster engineering'. Could you imagine Boeing switching from aluminum wings to carbon nanotube wings without constructing detailed stress-strain and hydrodynamic models, testing the materials and the models in wind tunnel experiments? And they wouldn't do it once, they would fine tune the models and materials, test them in experiment to make sure they were on the right track, and rinse and repeat until they got the performance they needed. And when they finally are confident in their understanding of the new materials, they go and build and fly test planes.
What baseball teams care about is much harder to test, but they don't really try. It reminds me of a meet and greet between new faculty and the provost. I was standing around feeling bored and awkward and asked the fellow beside me what his area of study was. Turns out he was a social scientist and presented the most elegant meringue fluff of a research statement. I like science to describe the stuff that doesn't recoil from a laser and art and sport to describe the rest.

7

Clubs *could* try to control for pitcher injury data, at least a little bit ... you could take 21-year-old pitchers who are LHP's, who throw in James Paxton "families," who pitched in college, K rate, velocity, yada yada yada, and (over time) take 20 of them and test for one variable -- game pitch limits, IP on a season, or (coarsely defined) "pitches thrown when tired".
It would be hard to do retroactively, probably ... you might recall that James took BP's "Pitcher Abuse Points" (itself a very, VERY simple system) and tested it -- finding out that the most "abused" pitchers (# pitches after 100, 110, and 120) were the ones that went on to be the healthiest in later seasons!
Whereupon BP sniffed that James forgot that the "abused" pitchers were the Schillings and Clemenses, and that he forgot to control for skill level ...... 
Despite the fact that they had not noticed the fact that their system was predictively invalid ... 
:- )

8

Actually I was thinking mostly of COMMENTERS in USSM threads who typically take exactly that tack when somebody brings up (say) UZR problems.  "Unless you have a better system, in which case you'll revolutionize the sport, then shut up.  The current methods are the scientific consensus."
You know, as if we were talking about a consensus on the effects of Zoloft on the nervous system.

9

And it does seem like orgs (as opposed to pop bloggers) are focusing more and more on that.  Since James wrote his 2009 article on BIP velocity, SwStr%, Sw%, and Pull rate, I've been focusing on those items myself, thankfully.
But now we have the 30,000-foot question of "how much of baseball can you capture with those tools," and .... 

10

A lot of the advanced stats are "normalized" for "reverting to the mean," but a hitter who hits the ball with authority a lot more often will "revert" to a different "mean" than the guy who is "merely" putting the ball in play.
It will take a long time for that kind of data to be collected at minor-league parks and made available, I suppose, but it ought to allow for a lot sharper focus on major-league guys at least.
Further, I can tell that Dr. K is in the physical sciences.  In the social sciences, you know you'll never have double-blind randomized experimental design (in most cases), so you take what you can.  If you can explain 80% of voting behavior by knowing how folks feel about the economy and whether they like the incumbent president, then you go with that.  Then you focus on what might shift along the margins (turnout, fundraising, gaffes, scandals, etc.).
I think we can do a good job of getting a "profile" of a player from statistics, and then figure out what else to look at to see if he might buck the trend.
If someone thinks Carlos Peguero is a strong candidate to avoid what has happened to the vast majority of hitters with his profile, then we can debate that.  But with the stats we can "stipulate" to a lot of ground. 
Just my sense.
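A small sketch of the "reverting to a different mean" idea from the top of this comment. The weighting scheme is a generic shrinkage estimate, not any particular site's formula, and the .300 observed rate, the two group means, and the constant `k` are all hypothetical.

```python
def regressed_rate(observed: float, trials: int, group_mean: float, k: float) -> float:
    """Shrink an observed rate toward group_mean; k acts like k trials of prior evidence."""
    return (observed * trials + group_mean * k) / (trials + k)


# Hypothetical: two hitters with the same .300 observed rate over 200 AB,
# regressed toward different assumed group means.
same_observed, at_bats, k = 0.300, 200, 200
hard_contact = regressed_rate(same_observed, at_bats, group_mean=0.280, k=k)
soft_contact = regressed_rate(same_observed, at_bats, group_mean=0.250, k=k)
print(f"hard-contact group: {hard_contact:.3f}")
print(f"soft-contact group: {soft_contact:.3f}")
```

Same observed line, different estimates, purely because of which group each hitter is assumed to belong to. Picking the right group is the part the numbers alone don't settle.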
