Seattle Sports Insider

Player Families - Two Computer Systems and One Human System

Posted by

11/29/11

..........

=== Bill James Similarity Score at B-Ref.com ===

Back in about 1977, Bill James began serious work on player sorting.

Which criteria did he use? Well, he used K's, BB's, and power, of course. But he added player age, position, runs scored, triples, and SB's (reflecting speed), and a bunch of other stuff.

James' "Score" systems weren't pulled out of his ear. He used trial-and-error, in many many MANY iterations, to triangulate a decent similarity. This represented a step forward -- we went forward from the 1970's to the 1980's with these Similarity Scores.
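For the code-minded, here is a minimal sketch of how a score like this can work: start two players at a perfect number and knock off points for every difference between them. The 1000-point starting value follows the published Similarity Score convention, but the stat list and the weights below are made-up placeholders, NOT Bill's actual values. His came out of all that trial-and-error.

```python
# Hypothetical weights: points deducted per unit of difference.
# These numbers are illustrative only, not Bill James' real ones.
WEIGHTS = {
    "strikeouts": 1 / 10,
    "walks": 1 / 10,
    "home_runs": 1 / 2,
    "triples": 1 / 4,
    "stolen_bases": 1 / 5,
    "age": 10,
}


def similarity_score(a, b, weights=WEIGHTS, start=1000.0):
    """Start at a perfect score and subtract a weighted penalty per stat."""
    penalty = sum(w * abs(a[stat] - b[stat]) for stat, w in weights.items())
    return start - penalty


# Two invented stat lines, for illustration only.
hitter_a = {"strikeouts": 100, "walks": 50, "home_runs": 20,
            "triples": 5, "stolen_bases": 10, "age": 27}
hitter_b = {"strikeouts": 120, "walks": 40, "home_runs": 30,
            "triples": 1, "stolen_bases": 20, "age": 29}

print(similarity_score(hitter_a, hitter_a))  # a player against himself
print(similarity_score(hitter_a, hitter_b))
```

A player scored against himself keeps the full 1000, and the two invented hitters above come out at 969. Tuning those weights until the comps "look right" is where the many MANY iterations go.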

.....

I kind of hate to point out that now, in Seattle, we have discovered a 1975 version of player comp'ing. It is a little odd to see it greeted as revolutionary, rather than retro. ;-) Let's credit Bill, who was using a more advanced system in 1977.

......

In 1985, James refined Similarity Scores by introducing, in a rudimentary form, his idea for HOF Pitcher Families.

=== Baseball HQ ===

In the late 1980's, Ron Shandler kicked the "Pitcher Family" can two blocks forward with his PQS system.

For a pitcher, he analyzed (1) K rate, (2) BB rate, (3) K/BB ratio, (4) HR rate, and (5) hit avoidance. For hitters he had about five things, too. He looked at how these 5 moving parts trended across a series of years.

25 years on, Ron uses 15 criteria, not 5. For batters, he uses CT%, BB%, and ISO, as does the Seattle blog-o-sphere, but adds SX (speed index), GB/FB ratio, BABIP profile, and many other things. Through long practice, Ron has learned when a PX curve is about to spike, given a certain EYE. And so on.

=== PECOTA ===

Around 2002, Baseball Prospectus (Nate Silver) developed its own system for comp'ing players.

As you know by now, the guts of any "Hitter Family" system are going to be the variables that are selected by the programmer. Silver selected K, BB, and ISO, of course, but added in AVG, SB's ...

.... and, what I admired greatly, "Phenotypical attributes." In other words, Dustin Ackley doesn't compare well to Adam Dunn in part because he's 100 lbs. lighter.
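To see why the body-type variable matters, here is a rough sketch, not PECOTA's actual method. It standardizes each stat to a z-score and takes the straight-line distance between hitters. The four hitters, their stat lines, and the equal weighting of every column are all assumptions made up for illustration.

```python
import math

# Hypothetical hitters and stat lines, invented for illustration only.
PLAYERS = {
    "Hitter A": {"k_pct": 20, "bb_pct": 10, "iso": 0.150, "weight_lb": 185},
    "Hitter B": {"k_pct": 20, "bb_pct": 10, "iso": 0.150, "weight_lb": 285},
    "Hitter C": {"k_pct": 21, "bb_pct": 10, "iso": 0.155, "weight_lb": 190},
    "Hitter D": {"k_pct": 30, "bb_pct": 5, "iso": 0.250, "weight_lb": 230},
}


def zscores(values):
    """Rescale a column so that every stat counts on the same footing."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / sd if sd else 0.0 for v in values]


def nearest_comp(target, players, features):
    """Return the other player closest to `target` on the chosen features."""
    names = list(players)
    columns = [zscores([players[n][f] for n in names]) for f in features]
    vectors = {n: [col[i] for col in columns] for i, n in enumerate(names)}
    others = [n for n in names if n != target]
    return min(others, key=lambda n: math.dist(vectors[target], vectors[n]))


RATES = ["k_pct", "bb_pct", "iso"]
print(nearest_comp("Hitter A", PLAYERS, RATES))
print(nearest_comp("Hitter A", PLAYERS, RATES + ["weight_lb"]))
```

On rates alone, Hitter A's best comp is Hitter B, the guy 100 lbs. heavier with the identical K-BB-ISO line. Add weight as a fourth column and the best comp flips to Hitter C, the one built like him.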

BP continued to tweak its attributes in an effort to produce ever-greater precision. It uses 3-year sets to compare players, not 1-year sets. Why? Because a player will not compare to himself if you cut it so fine that you're down to one year.

But you got a real easy test here. If Mike Carp ain't coming up as comparable to Mike Carp, then none of the other comp-pairs are real solid, either. :-)
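That test is easy enough to sketch in code. The version below is a toy under stated assumptions: two invented hitters, one noisy made-up stat (call it K%), and a plain nearest-neighbor lookup. It asks how often a player's closest comp is another sample of the same player, first with 1-year samples and then with 3-year averages.

```python
import math

# Hypothetical careers: six seasons of one noisy stat per hitter.
# The names and numbers are invented for illustration only.
CAREERS = {
    "Hitter X": [(14,), (29,), (19,), (25,), (15,), (21,)],
    "Hitter Y": [(27,), (17,), (24,), (18,), (28,), (22,)],
}


def window_means(seasons, size):
    """Average consecutive, non-overlapping blocks of `size` seasons."""
    starts = range(0, len(seasons) - size + 1, size)
    blocks = [seasons[i:i + size] for i in starts]
    return [tuple(sum(col) / size for col in zip(*block)) for block in blocks]


def build_samples(careers, size):
    """Flatten every career into (player name, stat vector) samples."""
    return [(name, vec)
            for name, seasons in careers.items()
            for vec in window_means(seasons, size)]


def self_comp_rate(samples):
    """Fraction of samples whose nearest other sample is the same player."""
    hits = 0
    for i, (name, vec) in enumerate(samples):
        others = [s for j, s in enumerate(samples) if j != i]
        nearest_name, _ = min(others, key=lambda s: math.dist(vec, s[1]))
        hits += nearest_name == name
    return hits / len(samples)


print(self_comp_rate(build_samples(CAREERS, 1)))  # 1-year sets
print(self_comp_rate(build_samples(CAREERS, 3)))  # 3-year sets
```

With these made-up numbers, the 3-year sets match every hitter to himself, while the 1-year sets miss on several seasons because the year-to-year noise is bigger than the gap between the two players. If the self-comp rate is low, the comps for everybody else deserve the same suspicion.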
