What Are Sabermetrics?

 ....................

No disrespect, I just don't think we can completely figure out baseball based off of sabermetrics alone.

Jack Zduriencik does not primarily use Baseball Prospectus or Fangraphs to decide whether Shawn Kelley is coming north.  Statistics are backwards-looking!  What Zduriencik needs is a reliever who will get outs in the 2012 season.

Most people would understand "sabermetrics," as Andrew uses it, to mean "the statistics you can find on Baseball Prospectus and Fangraphs."  So, the comment would bring derision in some quarters.  But let's remember that Jack Zduriencik and Eric Wedge are in profound agreement with this.  Neither do they think that you can make roster decisions based on 2009-11 statistics alone.  

.........

It is not because we fail to understand WAR that we, at SSI, refuse to limit ourselves to its implications.  Tony Blengino his ownself comes, as the years go by, to see himself more and more as a scout/saber dual class.  Most of the stats-based, "pure sabermetrician" types who get hired wind up blossoming into blended stats/scouting analysts.  Every GM who ever lived has used methods that transcend WAR.

My own approach is based on organizing our thinking about any given baseball question -- directing our attention to the right questions.  For example:  what are the causes of Felix' lack of velocity, and if he does lose velocity, what does that mean?  What patterns are there here, and what precedents apply to the problem?  

It's a chess paradigm.  Whatever position you've got, it's been played before.  Felix Hernandez' velocity loss has been seen before, in Pedro Martinez and in other great pitchers.  Find the games that have looked like this before, and review them from the standpoint of the masters who won those games.

We try to begin, sincerely, with questions, as opposed to beginning with what are in essence position statements.  If we find a good clear question, it is not so hard to find historical patterns that illuminate it.

............

By the way, WAR is not synonymous with Sabermetrics.  Neither are statistics.  Remember, you can join Bill James Online for $3 a month, and exploit his pattern recognition as if he were an Ask.com website.  In 2008, Bill James was asked to define sabermetrics for a group of risk analysts.  We break pattern to include a good portion of his remarks, so that the essence of his definition won't be lost.

...............

BILL JAMES, 2008

Good afternoon, and thank you for inviting me to speak to you for a few minutes here.    I have been asked to say a few words about sabermetrics, about what I do for the Red Sox, and about how this might relate to risk management, if it does.

            Sabermetrics is descended from traditional sportswriting.   Sportswriting consists of two types of things—reporting, and analysis.  Sabermetrics came from that part of sportswriting which consists of analysis, argument, evaluation, opinion and baloney.   I can tell you very precisely when and how we parted ways with traditional analysis.

            Sportswriters discuss a range of questions which are much the same from generation to generation.   Who is the Most Valuable Player?   Who should go into the Hall of Fame?   Who will win the pennant?   What factors are important in winning the pennant?   If Boston won the pennant, why did they win it?   If Kansas City finished last, why did they finish last?   How has baseball changed over the last few years?   Who is the best third baseman in baseball today?   Who is better, Mike Lowell or Eric Chavez?

            The questions that we deal with in our work are the same as the questions that are discussed by sports columnists and by radio talk show hosts every day.   To the best of my knowledge, there is no difference whatsoever in the underlying issues that we discuss.   The difference between us is very simple.  Sportswriters always or almost always begin their analysis with a position on the issue.   We always begin our analysis with the question itself. 

            If you find a sportswriter debating who should be the National League’s Most Valuable Player this season, his article will probably begin by asserting a position on the issue, and then will argue for that position.   If you find 100 articles by sportswriters debating issues of this type, in all likelihood all 100 articles will do this.  

            What we do is simply to begin by asking “Who is the National League’s Most Valuable Player this season?” rather than to begin by stating that “Albert Pujols is the National League’s Most Valuable Player this season, and let me tell you why.”  That’s all.   That is the entire difference between sabermetrics and traditional sportswriting.  It isn’t the use of statistics.   It isn’t the use of formulas.   It is merely the habit of beginning with a question, rather than beginning with an answer.

            From this very small difference, profound changes arise.   A person who begins by asserting a position on an issue naturally focuses on what he KNOWS.   “What facts do I know,” he asks himself, “which will help me to explain why Mike Lowell is better than Eric Chavez?”

            The person who begins with the question itself . . . who is better, Mike Lowell or Eric Chavez . . . the person who begins with the question itself naturally focuses not on what he DOES know, but on what he does NOT know.   I misstated that just a little bit.   The person who begins with the question itself naturally focuses first on what he NEEDS to know.    As soon as he begins to think through what he needs to know, however, it will become apparent that he does not know many of the things that he needs to know to really answer the question.

            Every large general question in baseball can be broken down into smaller and more specific questions, which can be broken down into yet smaller and yet more specific questions, which can be broken down into yet smaller and yet more specific questions.   Eventually you reach the level at which the questions have small and definite answers, even though you may not know what those answers are.  The question of who is better, Mike Lowell or Eric Chavez, for example, leads to the question “What are the elements of a third baseman’s job?”   Well, there is hitting, fielding, baserunning, and off-the-field contributions if you want to count those.  

            The question of who is a better fielder breaks down into 20 other questions.   Who is quicker?  Who is more reliable?  Who makes more mistakes?  Who has a better arm?  Who is better at anticipating the play, and positioning himself correctly?   Who is better at going to his left, or to his right?  Who is better at applying the tag if he needs to make a tag on a runner?

            At that level, the questions that we ask ourselves are actually very much like the questions that scouts ask.   Scouts try to answer them by their expert judgment.   We are not experts, and so we try to find ways to answer them that do not rely on our judgment.  In that way, sabermetrics forms a kind of transition between journalism, from which it descended, and scouting.   In any case, the question of who is better at applying the tag if he needs to make a tag play leads to the question “How many times a year does a third baseman have to make a tag play?”    That is a question that has an actual answer.   I don’t know what the answer is, but there is an answer to that question.   In a few years, somebody will figure out a way to get that answer. 

            The question of who is better, Mike Lowell or Eric Chavez, contains within it probably several hundred smaller questions.   The person who begins by asking that question naturally realizes that he does not know the answers to most of those questions.   He then proceeds inevitably to ask, “How can I find the best answer to that question?” 

            When a person begins with the question itself he inevitably winds up confronting his own ignorance, and trying to find ways to fill in the gaps in his knowledge.   The person who begins with a position on the issue never sees his own ignorance, and, in fact, deliberately avoids seeing his own ignorance.  The person who begins with a position on the issue and argues for that position naturally tries to hide his ignorance of the other internal issues, since the things that he doesn’t know are a weakness in his argument.   The person who begins with the question itself, on the other hand, inevitably winds up reveling in his own ignorance, celebrating his ignorance, and sharing it freely with the world at large.  

            But the person who begins with a position on the issue, by this process, becomes a borrower from the Bank of Knowledge.   He borrows from the things that he knows, and uses them to construct an argument.

            The person who begins with the issue itself, on the other hand, eventually becomes a contributor to the Bank of Knowledge.   Forced to confront his own ignorance, he is forced to find ways to figure out the information that he is missing—ways to count things that haven’t been counted, or ways to estimate the parameters of things that are unknown.    Through this process, he winds up knowing things that were not known before.  

            This is essentially what we do:  We try to construct knowledge to fill in some of the spaces in our massive ignorance.   We are not people who know things.   We are people who are honest enough to admit that we don’t understand things, and frankly, we don’t believe that you understand them, either.  

            Because we have been involved in the effort to figure out the things that we don’t understand for a long time now, we have developed an inventory of a few hundred standard methods that can be used to analyze a new question.  

        .....

When our GM or one of his associates asks me a question, I almost never give him an immediate answer—nor does he expect me to.   If he wanted an immediate answer, he would ask an expert, or he would ask a scout.   What I always say, when asked anything, is “I’ll study the issue.”

            I shouldn’t say “never”.   Occasionally I’ll slip up and give an immediate answer.   A year ago, for example, Theo asked me “How confident are you that Dustin Pedroia will hit what your projections show that he will hit?”   Without thinking about it, I said “Absolutely 100%.”  I knew three things that caused that to slip out.   First, I knew that our hitting projections for minor leaguers have a very high degree of reliability—much higher than most people assume that they do.   Second, I knew that our projections for Dustin Pedroia were actually very conservative, that he was actually a better hitter than we were saying that he was.  Third, I knew that, even though Pedroia had hit .191 as a late-season callup in 2006, he had made contact with more than 90% of his swings.  If you swing as hard as he does and you make contact with every swing, how can you possibly fail to hit?

            That’s risk management, of course. . . .knowing how reliable the projections are.   But that’s also very much the exception to the rule, actually—the exception to two rules.   The first rule is:  We are not experts.   We are not people who “know” things about baseball players.   We are people who know how to study things about baseball players.   When Theo asks me a question about a player or about the game in general, he is not asking me for what I know.   He is asking me to study the issue. 

            And the other rule is, we are never certain.  I said that I was 100% certain that Pedroia would hit in the majors, but in reality, we are never 100% certain of anything.   The talk show hosts are 100% certain.  The sports columnists are 100% certain.   We are just doing the best we can.   Our methods are always flawed, and our answers are usually tentative and muddled.

            But the difference between knowledge and [baloney] is that knowledge moves forward, whereas [baloney] moves in circles.   When I develop a new method to analyze a baseball question, someone will point out the flaws in that method, and then someone else will suggest an entirely different method to approach the problem, and then somebody else will point out the flaws in that entirely different method, and then somebody else will figure out a way to combine the best features of both methods into one method that is better than either one.  We wind up with methods that get better over time. 

           ....

            We are no more statisticians than we are historians, or scouts, or accountants, or computer programmers.   I suspect that everything we do is much the same as what many of you do.  We look to the past, and we try to organize the things we have seen so that they make some sense.   We ask ourselves “how many of those were there?” and “how many of those others were there?” and “How many of them ended well?” and “How many of them ended badly?”, just as I would imagine most of you do.  

            Often people wonder if we will run out of things to study, and what I always say is that there will never be a shortage of ignorance.   I have realized recently that some people take this wrong, and that when I say that there will never be a shortage of ignorance people think that I am referring to some other group of people, some “ignorant” group of people, when in reality what I am talking about is myself.   I will never understand baseball; I will never understand 1% of what I need to understand.   My view of our work is that we are attacking a mountain range of ignorance with a spoon and a used toothbrush.   The things that we do not know are inexhaustible.  

- Bill James, Article 671 BJOL
