Care for Some Advanced SABR?
Sorry to clutter up yer front page with a bit of unpleasant bidness. :) The good Doc recommended that I put up a little stub post discussing my bid to set up some top-notch analytics for the new detectovision family of blogs (blighting the internet soon with an avalanche of commentary from a collection of the demented minds from our little den, including yours truly, once I'm ready). I have been discussing some of my plans for deep-level analysis here over recent weeks, so you'll mostly be familiar with my plan to adapt the Elo rating system to baseball match-ups and rate every player. That is just one example of the sort of sequential/contextual analysis I'd like to do, with the goal of building better predictive models (who knows, maybe I'll start beating Vegas, if I keep at it long enough - wink, wink) that enlighten us further about which pitcher/batter match-ups are likely to end in success or failure, which minor leaguers are likely to succeed at a higher level, which teams are better suited to post-season play, etc. Let me list it all out for ya in bullets:
- Elo Ratings
I believe that this is more useful than we currently realize for a number of crucial baseball decisions. I think Elo ratings computed based on match-ups have the power to tell us which teams are built for the post-season, which prospects are most likely to succeed at the next level, and which slumps and streaks may be influenced by the strength of competition faced.
- Weather Effects
Thus far, all of the analysis done on the impact of weather on baseball games has been linear and has not accounted for confounding contextual variables. I aim to change that.
- Pitch Sequencing and Pitchability
Some pitchers get more out of their stuff than others even if all of the data on the pitches is identical (location, speed, speed differential, movement, count). We know that Felix Hernandez is one of those pitchers from having watched him throw. I believe a sequential method, not a best-fit linear approach or an aggregate binned analysis, is the best way to discover which pitch sequences and pitchers are more effective and who has "pitchability" above and beyond the predictive metrics.
- Historical Comparisons and Similarity-based projections
This has long been a pet project concept of mine, but now I aim to get it done.
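For anyone who hasn't seen Elo up close, here is a minimal sketch of what a single batter/pitcher match-up update could look like. It uses the standard chess-style logistic formula, and everything specific to baseball in it is an assumption made for illustration: the function names, the 400-point scale, the K-factor of 4, and the idea that a plate appearance is scored as a plain win or loss for the batter. The actual adaptation may well look different (for one thing, batters "win" far less than half the time, so a real version would need some kind of baseline adjustment).

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Standard Elo expectation: chance that A beats B given their ratings."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def update_matchup(batter: float, pitcher: float, batter_won: bool,
                   k: float = 4.0) -> tuple[float, float]:
    """Return new (batter, pitcher) ratings after one plate appearance.

    Assumption: the outcome is binary (batter wins or pitcher wins) and
    k is a small, hypothetical K-factor, since a single plate appearance
    carries very little information on its own.
    """
    expected = expected_score(batter, pitcher)
    actual = 1.0 if batter_won else 0.0
    delta = k * (actual - expected)
    # Zero-sum update: whatever the batter gains, the pitcher loses.
    return batter + delta, pitcher - delta


if __name__ == "__main__":
    # Two evenly rated players; the batter wins the match-up.
    print(update_matchup(1500.0, 1500.0, batter_won=True))
```

The point of the zero-sum update is that beating a highly rated opponent moves your rating more than beating a weak one, which is what lets the ratings say something about strength of competition faced.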
Here's the catch - I'm not currently in possession of much capital. I hate to bug the room for an assist, but it would greatly expedite my work and your infotainment if you could send a few tips my way to fund the acquisition of a personal-use MATLAB license and a couple of their toolboxes. There are open-source ways to do the same work, and if it comes to it...I'll muddle through, but it'll take me a lot longer that way and delay your chance to see the results. :)
So if you're interested - go here: gofund.me/p6jgks and pledge your support. I'll try to make it worth your while. If you pledge $10, I'll comp you at my premium detectovision sub-blog for three months (I'll be charging $5 / month once I launch). If you pledge $20, I'll comp you for six months, and if you pledge $40, I'll comp you for a year. Along with the statistical analysis, I'll also be offering discussions of baseball players from history who may have been forgotten but should be remembered, insights from my time in a big league front office, and some game-to-game commentary from a more statistical perspective. I hope to keep the content well worth the viewing. If you're interested - I'd appreciate any help you can give!
Your geeky friend,
SABR Matt