Julien's Faster than Light Blog

Monday, March 08, 2004
 
the theory
the baseball pod project
the algorithm
the simulator
the database

we don't have every pitch of every game, which makes it difficult to do research on pitch selection, wildness, etc. but we do have a record of every ball hit into fair territory. hence goeth our investigations.

what does that mean? here at the baseball pod project we identify three main skill areas, represented by our favorite stats wal, con, and pow. but these stats are merely approximations of our final model. none of them will actually be used. for now, we're stuck with wal and con, but pow is eminently break-down-able.

the sons of pow are hrp, xbp, and 1bp.

we thus have six statistical measures to sink our teeth into. the first thing to do is establish exactly what the representative sample sizes are. we have a good sense of them, having swum in the data, but it's time to be precise.

the next thing is park effects. yes, there are various park effects out there, but to us they are useless. we need to predict what every player will do in every park, i.e., how much each park affects each skill. then we can have park-neutral right-now profiles (pnrnp's). we'll be park-neutral right-now profile pimps!
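a rough sketch of what a per-skill park effect could look like, assuming the simplest possible approach: compare the rate of an event (say, home runs per ball in play) inside a park to the rate everywhere else, then divide a rate observed in that park by the ratio. the function names and the ratio method are illustrative assumptions, not the project's actual method.

```python
def park_factor(events_in, chances_in, events_out, chances_out):
    """Ratio of an event rate inside one park to the rate in all other parks.

    Assumption: a plain rate ratio is a good enough first cut. A factor
    above 1.0 means the park inflates the skill, below 1.0 means it
    suppresses it.
    """
    rate_in = events_in / chances_in
    rate_out = events_out / chances_out
    return rate_in / rate_out


def neutralize(rate_in_park, factor):
    """Scale a rate observed in a park back to a park-neutral rate."""
    return rate_in_park / factor


# hypothetical counts: 30 homers in 1000 chances here, 20 in 1000 elsewhere
hr_factor = park_factor(30, 1000, 20, 1000)
neutral_hrp = neutralize(0.03, hr_factor)
```

one factor per skill per park would be needed, and each factor is only as trustworthy as the sample behind it.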

rnp's are still a long way off. after park effects, we want to find the peak age for each skill. and we want to see how it differs from player to player, for prediction purposes. the theory that a player peaks at 28 is mostly true, but some players improve their skills well into their 30s. breaking things down will give us more information.

then we perfect the current system. we use hrp to predict xbp. there will be a formula. then, every player can be examined to see how much above or below the predicted value he is. this analysis will help us deal with less-than-optimal sample sizes. there will be a new stat: extra-base percentage above expected (xae).
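the formula doesn't exist yet, so here is only a sketch of the idea, assuming it turns out to be a straight line fitted by ordinary least squares. xae is then just each player's actual xbp minus the xbp the line predicts from his hrp. the function names and the sample numbers are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def xae(hrp, xbp, slope, intercept):
    """Extra-base percentage above expected: actual minus predicted xbp."""
    return xbp - (slope * hrp + intercept)


# hypothetical (hrp, xbp) pairs for a handful of players
hrps = [0.02, 0.03, 0.05, 0.06]
xbps = [0.06, 0.07, 0.08, 0.10]
slope, intercept = fit_line(hrps, xbps)

# positive means more extra-base hits than his home run rate suggests
player_xae = xae(0.05, 0.08, slope, intercept)
```

a linear fit is the assumption here. if the real relationship curves, only fit_line needs to change, and xae stays a residual.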

similarly, we can use hrp and xae to predict 1bp. single percentage above expected (1ae) will be a good measure of speed to first.
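the same idea with two predictors, again only as a sketch: assume 1bp is modeled as a linear function of hrp and xae, fit it by least squares, and take 1ae as the leftover. the names and the closed-form two-variable solution below are illustrative choices, not a settled method.

```python
def fit_two_predictors(x1s, x2s, ys):
    """Least squares fit of y = b0 + b1 * x1 + b2 * x2.

    Solves the 2x2 normal equations on centered data. Assumes x1 and x2
    are not perfectly collinear (otherwise det is zero).
    """
    n = len(ys)
    m1 = sum(x1s) / n
    m2 = sum(x2s) / n
    my = sum(ys) / n
    s11 = sum((a - m1) ** 2 for a in x1s)
    s22 = sum((b - m2) ** 2 for b in x2s)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1s, x2s))
    s1y = sum((a - m1) * (y - my) for a, y in zip(x1s, ys))
    s2y = sum((b - m2) * (y - my) for b, y in zip(x2s, ys))
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2


def one_ae(hrp, xae, one_bp, coeffs):
    """Single percentage above expected: actual minus predicted 1bp."""
    b0, b1, b2 = coeffs
    return one_bp - (b0 + b1 * hrp + b2 * xae)
```

a player with a large positive 1ae is getting more singles than his power profile predicts, which is the sense in which it could stand in for speed to first.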

is that enough for you, adam?

once we get parks, peaks, and sample sizes nailed down, we will be well on our way toward predicting the pretty career paths that are so many smooth bell curves in our mind's eye.

in reality, of course, there are gorges, chasms, and cliffs.

what a bummer. to end on an optimistic note, let us remember the occasional player who begins as a mortal, then sails off into the heavens, taking his place among the gods.
