I’m in a predictive modeling class for my grad program at NU, and we are learning a statistical programming language called SAS. One of the things we are trying early on is cluster analysis, which groups similar observations together. I decided to play around with data that’s a little more interesting than housing prices. Charlie Morton has been one of my favorite pitchers to watch. His curveball is just sexy. Cluster analysis can help us separate Morton’s pitches into different pitch types using PitchFX data I’ve been scraping.
I’ve plotted two charts: the first is vertical movement vs. release speed, and the second is vertical movement vs. horizontal movement. [The movement parameters are calculated from the deviation of the ball from the straight path it would take with no spin. The horizontal movement is from the perspective of the catcher/batter, so imagine that Morton is throwing toward you.] Fastballs with backspin will have positive vertical movement, and curveballs with topspin will have negative vertical movement. I used SAS to look at the speed, vertical movement, and horizontal movement and cluster similar pitches together. Without much tweaking, I was able to identify Morton’s fastballs and curveballs. He also has a third group, which is a splitter according to brooksbaseball.net.
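If you want to try the same idea without SAS, here is a minimal sketch in Python. It is not the SAS code from class, just a plain k-means written with the standard library. The nine pitches are made-up numbers chosen to look roughly like three pitch types (they are not Morton’s real PitchFX data), and the function names `standardize`, `farthest_point_init`, and `kmeans` are my own for illustration. Each column is standardized first, on the assumption that you don’t want mph to outweigh inches of movement.

```python
import math
import statistics

# Made-up illustrative pitches, NOT real PitchFX data:
# (release speed in mph, horizontal movement in inches, vertical movement in inches)
pitches = [
    (93.0, -6.0, 8.0), (92.0, -7.0, 7.0), (94.0, -5.0, 9.0),    # fastball-like
    (79.0, 6.0, -8.0), (78.0, 7.0, -7.0), (80.0, 5.0, -9.0),    # curveball-like
    (85.0, -5.0, 2.0), (84.0, -4.0, 1.0), (86.0, -6.0, 3.0),    # splitter-like
]

def standardize(rows):
    """Z-score each column so speed and movement are on a comparable scale."""
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    stdevs = [statistics.pstdev(col) for col in columns]
    return [tuple((v - m) / s for v, m, s in zip(row, means, stdevs))
            for row in rows]

def farthest_point_init(points, k):
    """Deterministic seeding: start at the first point, then keep adding
    the point that is farthest from every centroid chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids

def kmeans(points, k, max_iter=100):
    """Plain k-means: assign each point to its nearest centroid, move each
    centroid to the mean of its members, stop when assignments settle."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda i: math.dist(p, centroids[i]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # leave an empty cluster's centroid where it was
                centroids[i] = tuple(statistics.mean(col) for col in zip(*members))
    return labels, centroids

labels, centroids = kmeans(standardize(pitches), k=3)
for pitch, label in zip(pitches, labels):
    print(label, pitch)
```

The cluster numbers themselves are arbitrary; what matters is which pitches end up grouped together. With real data the groups overlap much more than in this toy example, so expect to experiment with the number of clusters.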
Morton is famous for his sinker, which is a two-seam fastball that ‘sinks’ relative to a four-seam fastball thrown at the same angle. I’ve annotated the sinker on the vertical movement vs. release speed chart below. Morton’s sinker is hard to differentiate because it’s almost as fast as his four-seamer (low 90s). It doesn’t stay as high because its spin is different from the four-seam fastball’s. The advantage here is that a batter will swing as if to hit the four-seam fastball, but the sinker will be an inch or two lower than what the batter adjusted for. The bat catches the top half of the ball, and since the bat is round, the ball will come off at a low angle, and bam! Ground ball.
Brooksbaseball.net presents current and historical PitchFX data very nicely. I suggest checking them out if you want to see visualizations like this for other games or pitchers. Their visualization tools are easy to use and are updated right after games end.