The other day, I got a strange, ominous feeling. Either I had left the stove on, or something was up with our beloved baseball team. A thorough investigation of my kitchen turned up no appliances left on, meaning it was time to figure out what was bothering me about the Tribe.
Sports are exciting because you never know what is going to happen. How many runs are we going to score today? It is impossible to tell. Lately, though, I've gotten the strange feeling that I know less and less about how many runs to expect from us. I mean, I know our pitching gave up back-to-back 13-run games to the Orioles, but that felt like an outlier. It is the offense that has felt completely unpredictable from day to day.
Enter statistics, the medicine for the madness I am feeling. It was time to figure out whether there is anything abnormal about the variance in the runs we score from day to day.
I embarked on a little project to create some glorious spreadsheets to measure the variance in the runs we score. I know you are all smart people, so I won't waste your time explaining these things. I mean it would be a little demeaning to your.....
THAT'S YOUR #1 RANKED FINGER GUN?!?! ARE YOU BLIND?!?! HE'S FIRING SOFTLY INTO THE AIR. CAN YOU EVEN CALL THOSE PEWS?! THEY LOOK AT BEST LIKE A SOFT SIZZLE...
God dammit, just....shut up and let me tell you about the spreadsheets
I decided it was best to start with 2019. I hopped onto FanGraphs, pulled the scores from each matchup, and plugged them in. To tell whether we are less consistent than other teams in our runs scored, I needed a league average, so I added every AL team and its runs scored as well. Spice the thing up with a few graphs, pull some numbers, crunch some things, and BOOM! SPREADSHEET (each sheet from 2016-2019 is listed at the bottom)
I probably should have warned you before you opened the spreadsheet. It isn't as comforting as I had hoped it would be. For anybody who wants a breakdown of what is in the spreadsheet and what I measured, here it is:
After compiling the runs scored by each team, I took your normal stats like average, mode, and median: your general smorgasbord of elementary statistics. Then I inserted a scatter plot on each team's individual page. The Indians' scatter plot for 2019* looks like this:
In order to make all the graphs fit neatly without greatly compromising the statistics, I took any game where a team scored over 18 runs and just turned it into 18 runs. Trust me, it looks prettier. I added trend lines, just to give an idea of what the averages are (a fun fact: of all the scatter plots I added and looked at, this is the only one I could find with this steep an upward trend).
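If you'd rather do that capping in code than by hand in a spreadsheet, here's a minimal Python sketch. The function name `cap_runs` and the sample scores are made up for illustration; none of it comes from the actual spreadsheets.

```python
def cap_runs(runs, cap=18):
    """Return a copy of the scores with anything above `cap` replaced by `cap`."""
    return [min(r, cap) for r in runs]

# Made-up scores for illustration only, not real game data.
sample_scores = [3, 0, 21, 7, 18, 4]
capped_scores = cap_runs(sample_scores)
print(capped_scores)  # the 21 becomes 18; everything else is untouched
```

Keep in mind that capping nudges the average and the spread down a hair, so it's probably best to cap only the copy of the data that feeds the graphs.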
After looking through the scatter plots, I realized that while they were generally helpful, they didn't numerically quantify how much variance there was in the number of runs scored. So, I compiled all the teams' data onto a master sheet and took the standard deviation for each team. For anybody unfamiliar with standard deviation, it is basically a measurement of the typical distance between a data point (how many runs were scored in one particular game) and the average of the data set (the average number of runs we score across all our games).
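For anyone who wants to see that calculation outside of a spreadsheet, here's a small sketch using Python's built-in `statistics` module. The scores are made up so the math comes out clean, and I'm assuming the population version of the formula (`pstdev`); a spreadsheet's standard deviation function may use the sample version instead, which gives a slightly larger number.

```python
import statistics

# Made-up scores for illustration only, not real game data.
sample_scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean_runs = statistics.mean(sample_scores)

# Population standard deviation: the square root of the average
# squared distance between each game's score and the mean.
spread = statistics.pstdev(sample_scores)

print(mean_runs, spread)
```

With these numbers the average is 5 runs and the standard deviation is 2, so a "typical" game lands about 2 runs away from the average.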
I wasn't done yet. The standard deviation tends to be smaller for teams with a lower average, simply because if you are only scoring between 0-4 runs a game, there's going to be less variance than if you are scoring between 0-8 runs a game. To get a sense of how much variance there is in a team's scoring while accounting for how much it scores, I created a modifier by taking the square root of the ratio of a team's average runs scored to the league average. I'm about 80% certain of my math here, so feel free to correct me if I'm way off.
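Here's a rough sketch of that modifier in Python. The modifier itself (the square root of team average over league average) is as described above, but exactly how it gets combined with the standard deviation is my assumption: the sketch divides the standard deviation by the modifier, so high-scoring teams get their spread scaled down and low-scoring teams get theirs scaled up. The function names and all the numbers are hypothetical. One hedged reason a square root is a reasonable choice: if runs behaved like a simple count process (Poisson), the standard deviation would grow with the square root of the average.

```python
import math

def scoring_modifier(team_avg, league_avg):
    """Square root of the ratio of a team's average runs to the league average."""
    return math.sqrt(team_avg / league_avg)

def adjusted_spread(team_sd, team_avg, league_avg):
    # ASSUMPTION: the modifier is applied by division. The write-up
    # does not spell out how it is combined with the standard deviation.
    return team_sd / scoring_modifier(team_avg, league_avg)

# Made-up numbers for illustration only, not real team data.
league_avg = 4.0
average_team = adjusted_spread(team_sd=3.0, team_avg=4.0, league_avg=league_avg)
high_scoring_team = adjusted_spread(team_sd=3.75, team_avg=6.25, league_avg=league_avg)
print(average_team, high_scoring_team)
```

In this toy example the high-scoring team has the bigger raw standard deviation, but after the adjustment both teams come out with the same number, which is the whole point of the modifier.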
Taking the standard deviation with the modifier, our team ranks 13th out of 15 for consistency in the number of runs we score. In order to better visualize that consistency, I made more graphs. For standard deviation measurements, it is common to look at bell curves. I created asymmetrical bell curves for this by mapping the percentage of games that end with each number of runs scored (e.g., across the league this year, 5.06% of games have the offense scoring no runs, 8.85% scoring 1 run, etc.).
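The percentages behind those graphs are easy to rebuild in code. Here's a minimal Python sketch; `run_distribution` is a name I made up, and the scores are invented for illustration rather than pulled from the spreadsheets.

```python
from collections import Counter

def run_distribution(runs):
    """Map each run total (0 up to the max seen) to the percentage of games with that total."""
    counts = Counter(runs)
    total_games = len(runs)
    # Counter returns 0 for totals that never occurred, so gaps show up as 0.0%.
    return {r: 100 * counts[r] / total_games for r in range(max(runs) + 1)}

# Made-up scores for illustration only, not real game data.
sample_scores = [0, 1, 1, 4, 4, 4, 5, 7, 2, 2]
dist = run_distribution(sample_scores)

# Share of games with 2 runs or fewer.
low_scoring_share = dist[0] + dist[1] + dist[2]
print(dist)
print(low_scoring_share)
```

Plot the keys on the x-axis and the percentages on the y-axis and you get the lopsided hump. The same table also answers questions like "how often does a team score 2 runs or fewer?" by adding up the first few entries.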
For runs scored, we get an asymmetrical graph, with the league-average hump at 4 runs. Teams with higher scoring averages (MIN, NYY) should have more of their graph piled up to the right of that hump. Teams with lower averages should have more to the left. Here's Cleveland's graph for 2019:
Here's the Twins:
Here's the Tigers:
Why does this all matter? Honestly, it doesn't (as far as I know). You heard me, I spent all this time talking about spreadsheets that don't matter. If you average 4 runs per game but always score either 1 run or 7 runs, you should win the same number of games as somebody who always scores 4 runs (with a large enough sample size). That's why we look at run differential and average runs scored. How inconsistent our offense is really doesn't matter, because with a large enough sample size we'll win the same number of games by inconsistently scoring 2 or 8 runs as we would by consistently scoring 5.
But as a fan, I find this annoying. On any given night, you don't know if you're going to get a team full of Michael Martinezes. In 30% of our games this year, we've scored 2 runs or fewer. 30%!!!!!! For reference, that only happens to the Twins about 15% of the time. We're rarely actually scoring 3 or 4 runs, which is baffling. According to previous trends, we're as likely to score 0 or 1 runs as we are to score 7 or 8. And I know some of this might be because our lineup in the first half was much like the lineup from the first half of Major League, but even now it feels like we're either going to pound the snot out of a team or go out with a whimper.
I don't know why this is happening, and it doesn't mean much without comparing it to our pitching trends (maybe I'll run the numbers on that some other time). Still, it's nice to know that my feeling about the Tribe being inconsistent is correct. If you've felt that we're a different team offensively on any given night, you're not alone, and there are some numbers to back up that case.
*2019 runs scored include games up through Wednesday, August 12. I'll probably update them again at the end of the season, but for now I'm too lazy to make this something I update consistently.
AL Runs Scored Breakdown Spreadsheets (2016-2019):