I am trying to find out which batting instances contribute the most to creating runs. Right now I am just playing around to see what kind of correlation coefficient I get. The coefficient is meant to show how strongly an instance is tied to a higher ERA. I am using ERA because it is the best way I have to measure runs minus any that come from errors. I may change that method as time goes on and I play with this more.
Here is my very first correlation, using league-wide batting average per year and earned run average per year. Surprisingly, batting average only has a correlation of .621. I would typically want a correlation near .90 or higher before calling it strong.
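For anyone who wants to reproduce this kind of number outside a spreadsheet, here is a minimal sketch of the Pearson correlation coefficient in Python. I am assuming Pearson is the coefficient in play, since it is the usual default. The function name `pearson_r` is my own, and the sample numbers are made-up placeholders, not the real league data behind the .621 figure.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 values each")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of the deviations from each mean
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each list
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list never changes")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Placeholder values for illustration only, NOT real league-wide data
    batting_avg_by_year = [0.250, 0.255, 0.260, 0.265, 0.270]
    era_by_year = [3.80, 4.10, 3.95, 4.30, 4.25]
    print(round(pearson_r(batting_avg_by_year, era_by_year), 3))
```

The result always falls between -1 and 1, with values near 1 meaning the two yearly series rise and fall together.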
When you plot batting average and ERA on a graph with the data normalized, they do appear to move together, just not strongly enough.
The orange line is batting average and the blue line is ERA. Because I am new to Excel graphs and MSPaint features, this graph looks pretty anemic. I promise my skills will improve as I do this more often, but you can get the general idea here.
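Batting average and ERA live on very different scales, so they need to be normalized before they can share one chart. Here is a rough sketch of one common way to do that, min-max scaling, which squeezes each series into the range 0 to 1. This is an assumption about the method, since other approaches such as z-scores would work too, and the function name `min_max_normalize` is my own.

```python
def min_max_normalize(values):
    """Rescale a list of numbers so the smallest becomes 0 and the largest 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; nothing to scale")
    return [(v - lo) / (hi - lo) for v in values]


if __name__ == "__main__":
    # Placeholder values for illustration only, NOT real league-wide data
    era_by_year = [3.80, 4.10, 3.95, 4.30, 4.25]
    print(min_max_normalize(era_by_year))
```

Rescaling like this changes how the lines look on the chart but not the correlation coefficient, which is unaffected by shifting or stretching either series by a positive factor.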
After running this through several different statistical categories (batting average, OBA, OPS, slugging, etc.), my intent is to find out which instances have the highest correlations to earned runs, add a weighted number to each instance, and come up with a new stat! We will see how this goes. I am also taking ideas on a name for this stat, so please leave any suggestions via comments or email.
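To make the weighting idea a little more concrete, here is one possible sketch, not a finished formula. It assumes each category's weight is simply its correlation's share of the total, which is only one of many ways to do it. The function names and the OBA, OPS, and slugging correlations are hypothetical placeholders; only the .621 for batting average comes from the work above.

```python
def weights_from_correlations(correlations):
    """Turn {category: correlation} into weights that add up to 1."""
    total = sum(abs(r) for r in correlations.values())
    if total == 0:
        raise ValueError("every correlation is zero; cannot build weights")
    return {name: abs(r) / total for name, r in correlations.items()}


def weighted_stat(stat_values, weights):
    """Combine one season's category values into a single weighted number."""
    return sum(weights[name] * stat_values[name] for name in weights)


if __name__ == "__main__":
    correlations = {
        "batting_average": 0.621,  # the one figure measured so far
        "oba": 0.70,               # hypothetical placeholder
        "ops": 0.80,               # hypothetical placeholder
        "slugging": 0.75,          # hypothetical placeholder
    }
    weights = weights_from_correlations(correlations)
    for name, weight in weights.items():
        print(name, round(weight, 3))
```

Because the categories sit on different scales, the values passed to `weighted_stat` would probably need to be put on a common scale first.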