Hi everyone
As you know, I recently performed the Draft Box Score analysis. For a summary and discussion of what I did, see this thread (256102.1).
The purpose was essentially to find out whether there is a correlation between a draftee's box score and their final Player stats, regardless of how good or bad they were.
A big big big thanks to everyone who contributed. This was a group effort from a lot of people in the Australian community, and I truly thank you for giving me something to do over the last month haha
Also, I've got a raw set of data that I'm going to publish somewhere so everyone else can play with it. I'll let you know where it ends up. Also note that at this stage I'm not taking in any more data. I'm finished with this activity for this season.
So I've got some results. The format of the results will be as follows:
1) A bit of a blog of all the successes/problems that I encountered, and why some data was excluded/included.
2) Initial counts, box score averages, player stat averages, etc. Basically just some basic stats that don't delve into correlation.
3) A simple Salary-Star correlation result.
4) Then I will present a matrix of the box score results vs Player stats results, as a Correlation Coefficient matrix (for those of you who don't understand Correlation Coefficient, back to year 10 maths for you!)
Essentially, things that have a high positive correlation have a value close to 1.0, and those that have no correlation tend to be close to 0 (a strong negative correlation would sit close to -1.0).
For example - The correlation between Kim Kardashian and men reacting with 'hooly dooly' is roughly a value of 0.9
Conversely, the correlation between Tony Abbott's Liberal government and sound economic policy is around 0.1.
5) There is a further note about the correlation values, which I've made on the correlation matrix page.
6) Then I do a bit of interpretation. This is simply the things I've noticed in the data and, really, they are discussion starters.
7) Then I flattened the box score data by converting it into box score per 36 min values (ie: divide the value by minutes played, then multiply by 36). That way we get more representative data, which should help the analysis give more accurate results.
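For anyone who wants to see the arithmetic behind points 4 and 7, here is a rough Python sketch. It is only an illustration and not necessarily how the numbers in this thread were produced: the function names (per_36, pearson) and the sample figures are made up, and it assumes the correlation coefficient is the standard Pearson one.

```python
from math import sqrt
from statistics import mean


def per_36(value, minutes_played):
    """Scale a raw box score value to a per-36-minute rate."""
    if minutes_played <= 0:
        raise ValueError("minutes played must be positive")
    # Same as "divide by minutes played, times by 36".
    return value * 36 / minutes_played


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2 items)")
    mean_x, mean_y = mean(xs), mean(ys)
    # Sum of products of deviations from the mean (covariance numerator).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the summed squared deviations for each list.
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / (spread_x * spread_y)


if __name__ == "__main__":
    # Made-up sample: (rebounds in the draft game, minutes played, final rebounding stat)
    sample_draftees = [(4, 12, 6), (10, 30, 9), (7, 18, 11), (2, 24, 3)]
    rebounds_per_36 = [per_36(reb, mins) for reb, mins, _ in sample_draftees]
    final_rebounding = [stat for _, _, stat in sample_draftees]
    print("per-36 rebounds:", rebounds_per_36)
    print("correlation:", pearson(rebounds_per_36, final_rebounding))
```

As a quick check, per_36(10, 24) gives 15.0: a draftee who had 10 rebounds in 24 minutes is treated as a 15-rebounds-per-36-minutes player. Running pearson on every box score column against every Player stat column is what fills in a correlation matrix.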
Anyways, that’s it! Hope you all get something out of this. I've tried to be as 'neutral' as possible with my presentation. The correlation scores are the most 'interpretive' parts of this analysis, and everyone should really make up their own mind about them.
Please bear with me as I complete this thread. There will be quite a few posts coming in the next hour or so.