BuzzerBeater Forums

Skill cap testing

This Post:
22
155261.43 in reply to 155261.33
Date: 9/4/2010 4:19:36 AM
Overall Posts Rated:
4040
I think the idea is that managers who are active and search for information will have something of an advantage over those who just play the game as it is. If you read the rules carefully, it is evident that they are incomplete on purpose.

This Post:
00
155261.44 in reply to 155261.4
Date: 9/5/2010 4:33:45 AM
Overall Posts Rated:
225225
So far, with 20 data points, I have a model that fits very well, with an r-squared of almost .99. It gives the following very weird results:

-The BB defined position does not seem to matter for capping purposes.
-The model works much better without an intercept (.99 vs .46 r-squared).
-Jump shot and rebounding matter a lot for skill capping. The rest of the skills have some contribution, but are pretty minor in comparison. Some skills are even so minor they are hardly worth mentioning.

If you just toss the position on the right-hand side of the equation, and with so few data points, it's pretty much a foregone conclusion that the variable will be meaningless. If you would really like to investigate positions, run separate regressions for each position when you get enough data points.

I don't see how having a better fit without a constant is weird. This simply states that a player that has capped at zero skills has zero potential (that's assuming that you keep the potential level on the LHS).
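One thing worth checking before reading much into the .99 versus .46 gap: many statistics packages compute r-squared for a no-intercept model against the uncentered total sum of squares (the sum of y squared rather than the squared deviations from the mean), so the two figures are not directly comparable. Whether that happened here is an assumption, since the thread does not say which software was used. A minimal Python sketch with made-up numbers (not data from the study):

```python
# Hypothetical data: x = some skill total, y = potential level.
x = [60, 62, 65, 70, 72, 75, 80, 85]
y = [6, 5, 7, 6, 8, 7, 6, 8]
n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n

# Fit WITH an intercept (ordinary simple regression).
slope = (sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
         / sum((a - x_mean) ** 2 for a in x))
intercept = y_mean - slope * x_mean
ssr_with = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))

# Fit WITHOUT an intercept (line forced through the origin).
slope0 = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
ssr_without = sum((b - slope0 * a) ** 2 for a, b in zip(x, y))

sst_centered = sum((b - y_mean) ** 2 for b in y)  # usual total sum of squares
sst_uncentered = sum(b * b for b in y)            # often used when there is no intercept

r2_with = 1 - ssr_with / sst_centered
r2_without_uncentered = 1 - ssr_without / sst_uncentered
r2_without_centered = 1 - ssr_without / sst_centered

print(f"with intercept:               {r2_with:.3f}")
print(f"no intercept, uncentered SST: {r2_without_uncentered:.3f}")
print(f"no intercept, centered SST:   {r2_without_centered:.3f}")
```

On the same centered scale, the no-intercept fit can never beat the fit with an intercept, so a much higher r-squared without a constant says more about the formula than about the model.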

If potential capping is in any way similar to salary, then there is no wonder that some skills return insignificant coefficients. Skill weights are different per position, so you're getting significant coefficients for the position that occurs most often in your small sample.

There is a lot to explore in this, but a lot more data is needed. I'd say 50-100 datapoints per position, in order to be able to read anything from the results.

"I don't know half of you half as well as I should like; and I like less than half of you half as well as you deserve."
This Post:
00
155261.45 in reply to 155261.44
Date: 9/5/2010 2:00:35 PM
Overall Posts Rated:
155155

If you just toss the position on the right-hand side of the equation, and with so few data points, it's pretty much a foregone conclusion that the variable will be meaningless. If you would really like to investigate positions, run separate regressions for each position when you get enough data points.


I did this and it does generate different results. But it still does not prove anything. If there are enough data points by position, the class variable should be significant (ie: add something to the model) or not. Just getting different coefficients by position proves nothing, especially with so little data. I could take a number of variables, significant or not, out of the model (inside d, shot blocking, etc) and it would still change the coefficients.
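Whether position "adds something to the model" is usually checked with a nested-model F-test: fit the model with and without the position indicators and compare residual sums of squares. The sketch below is only an illustration under stated assumptions: the ten rows are invented, a single hypothetical skill total stands in for the full skill list, and `ols_ssr` is a helper written for this example, not part of any library.

```python
# Hypothetical rows: (skill total, position, potential). Not real player data.
rows = [(60, "PG", 5), (70, "PG", 6), (62, "SG", 5), (75, "SG", 7),
        (65, "SF", 6), (80, "SF", 8), (68, "PF", 7), (82, "PF", 8),
        (72, "C", 8), (85, "C", 9)]
POSITIONS = ["PG", "SG", "SF", "PF", "C"]  # first entry is the base category

def ols_ssr(X, y):
    """Residual sum of squares of a least-squares fit of y on the columns of X."""
    p = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(X, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        piv = max(range(col, p), key=lambda k: abs(A[k][col]))
        A[col], A[piv] = A[piv], A[col]
        for k in range(p):
            if k != col:
                f = A[k][col] / A[col][col]
                A[k] = [a - f * b for a, b in zip(A[k], A[col])]
    beta = [A[i][p] / A[i][i] for i in range(p)]
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(X, y))

y = [r[2] for r in rows]
X_restricted = [[1.0, r[0]] for r in rows]  # intercept + skill
X_full = [[1.0, r[0]] + [1.0 if r[1] == p else 0.0 for p in POSITIONS[1:]]
          for r in rows]                    # ... + 4 position dummies

ssr_restricted = ols_ssr(X_restricted, y)
ssr_full = ols_ssr(X_full, y)
q = len(X_full[0]) - len(X_restricted[0])  # number of added terms
df_resid = len(rows) - len(X_full[0])      # residual degrees of freedom
f_stat = ((ssr_restricted - ssr_full) / q) / (ssr_full / df_resid)
print(f"F({q}, {df_resid}) = {f_stat:.2f}")
```

The statistic is then compared against an F distribution with (q, df_resid) degrees of freedom. With this few rows the test has very little power, which is the same data limitation raised above.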


Skill weights are different per position, so you're getting significant coefficients for the position that occurs most often in your small sample.


Maybe, I don't know. But suffice it to say that I will test everything, although as you said the data is rather limiting at the moment.


There is a lot to explore in this, but a lot more data is needed. I'd say 50-100 datapoints per position, in order to be able to read anything from the results.


Even 50 datapoints per position would imply at least 250 observations across the five positions. So even if true, it seems like it will be a long time before this study finishes. There was a lot of excitement when the study started, but now the data is barely trickling in.

So, your concern is noted. And while I appreciate that you have a good deal of knowledge with numbers, what you could help more with is a solution to the possible issue of sub-levels on the cap. For example: how to prove they exist (or not) and if they do, how to model appropriately.

Run of the Mill Canadian Manager
This Post:
22
155261.46 in reply to 155261.45
Date: 9/5/2010 3:37:54 PM
Overall Posts Rated:
225225
So, your concern is noted. And while I appreciate that you have a good deal of knowledge with numbers, what you could help more with is a solution to the possible issue of sub-levels on the cap. For example: how to prove they exist (or not) and if they do, how to model appropriately.

The trick in figuring out possible potential sublevels is making sure you don't confuse them with skill sublevels. My best guess would be to use the "current" salary level (or DMI in proficient game shape) as a regressor to control for unobserved sublevels in skills. If the per-position regressions produce significant differences between predicted and observed values of potential (or high error terms), this indicates differences within potential levels.

The problem with this approach is a certain collinearity on the right-hand side, since salary is also a product of skill levels. This may be addressed by running a regression using just the salary levels (this will test the assumption that the potential "value" and salary are linearly proportional).

This is mostly brainstorming, and I am sure there are at least a handful of users here who know how to work data, and can toss in their own ideas (Coco?).

All in all, it's not that I have concerns, per se. I do think that it is great that someone is willing to do this type of "reverse-engineering", since it's long overdue. I don't mean to come through as condescending, just trying to provide some pointers about data work, since I do have some experience there.

"I don't know half of you half as well as I should like; and I like less than half of you half as well as you deserve."
This Post:
00
155261.47 in reply to 155261.46
Date: 9/5/2010 8:14:09 PM
Overall Posts Rated:
155155
Actually, thanks to your message I was thinking about it today and realized that you are indeed right. You are right that one variable by position is not the right approach. I actually need 5 indicator variables (one for each position). Or I could just do one model per position, but that leaves no way to test for the significance of position.

As for regressing using current salary as a regressor, this again leaves potential as a y-variable. And in that case, if there are sub-levels on potential (ie: error), it still leads to a biased model.

Run of the Mill Canadian Manager
This Post:
22
155261.49 in reply to 155261.47
Date: 9/6/2010 2:48:03 AM
Overall Posts Rated:
225225
Actually, thanks to your message I was thinking about it today and realized that you are indeed right. You are right that one variable by position is not the right approach. I actually need 5 indicator variables (one for each position). Or I could just do one model per position, but that leaves no way to test for the significance of position.

The standard practice for categorical variables is to include (n-1) dummies. In this case, we'd need 4 variables (think of them as describing the "offset" from a base category).

This is only appropriate when you believe the position difference is a fixed effect, and the coefficients of the other variables are identical across positions. If you have reason to believe that your categorical variable also affects the coefficients of other variables, you need to add interactions of all your RHS variables with all the categorical dummies.

You will still need a lot of observations, because the model now has a lot of variables (12 skill + 36 interactions + 4 dummies). At least 100 datapoints would be advisable.
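To make the dummy and interaction coding concrete, here is a small Python sketch of how one row of the design matrix could be built. It assumes the five standard basketball positions with PG as an arbitrarily chosen base category, and it uses three placeholder skill values rather than the full skill list; `design_row` is a hypothetical helper, not something from the study.

```python
POSITIONS = ["PG", "SG", "SF", "PF", "C"]
BASE = "PG"  # arbitrary base category; its dummy is left out

def design_row(skills, position, interactions=True):
    """Return [intercept, skills..., n-1 dummies..., skill*dummy interactions...]."""
    # One 0/1 indicator per non-base position (the "offset" from the base).
    dummies = [1.0 if position == p else 0.0 for p in POSITIONS if p != BASE]
    row = [1.0] + list(skills) + dummies
    if interactions:
        # Every skill times every dummy lets each position have its own slopes.
        row += [s * d for d in dummies for s in skills]
    return row

# Placeholder skill levels for one player (hypothetical values).
pg_row = design_row([7, 5, 9], "PG")
sf_row = design_row([7, 5, 9], "SF")
print(len(sf_row), sf_row)
```

A base-category player gets zeros in every dummy and interaction column, so the plain skill coefficients describe that position and the interaction coefficients describe how the other positions differ from it.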

As for regressing using current salary as a regressor, this again leaves potential as a y-variable. And in that case, if there are sub-levels on potential (ie: error), it still leads to a biased model.

You can try to flip the model, and regress salary on potential. If there is unexplained variance, this will indicate a sublevel in the potential -- though to me the results are pretty much a foregone conclusion, given that one can see that players with the same potential may cap at different salaries without running any parametric tests.
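A rough Python sketch of the flipped model, using invented (potential, salary at cap) pairs rather than anything from the study: fit salary on potential, then look at the share of variance left unexplained and at the salary spread within each potential level.

```python
from collections import defaultdict

# Hypothetical (potential level, salary at cap) pairs; not real player data.
capped = [(6, 21000), (6, 26000), (6, 30000), (7, 38000),
          (7, 47000), (8, 60000), (8, 75000), (8, 68000)]

n = len(capped)
p_mean = sum(p for p, _ in capped) / n
s_mean = sum(s for _, s in capped) / n

# Simple regression of salary on potential.
slope = (sum((p - p_mean) * (s - s_mean) for p, s in capped)
         / sum((p - p_mean) ** 2 for p, _ in capped))
intercept = s_mean - slope * p_mean

ssr = sum((s - intercept - slope * p) ** 2 for p, s in capped)
sst = sum((s - s_mean) ** 2 for _, s in capped)
unexplained_share = ssr / sst  # equals 1 - r-squared

# Salary range among players who share the same potential level.
by_level = defaultdict(list)
for p, s in capped:
    by_level[p].append(s)
spread = {p: max(v) - min(v) for p, v in sorted(by_level.items())}

print(f"unexplained share of salary variance: {unexplained_share:.3f}")
print("salary range within each potential level:", spread)
```

A wide salary range inside one potential level is consistent with sublevels, but it could equally reflect position-dependent salary weights, so this is a first look rather than a proof.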

"I don't know half of you half as well as I should like; and I like less than half of you half as well as you deserve."
This Post:
11
155261.50 in reply to 155261.48
Date: 9/6/2010 2:50:41 AM
Overall Posts Rated:
225225
I don't think letting people know when a player reaches his potential cap puts any aspect of the game in easy mode. Again, all it does is save people the weeks of training lost when the soft cap is reached unknowingly.

The problem is that potential is a soft cap and thus isn't reached per se, but training simply slows down as a player approaches his cap. Training never stops completely, either.

Plus, saving yourself the training loss is part of the skills necessary to play the game well.

Last edited by GM-kozlodoev at 9/6/2010 2:52:49 AM

"I don't know half of you half as well as I should like; and I like less than half of you half as well as you deserve."
This Post:
00
155261.51 in reply to 155261.49
Date: 9/6/2010 2:14:38 PM
Overall Posts Rated:
155155
Great suggestions. Can we plan to chat some more once I have more data? I can tell you some more of my observations and you can give more suggestions. Also, if you think of anything else do not hesitate.

Run of the Mill Canadian Manager
This Post:
33
155261.52 in reply to 155261.51
Date: 9/6/2010 9:31:30 PM
Overall Posts Rated:
155155
I just want to also make the comment (if I have not already, I can't remember) that any data sent to me will be kept strictly confidential. Even if I get help from others on this study, I will only discuss data at the macro level and I will not release skills from individual players. And hopefully it also goes without saying that I will not discuss individual players in any public forum.

Run of the Mill Canadian Manager