Posers and Puzzles
28 May 06
Originally posted by xcomradex
nah you're right, that's where i was going with post #2. but we are short of some information.
But, from the player tables we know there are 11 667 players at rhp, so the median player is around position 5834. i can't be bothered sorting through the tables, but i know a player ranked 5794th, which is close enough. he has a rank of 1274, so the median rank ...[text shortened]... y selected games on rhp, around 730 of them will be against people with the same ranking. qed.
By this calculation there should be around 8.5 active players with a rating of 1570 (11667*0.073*0.01). Actually there are 5 as I'm writing this.
Also note this is a much more useful figure as it doesn't talk about playing games so it removes all discussion of bias due to game selection.
Question: How unlikely is this result (5 if the average is 8.5)?
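One rough way to put a number on the question above is to assume the count of players at a single rating behaves like a Poisson variable with mean 8.5. That is an assumption for illustration only; nobody in the thread has shown that the counts are Poisson. The function names are made up for this sketch.

```python
import math

def poisson_pmf(k: int, mean: float) -> float:
    """Probability of seeing exactly k events when the expected count is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

def poisson_cdf(k: int, mean: float) -> float:
    """Probability of seeing k or fewer events."""
    return sum(poisson_pmf(i, mean) for i in range(k + 1))

# Expected count from the post: 11667 players * 0.073 * 0.01
expected = 11667 * 0.073 * 0.01
observed = 5

p_exactly = poisson_pmf(observed, expected)   # exactly 5 players
p_at_most = poisson_cdf(observed, expected)   # 5 or fewer players

print(f"expected count: {expected:.2f}")
print(f"P(exactly {observed}): {p_exactly:.3f}")
print(f"P({observed} or fewer): {p_at_most:.3f}")
```

Under that assumption, seeing 5 or fewer would happen roughly one time in seven, so the result is on the low side but not rare.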
Originally posted by XanthosNZ
By this calculation there should be around 8.5 active players with a rating of 1570 (11667*0.073*0.01). Actually there are 5 as I'm writing this.
Also note this is a much more useful figure as it doesn't talk about playing games so it removes all discussion of bias due to game selection.
Question: How unlikely is this result (5 if the average is 8.5)?
8.5 vs 5, that seems to show a pretty good estimate. as per your question, we'd need to know the standard deviation (sd), i.e. how often are the predicted numbers of players found overall, and how much does the answer differ by when the prediction is wrong. but just using an estimate, sd=3, you'd expect to see 8.5+/-3 players around 68% of the time. going with a looser sd, sd=6, gives us 8.5+/-3 around 38% of the time. going with a tighter sd, sd=1.5, we'd expect to see 8.5+/-3 around 95% of the time. but the answer is only as good as the assumptions you make.
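The 68% / 38% / 95% figures for the three guessed sd values can be checked with the normal distribution, assuming (as that post does) that the count is roughly normal around 8.5. The helper name `normal_coverage` is made up for this sketch.

```python
import math

def normal_coverage(half_width: float, sd: float) -> float:
    """Probability that a normal variable lands within +/- half_width of its mean."""
    return math.erf(half_width / (sd * math.sqrt(2)))

half_width = 3.0  # the "+/-3" window around 8.5 used in the post

for sd in (3.0, 6.0, 1.5):
    coverage = normal_coverage(half_width, sd)
    print(f"sd={sd}: P(8.5 +/- 3) = {coverage:.1%}")
```

The window stays fixed at +/-3 while the sd changes, so the three cases correspond to one, one half, and two standard deviations respectively.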
Originally posted by xcomradex
8.5 vs 5, that seems to show a pretty good estimate. as per your question, we'd need to know the standard deviation (sd), i.e. how often are the predicted numbers of players found overall, and how much does the answer differ by when the prediction is wrong. but just using an estimate, sd=3, you'd expect to see 8.5+/-3 players around 68% of the time. goi ...[text shortened]... e 8.5+/-3 around 95% of the time. but the answer is only as good as the assumptions you make.
Where did you pull a sd of 3 from? You should be able to work out at least an approximation of it from the population standard deviation.
Originally posted by XanthosNZ
Where did you pull a sd of 3 from? You should be able to work out at least an approximation of it from the population standard deviation.
i plucked three out of the air, based on reasonable estimates from a normal distribution, assuming that 5 is a relatively likely deviation from the mean. i can't see a way to get the sd easily out of the population sd, but i am practising stats without a license.
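One way to get an sd without plucking it out of the air is sketched below. Note that this is a different route from the one suggested in the thread: it does not start from the population standard deviation of ratings. Instead it assumes each of the 11667 players independently has probability 0.073*0.01 of sitting at exactly 1570, which makes the count binomial. Both the independence assumption and the variable names are illustrative only.

```python
import math

n_players = 11667
p_at_rating = 0.073 * 0.01   # per-player probability implied by the 8.5 estimate

# Binomial mean and standard deviation of the number of players at that rating
mean_count = n_players * p_at_rating
sd_count = math.sqrt(n_players * p_at_rating * (1 - p_at_rating))

# How many standard deviations below the mean the observed count of 5 sits
observed = 5
z_score = (observed - mean_count) / sd_count

print(f"mean = {mean_count:.2f}, sd = {sd_count:.2f}, z = {z_score:.2f}")
```

Under these assumptions the sd comes out a little under 3, so the guessed value is in the right neighbourhood, but here it follows from the model rather than from the observed 5.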
Originally posted by xcomradex
i plucked three out of the air, based on reasonable estimates from a normal distribution, assuming that 5 is a relatively likely deviation from the mean. i can't see a way to get the sd easily out of the population sd, but i am practising stats without a license.
And here's a nice example of what not to do. You assume 5 is a relatively likely result to achieve and then from that conclude that 5 is a reasonable result. Can you see the circular reasoning here?
Originally posted by xcomradex
yes, that's why i said the result was only as good as the assumptions. got any constructive contributions?
Xanthos' comments are constructive. He pointed out that your assumptions were flawed and a new, more refined result should be sought. Your assumptions were created to prove your own solution and were therefore circular. Try to make assumptions that do not support your hypothesis and when they don't produce reasonable numbers then you can look more favorably at your original hypothesis.
Originally posted by xcomradex
nah you're right, that's where i was going with post #2. but we are short of some information.
But, from the player tables we know there are 11 667 players at rhp, so the median player is around position 5834. i can't be bothered sorting through the tables, but i know a player ranked 5794th, which is close enough. he has a rank of 1274, so the median rank ...[text shortened]... y selected games on rhp, around 730 of them will be against people with the same ranking. qed.
This is a useful calculation, but there is more to consider.
The median player may be around 1250, but what is the median player-game-rating, where we define a player-game-rating to be the rating of an instance of a player's side of a game start? In a given month, we may find that players in the 1500s average 20 game starts, but that players in the 800s average 3 game starts. This may be due to a large number of people registering for the site, losing 5 or 6 games real quick (perhaps by timeout) and never coming back, whereas a 1500+ player probably on average completes dozens if not hundreds of games.
One would expect that a given player with a rating of 1500 would more frequently start a game than a given player with a rating of 800. If one does not suspect that, then one would at least have to admit it is very likely the game start frequencies are significantly different.
That being said, I am interested if anyone can search the complete game database of rhp and find the average rating at each player-start. That would be a more meaningful estimate of "average chess" than just taking the average of each player.
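The difference between averaging over players and averaging over player-starts can be sketched as below. The player list is entirely hypothetical (it is not rhp data); it only borrows the "800s start about 3 games, 1500s start about 20" figures from the post to show how the weighting shifts the answer.

```python
# Hypothetical (rating, game_starts_this_month) pairs; NOT real rhp data.
players = [
    (800, 3), (800, 3), (800, 3),   # newcomers who start few games
    (1250, 10),                     # a mid-table player
    (1500, 20), (1500, 20),         # active, stronger players
]

# Plain average: every player counts once.
player_mean = sum(rating for rating, _ in players) / len(players)

# Player-start average: every game start counts once, so a player's
# rating is weighted by how many games that player started.
total_starts = sum(starts for _, starts in players)
start_weighted_mean = sum(rating * starts for rating, starts in players) / total_starts

print(f"average over players:       {player_mean:.0f}")
print(f"average over player-starts: {start_weighted_mean:.0f}")
```

With this made-up sample the player-start average lands well above the per-player average, which is the effect described in the post. Whether the real database behaves the same way would need the actual game records.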
Originally posted by Gastel
Xanthos' comments are constructive. He pointed out that your assumptions were flawed and a new, more refined result should be sought. Your assumptions were created to prove your own solution and were therefore circular. Try to make assumptions that do not support your hypothesis and when they don't produce reasonable numbers then you can look more favorably at your original hypothesis.
yes i am aware of that, and as was stated in the original post, the methods used were designed to give an estimate. i am open to any new, more refined results that anyone is prepared to offer, and that is what i meant by "constructive", rather than implying his posts are worthless, which they certainly are not.