RHP Site Player Statistics


Originally posted by petrovitch
Okay, I replaced the mean rank with the differences between the evaluation of the move chosen and the move suggested (in the sample blitz game). I got the sum, then divided by the number of moves to get a mean score of 0.63.
As well as the difference in evaluation, it may be necessary to consider the implication of this difference. For example, a human player in a winning position may decide to keep things simple and accept a safe 4.0 move when a complicated 7.0 existed. I don’t think this difference of 3 should always be heavily penalised. However, someone getting -1.2 when 1.2 was obtainable is much more significant - in many cases, this is playing a losing move when a winning move was available. So maybe considering if the likely game outcome has changed would be useful.
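To make the "has the likely outcome changed?" idea concrete, here is a minimal sketch. The 1.0 pawn cutoff and the function names are assumptions for illustration only; they are not something proposed in the thread.

```python
# Hypothetical sketch: flag moves whose evaluation loss also changes
# the likely game outcome. The threshold is an assumed value.
WIN_THRESHOLD = 1.0  # pawns; an assumption, tune to taste

def outcome(evaluation):
    """Map an engine evaluation (pawns, from the mover's side) to a coarse outcome."""
    if evaluation >= WIN_THRESHOLD:
        return "win"
    if evaluation <= -WIN_THRESHOLD:
        return "loss"
    return "draw"

def assess(best_eval, played_eval):
    """Return (evaluation loss, whether the likely outcome changed)."""
    loss = best_eval - played_eval
    changed = outcome(best_eval) != outcome(played_eval)
    return loss, changed

print(assess(7.0, 4.0))   # safe 4.0 instead of 7.0: still winning
print(assess(1.2, -1.2))  # -1.2 instead of 1.2: win turned into loss
```

With these assumed cutoffs, the first case loses more evaluation (3 pawns) than the second (2.4 pawns), yet only the second changes the likely result.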


Originally posted by Varenka
As well as the difference in evaluation, it may be necessary to consider the implication of this difference. For example, a human player in a winning position may decide to keep things simple and accept a safe 4.0 move when a complicated 7.0 existed. I don’t think this difference of 3 should always be heavily penalised. However, someone getting -1.2 when ...[text shortened]... ove was available. So maybe considering if the likely game outcome has changed would be useful.
I'm not convinced that a rank sum method would not work. Ranked moves already take into consideration the differences in evaluation. The problem appears to be that mistakes are not given enough weight. Again, if we look at this problem as something similar to a standard deviation, then squaring the rank order would magnify the mistakes. A single move ranked 25 may not carry enough weight after we divide it by n, but 25**2/n would carry much more weight. The reason I compared it to a standard deviation is that I am treating Crafty (or the engine of your choice) as our standard, and we want to measure how much we differ from this standard. I realize Crafty does not represent perfection, but I would consider it a standard, an index if you will, because it does not suffer from emotions, mood swings, fatigue, etc. So some performance evaluation may tell me that I played at 0.75; or that I beat a lower rated player, but my performance was only 0.68; or that I lost to a higher rated player, but my performance was 0.88. So this measure would tell us something in addition to our rating and whether we won or lost.
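A rough sketch of the squared-rank idea, comparing it with the plain mean rank. The list of ranks is made up for illustration, and taking the square root at the end (to bring the number back to the rank scale) is one possible choice, not the only one.

```python
import math

def mean_rank(ranks):
    """Plain average of the move ranks."""
    return sum(ranks) / len(ranks)

def rms_rank(ranks):
    """Root mean square of the ranks: sqrt(sum(rank**2) / n)."""
    return math.sqrt(sum(r * r for r in ranks) / len(ranks))

# Made-up example: mostly top engine choices plus one blunder ranked 25.
ranks = [1, 1, 2, 1, 3, 1, 2, 1, 1, 25]

print(f"mean rank: {mean_rank(ranks):.2f}")
print(f"rms rank:  {rms_rank(ranks):.2f}")
```

In this made-up game the single rank-25 move contributes 625 of the 648 in the sum of squares, so it dominates the squared measure far more than it dominates the plain mean.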


Originally posted by petrovitch
I'm not convinced that a rank sum method would not work. Ranked moves already take into consideration the differences in evaluation.
I think too much information is lost. Sometimes the 10th ranked move is vastly inferior to the 1st move, and sometimes it can be just as good. Rank alone does not include difference in evaluation (well, not the magnitude of the difference).

Supposing I play very close to Crafty in the initial phase of a game, and hence my initial measure is good. But then in an overwhelming position, I have many ways to win and I start to differ from Crafty significantly in how I prefer to convert my advantage, but never giving away the win. Then my overall measure drops significantly and we’d rather it didn’t.
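One possible way to keep the measure from dropping in already-won positions is to squash evaluations into an expected score before taking differences. This is only a sketch: the logistic curve and the scale constant are assumptions made for illustration, not something Crafty or this site provides.

```python
import math

SCALE = 1.0  # assumed pawns-to-score scale; would need tuning on real games

def expected_score(evaluation):
    """Logistic squash of an engine evaluation (pawns) into the range 0..1."""
    return 1.0 / (1.0 + math.exp(-evaluation / SCALE))

def score_loss(best_eval, played_eval):
    """Drop in expected score between the engine's move and the move played."""
    return expected_score(best_eval) - expected_score(played_eval)

print(round(score_loss(7.0, 4.0), 2))   # small: both moves are winning
print(round(score_loss(1.2, -1.2), 2))  # large: winning turned into losing
```

Because the curve flattens out for large evaluations, choosing between several winning continuations costs almost nothing, while crossing from plus to minus costs a lot.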


Keep it coming, super interesting thread...




This is a game between two lower rated players, 1404/1363. Black's next move was c5. What value do I give Black for missing the threat of mate? 🙄

It has a rank of 29. What is the difference in value between the move suggested by Crafty and the move selected by the player?

Using the rank method here are the scores of the winner/loser.
              x     x**2        y     y**2
N         15.00    15.00    14.00    14.00
Sum       74.00   810.00    79.00  1209.00
Mean       4.93    54.00     5.64    86.36
Std        5.45     7.35     7.38     9.29

Note:

[7.35, 9.29] = sqrt(sum(rank**2)/N)
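As a quick check, the Mean row and the two values in the note can be recomputed from the N and Sum rows alone. The function name is just for illustration, and reading x as the winner and y as the loser is an assumption based on the order in the text.

```python
import math

def summarize(n, total, total_sq):
    """Return (mean rank, root mean square rank) from a count and two sums."""
    return total / n, math.sqrt(total_sq / n)

# N, Sum of ranks, Sum of squared ranks, taken from the table.
players = {
    "x": (15, 74, 810),
    "y": (14, 79, 1209),
}

for name, (n, total, total_sq) in players.items():
    mean, rms = summarize(n, total, total_sq)
    print(f"{name}: mean rank {mean:.2f}, sqrt(sum(rank**2)/N) {rms:.2f}")
```

This reproduces 4.93 and 7.35 for x, and 5.64 and 9.29 for y.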

So this was a very weak game. Both sides made some really bad moves. How do we evaluate the performance of each player?

Should we consider the statistical methods of clustering as a data model?


I don't want to change the subject, but I guess we can run more than one idea on the same thread since the thread is about statistics.

While thumbing through a lot of data trying to find an answer to the problem of formulating a game performance rating, I quickly noticed that there is a strong correlation between the amount of turnover and a player's rating.

First, let me illustrate how I manage games in my folders ... the organization helps. My friend, thgibbs, has also written a program that will automatically categorize your games once they are complete. That helps a lot too.

___________________________________________________________

Folders
All games
Inbox
0-999
1000-1199
1200-1399
1400-1599
1600-1799
1800-1999
2000-2199
2200-2399
Archive
___________________________________________________________

Of players in these folders I find:

0-999 11 unique players. None are still active.

1000-1199 13 unique players. 1 is still active.

1200-1399 18 unique players. 1 is still active.

1400-1599 14 unique players. 7 are still active.

1600-1799 17 unique players. 5 are still active.

1800-1999 11 unique players. 5 are still active.

2000-2199 12 unique players. 10 are still active.

2200-2399 1 unique player. 0 still active.

* I have played a lot of games with a few players. This is only a list of unique players.

___________________________________________________________

So the renewal rate is much higher for players with ratings above 1400. According to an earlier post, 1400 was about the average rating. So it looks like renewing members are concentrated in the upper half and almost nonexistent in the lower half. So if you are not serious about learning the game you will not survive. 🙂

These conclusions only come from a small sample of players whom I have played.
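For anyone who wants to put numbers on it, here is a small sketch that totals the folder counts listed above on either side of 1400. The tuple layout and function name are made up for illustration; the counts are the ones from the list.

```python
# (lowest rating in band, highest rating in band, unique players, still active)
bands = [
    (0, 999, 11, 0),
    (1000, 1199, 13, 1),
    (1200, 1399, 18, 1),
    (1400, 1599, 14, 7),
    (1600, 1799, 17, 5),
    (1800, 1999, 11, 5),
    (2000, 2199, 12, 10),
    (2200, 2399, 1, 0),
]

def active_share(rows):
    """Return (active players, unique players, active fraction) for some bands."""
    players = sum(row[2] for row in rows)
    active = sum(row[3] for row in rows)
    return active, players, active / players

below = active_share([row for row in bands if row[0] < 1400])
above = active_share([row for row in bands if row[0] >= 1400])

for label, (active, players, share) in (("below 1400", below), ("1400 and up", above)):
    print(f"{label}: {active}/{players} still active ({share:.0%})")
```

That works out to 2 of 42 still active below 1400 (about 5%) against 27 of 55 at 1400 and up (about 49%).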

============================================


Very interesting, maybe this happens in OTB also.

A player who never increases his rating can get frustrated and leave the game. Also, a player who doesn't get better is probably not interested enough in the game, and for the same reason he will not last long playing it.

I think these two reasons apply to OTB as well.

1- Frustration

2- Low interest