1. Joined
    07 Jun '05
    Moves
    5301
    06 Dec '05 17:43
    Originally posted by XanthosNZ
    The most a non-provisional player could lose in a game against a provisional player is 16 points (with a difference of 600+ points). That's not really that much. The main reason [snip].
    Fair, but you lose the same number of points by losing to someone with the same rating as you, and you are more likely to get a better game against them.

    I had not thought of your main reason. Unfortunately there will always be some who ruin things for others. I'd best play some more and lose the "p" flag.
  2. Joined
    07 Jun '05
    Moves
    5301
    11 Dec '05 14:13 (1 edit)
    In brief, the proposed system gave a more reasonable estimate of a player's rating, and gave it faster, than the current system. In addition, a rating never drops after a win against a lower-rated player.

    I do not propose to do any more on this unless there is support from others (Craigy, XanthosNZ, whoever; criticism is welcome) and a reasonable chance that it would be implemented (Russ - I am sure you have loads of free time...). The main benefit would be to new players, but it would also remove the penalty a provisionally rated player incurs for winning against someone rated 200 points lower.

    For more detail, read on
    Yours,
    Gezza


    I have put together a spreadsheet to calculate ratings, based on a proposed new rating calculation system. The results follow.

    The basis for the calculation is work by Professor Mark Glickman (see http://math.bu.edu/people/mg/ratings.html), and the description of the FICS implementation of the Glicko system (http://www.freechess.org/Help/HelpFiles/glicko.html).

    The difference from the Glicko system as described and implemented on FICS is that I do not propose to implement it wholesale (I can imagine some resistance), as this would mean changing all rating calculations. I propose implementing it only for the initial 20 games, where the rating is totally unknown. Implementing the complete system could be a solution to problems caused by people who, for whatever reason, disappear from the site, losing lots of games, and come back some time later with a low (but established) rating, but that is another discussion.

    The system relies on a Rating Deviation value, RD, which measures how uncertain the rating is (a lower RD means a more reliable rating). For established ratings, I chose a value of 40, pretty much at random, because it is half the value FICS uses to indicate an active player. For new players, I chose an RD of 350 and a rating of 1200. It might be interesting to use the average rating of all established players as the initial value.
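For anyone who wants to check the arithmetic, here is a minimal sketch of a single-game Glicko update as given in Professor Glickman's description, using the RD values above. It is not the spreadsheet itself; the function names are my own, and treating each game as its own rating period is an assumption.

```python
import math

Q = math.log(10) / 400  # Glicko scaling constant


def g(rd):
    """Weight that discounts a result by the opponent's rating uncertainty."""
    return 1 / math.sqrt(1 + 3 * Q**2 * rd**2 / math.pi**2)


def expected_score(r, r_opp, rd_opp):
    """Expected score of a player rated r against an opponent rated r_opp."""
    return 1 / (1 + 10 ** (-g(rd_opp) * (r - r_opp) / 400))


def glicko_update(r, rd, r_opp, rd_opp, score):
    """Return (new rating, new RD) after one game; score is 1, 0.5 or 0."""
    e = expected_score(r, r_opp, rd_opp)
    inv_d2 = Q**2 * g(rd_opp) ** 2 * e * (1 - e)  # information from this game
    precision = 1 / rd**2 + inv_d2
    new_r = r + (Q / precision) * g(rd_opp) * (score - e)
    new_rd = math.sqrt(1 / precision)
    return new_r, new_rd


# New player (1200, RD 350) beats an established 1200 player (RD 40).
new_rating, new_rd = glicko_update(1200, 350, 1200, 40, 1.0)
print(round(new_rating), round(new_rd))
```

With these inputs the new player gains about 175 points from the first win, and the RD falls from 350 to roughly 248, so later games move the rating less.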

    I chose a random opponent's rating within 100 points of the player's current rating, and then calculated the game result based on a known player strength (different from the rating, as even Mr. Kasparov would start at 1200 here). I did the same for the current system (average of opponents' ratings + result for the first 20 games).
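The test procedure above could be sketched roughly as follows for the current system. Several details are assumptions rather than what the spreadsheet did: the result is drawn at random with the usual logistic win probability based on the known strength, draws are ignored, and "average of opponents' ratings + result" is read as the average opponent rating plus 400 * (wins - losses) / games. The function name is hypothetical.

```python
import random


def simulate_current(true_strength, games=20, start=1200, seed=0):
    """Simulate provisional games under the assumed current-system formula."""
    rng = random.Random(seed)
    rating = start
    opp_ratings = []
    wins = 0
    for _ in range(games):
        # Opponent chosen within 100 points of the player's current rating.
        opp = rating + rng.uniform(-100, 100)
        # Assumed: result depends on the known strength, not the rating.
        p_win = 1 / (1 + 10 ** ((opp - true_strength) / 400))
        if rng.random() < p_win:
            wins += 1
        opp_ratings.append(opp)
        n = len(opp_ratings)
        losses = n - wins
        # Assumed reading of "average of opponents' ratings + result".
        rating = sum(opp_ratings) / n + 400 * (wins - losses) / n
    return rating


print(round(simulate_current(1400)))
```

Because the results are random, a single run can land some way from the known strength; averaging over several seeds would give a steadier picture.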

    The results were:
    Rating   Current system         Proposed system
             6 games   20 games     6 games   20 games
    600      807       777          671       655
    1000     940       920          990       1038
    1400     1474      1414         1422      1397
    1800     1474      1770         1602      1722
    2200     1607      1937         1750      2008

    The result for 1800 appears to be due to randomness in the test.

    The K factors changed from 350 for game 1, to 34 for game 20, on what looks like an exponential curve - K was 90 for game 7.
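As a rough cross-check of those K values, here is a sketch that computes the effective K from the standard Glicko formula, assuming every opponent is established (RD 40) and evenly matched (expected score 0.5). The spreadsheet's opponents were not exactly evenly matched, so its figures differ a little. The function name is my own.

```python
import math

Q = math.log(10) / 400  # Glicko scaling constant


def k_schedule(games=20, rd_start=350.0, rd_opp=40.0):
    """Effective K for each game, assuming an expected score of 0.5."""
    g = 1 / math.sqrt(1 + 3 * Q**2 * rd_opp**2 / math.pi**2)
    info_per_game = Q**2 * g**2 * 0.25  # 1/d^2 when expected score is 0.5
    precision = 1 / rd_start**2  # 1/RD^2 before the first game
    ks = []
    for _ in range(games):
        precision += info_per_game
        ks.append(Q * g / precision)
    return ks


ks = k_schedule()
print([round(k) for k in (ks[0], ks[6], ks[19])])
```

Under these assumptions K comes out at about 350 for game 1, about 88 for game 7 and about 33 for game 20. Since each game adds nearly the same amount to 1/RD^2, K falls roughly in proportion to 1/(n + 1) for game n, which is a hyperbolic rather than a truly exponential curve.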

    I did not calculate the effect on opponent's rating. The calculation is similar, but I would rather spend time playing than spend more time doing this.

    Nor did I include provisional players in the opponents. That would have to take account of their RD and the tables become messy for little gain.

    In Professor Glickman's system, RD increases with time since the last game played. In this way, if someone drops off the site for a while, their RD will increase. RD is used to indicate how active a player is, and so how correct their rating is estimated to be. Too high an RD would be used to indicate that a player's rating is unreliable, and so they should not enter banded tournaments - in the same way as a provisional rating is used now.
    Professor Glickman has developed a Glicko-2 system (I had trouble reading the file, so did not try it), which addresses step changes in ability by increasing the effective RD. This may help deal with players who start to study, or who just stop playing. The RD goes up, reducing the effect on other players' ratings.
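The growth of RD with inactivity in the original Glicko system can be sketched as below. The formula is the one in Professor Glickman's description; the constant c has to be tuned for the site, and the default used here is purely illustrative, as is the function name.

```python
import math


def rd_after_inactivity(rd, periods, c=34.6, rd_max=350.0):
    """RD after a number of idle rating periods, capped at the new-player RD."""
    return min(math.sqrt(rd**2 + c**2 * periods), rd_max)


# An established player (RD 40) who stays away for longer and longer.
for idle in (0, 1, 10, 100):
    print(idle, round(rd_after_inactivity(40, idle)))
```

A site could then treat any RD above a chosen threshold in the same way as the provisional flag is treated now.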

    It would be some work to implement these systems, including ensuring that no intellectual property was used without permission. It is not worth it unless there is a need and a commitment to implement a change.