Science Forum

  1. Standard member sonhouse
    Fast and Curious
    15 Apr '15 20:45
    http://phys.org/news/2015-04-algorithm-online-trolls.html#ajTabs

    Like they are SO hard to identify
  2. 16 Apr '15 13:17
    Originally posted by sonhouse
    http://phys.org/news/2015-04-algorithm-online-trolls.html#ajTabs

    Like they are SO hard to identify
    For people... as ever, what is simple for humans is frequently hard for computers.
  3. 22 Apr '15 01:18
    Originally posted by sonhouse
    http://phys.org/news/2015-04-algorithm-online-trolls.html#ajTabs

    Like they are SO hard to identify
    Have you heard of Orwell?
  4. Standard member sonhouse
    Fast and Curious
    22 Apr '15 12:25
    Originally posted by stevemcc
    Have you heard of Orwell?
    I read about him a long time ago, around 1984 I think.
  5. Standard member Soothfast
    0,1,1,2,3,5,8,13,21,
    23 Apr '15 21:49
    Originally posted by sonhouse
    http://phys.org/news/2015-04-algorithm-online-trolls.html#ajTabs

    Like they are SO hard to identify
    That's rather depressing.
  6. 23 Apr '15 23:07
    http://phys.org/news/2015-04-algorithm-online-trolls.html#ajTabs

    The researchers report that it was relatively easy to spot FBUs and to convert what they had found to something a computer could understand—starting with what they called an Automated Readability Index. After writing their algorithm and working out issues, the team reports that they were able to spot FBUs with an 80 percent accuracy rate after just ten posts. That is not high enough for web sites owners, of course, banning non-trolls by mistake 20 percent of the time could lead to driving away visitors—but it could possibly be used as a way to assist moderators.
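
    For reference, the Automated Readability Index is conventionally defined as
    4.71*(characters/words) + 0.5*(words/sentences) - 21.43. A rough Python sketch of that
    standard formula is below. The function name and the crude word/sentence splitting are my own
    assumptions, and the article does not say how the researchers computed or used the index.

```python
import re


def automated_readability_index(text: str) -> float:
    """Standard ARI formula; tokenising here is deliberately crude (an assumption)."""
    # Sentences: split on runs of ., ! or ? and drop empty pieces.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words: whitespace-separated tokens.
    words = text.split()
    if not sentences or not words:
        raise ValueError("text must contain at least one word and one sentence")
    # Characters: letters and digits only, punctuation ignored.
    characters = sum(1 for ch in text if ch.isalnum())
    return 4.71 * (characters / len(words)) + 0.5 * (len(words) / len(sentences)) - 21.43


if __name__ == "__main__":
    print(round(automated_readability_index("The cat sat. The dog ran."), 2))
```

    Higher scores mean harder-to-read text. Very short, simple sentences can score below zero.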


    They haven't posted the actual calculation, so it's not necessarily clear what they have done.

    But this just set off my Bayesian badness detector.

    If we read this to mean that this algorithm has an 80% reliability at determining if a post is from a Troll [T]
    or a Regular User [RU], then the number of false positives is not 20% [except in a very narrow and improbable
    set of circumstances], because the number of false positives depends on the ratio of T to RU.

    If we say for the sake of argument that 20% of posters are T, then this algorithm will correctly label 80% of that
    20% as T. [0.8*0.2=0.16] [16% of the total]
    However the algorithm will also incorrectly label 20% of the 80% of RU as T. [0.2*0.8=0.16] [16% of the total]
    So we land up with 32% [16%+16%] of all users labelled as T, and of those only half are actually T.

    If we ban all those who the algorithm identifies as T, then we will be incorrectly banning 16% of the users [and failing
    to ban 20% of the actual trolls, or 4% of the user base total].

    So the "banning non-trolls by mistake" number is 16%.

    But let's make it so that only 5% of users are trolls...

    If we say for the sake of argument that 5% of posters are T, then this algorithm will correctly label 80% of that
    5% as T. [0.05*0.8=0.04] [4% of the total]
    However the algorithm will also incorrectly label 20% of the 95% of RU as T. [0.95*0.2=0.19] [19% of the total]
    So we land up with 23% [4%+19%] of all users labelled as T, and of those only ~17.4% are actually T. [(4/23)*100]

    So in this case we have 19% of the total userbase incorrectly labelled as trolls.

    The answer will converge on 20% if the proportion of trolls is very small. However I have been assuming that the
    false positive rate is the same as the false negative rate. They may not be, and we don't know which [if either]
    was being referred to.
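
    The arithmetic above can be checked with a few lines of Python. This is only a sketch under the
    same assumption I made, namely that the 80% figure applies equally to trolls and regular users.
    The function name and the `sensitivity`/`specificity` parameters are mine, not anything from the paper.

```python
def troll_filter_breakdown(troll_share, sensitivity=0.8, specificity=0.8):
    """Split the user base into outcomes, as fractions of ALL users.

    troll_share -- assumed fraction of users who are trolls (the prior)
    sensitivity -- assumed fraction of trolls correctly labelled T
    specificity -- assumed fraction of regular users correctly labelled RU
    """
    true_pos = troll_share * sensitivity               # trolls labelled T
    false_neg = troll_share * (1 - sensitivity)        # trolls missed
    false_pos = (1 - troll_share) * (1 - specificity)  # regular users labelled T
    flagged = true_pos + false_pos                     # everyone labelled T
    return {
        "flagged": flagged,
        "wrongly_flagged": false_pos,
        "missed_trolls": false_neg,
        # Of those labelled T, the fraction who really are trolls.
        "flagged_who_are_trolls": true_pos / flagged if flagged else 0.0,
    }


if __name__ == "__main__":
    for share in (0.20, 0.05, 0.001):
        result = troll_filter_breakdown(share)
        print(f"trolls={share:.1%}", {k: f"{v:.1%}" for k, v in result.items()})
```

    Changing `sensitivity` and `specificity` independently shows how the numbers move when the
    false positive and false negative rates are not the same.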


    The prior probability matters.

    https://xkcd.com/1132/
  7. Standard member wolfgang59
    Infidel
    28 Apr '15 00:23
    Originally posted by sonhouse
    I read about him a long time ago, around 1984 I think.
    ... while holidaying on a farm ....
  8. 28 Apr '15 07:27
    I have an algorithm for trolls:

    if his name is Metal Brain or RJHinds, he is a troll, else not so.

    This algorithm has several severe limitations.