The researchers report that it was relatively easy to spot FBUs and to convert what they had found into something a computer could understand, starting with what they called an Automated Readability Index. After writing their algorithm and working out issues, the team reports that they were able to spot FBUs with an 80 percent accuracy rate after just ten posts. That is not high enough for website owners, of course; banning non-trolls by mistake 20 percent of the time could drive away visitors. But it could possibly be used as a way to assist moderators.
They haven't posted the actual calculation, so it's not entirely clear what they have done.
But this just set off my Bayesian badness detector.
If we read this to mean that the algorithm has an 80% reliability at determining whether a post is from a Troll [T]
or a Regular User [RU], then the number of false positives is not 20% [except in a very narrow and improbable
set of circumstances], because the number of false positives depends on the ratio of T to RU.
If we say for the sake of argument that 20% of posters are T, then this algorithm will correctly label 80% of that
20% as T. [0.8*0.2=0.16] [16% of the total]
However the algorithm will also incorrectly label 20% of the 80% of RU as T. [0.2*0.8=0.16] [16% of the total]
So we end up with 32% [16% + 16%] of all users labelled as T, and of those only half are actually T.
If we ban all those who the algorithm identifies as T, then we will be incorrectly banning 16% of the users [and failing
to ban 20% of the actual trolls, or 4% of the user base total].
So the "banning non-trolls by mistake" number is 16%.
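The arithmetic above can be sketched in a few lines of Python. This assumes, as the text does, that "80% accuracy" means a symmetric error rate: 80% of trolls are correctly flagged and 80% of regular users are correctly passed.

```python
# Assumption: "80% accuracy" = both the true-positive rate (trolls caught)
# and the true-negative rate (regular users passed) are 0.8.
troll_fraction = 0.20
accuracy = 0.80

true_positives = troll_fraction * accuracy               # trolls correctly flagged
false_positives = (1 - troll_fraction) * (1 - accuracy)  # regular users wrongly flagged
flagged = true_positives + false_positives

print(f"flagged as T: {flagged:.0%}")                          # 32%
print(f"of those, actually T: {true_positives / flagged:.0%}") # 50%
print(f"non-trolls wrongly banned: {false_positives:.0%}")     # 16%
print(f"trolls missed: {troll_fraction * (1 - accuracy):.0%}") # 4%
```

The four printed figures match the 32%, 50%, 16%, and 4% worked out above.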
But let's make it so that only 5% of users are trolls...
If we say for the sake of argument that 5% of posters are T, then this algorithm will correctly label 80% of that
5% as T. [0.05*0.8=0.04] [4% of the total]
However the algorithm will also incorrectly label 20% of the 95% of RU as T. [0.95*0.2=0.19] [19% of the total]
So we end up with 23% [4% + 19%] of all users labelled as T, and of those only ~17.4% are actually T. [(4/23)*100]
So in this case we have 19% of the total userbase incorrectly labelled as trolls.
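The same calculation, wrapped in a small helper (a hypothetical function name, not anything from the study) so the prior can be varied:

```python
def flag_stats(troll_fraction, accuracy=0.80):
    """Return (fraction of users flagged, precision, fraction wrongly flagged),
    assuming symmetric 80% true-positive and true-negative rates."""
    tp = troll_fraction * accuracy               # trolls correctly flagged
    fp = (1 - troll_fraction) * (1 - accuracy)   # regular users wrongly flagged
    return tp + fp, tp / (tp + fp), fp

flagged, precision, wrongly = flag_stats(0.05)
print(f"{flagged:.0%} flagged; {precision:.1%} of those are trolls; "
      f"{wrongly:.0%} of all users wrongly flagged")
# 23% flagged; 17.4% of those are trolls; 19% of all users wrongly flagged
```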
The answer will converge on 20% as the proportion of trolls becomes very small. However, I have been assuming that the
false positive rate is the same as the false negative rate. They may not be, and we don't know which [if either]
was being referred to.
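That limiting behaviour is easy to check: as the troll prior shrinks toward zero, the fraction of all users wrongly flagged approaches the 20% error rate itself (still assuming the symmetric error rates).

```python
# As the prior shrinks, (1 - prior) * 0.20 -> 0.20: almost everyone flagged
# is a false positive, and almost 20% of the userbase gets wrongly flagged.
accuracy = 0.80
for prior in [0.20, 0.05, 0.01, 0.001]:
    wrongly_flagged = (1 - prior) * (1 - accuracy)
    print(f"troll prior {prior:>6.1%}: wrongly flagged = {wrongly_flagged:.2%}")
# troll prior  20.0%: wrongly flagged = 16.00%
# troll prior   5.0%: wrongly flagged = 19.00%
# troll prior   1.0%: wrongly flagged = 19.80%
# troll prior   0.1%: wrongly flagged = 19.98%
```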
The prior probability matters.