Algorithm to detect trolls!:

sonhouse

Fast and Curious

Science

15 Apr 15 20:45

sonhouse

Fast and Curious

slatington, pa, usa

Joined: 28 Dec 04
Moves: 53419

15 Apr 15 20:45

http://phys.org/news/2015-04-algorithm-online-trolls.html#ajTabs

Like they are SO hard to identify🙂

googlefudge

Joined: 31 May 06
Moves: 1795

16 Apr 15 13:17

Originally posted by sonhouse
http://phys.org/news/2015-04-algorithm-online-trolls.html#ajTabs

Like they are SO hard to identify🙂

For people... as ever, what is simple for humans is frequently hard for computers.

stevemcc

Joined: 15 Oct 10
Moves: 98630

22 Apr 15 01:18

Originally posted by sonhouse
http://phys.org/news/2015-04-algorithm-online-trolls.html#ajTabs

Like they are SO hard to identify🙂

Have you heard of Orwell?

sonhouse

Fast and Curious

slatington, pa, usa

Joined: 28 Dec 04
Moves: 53419

22 Apr 15 12:25

Originally posted by stevemcc
Have you heard of Orwell?

I read about him a long time ago, around 1984 I think.

Soothfast

0,1,1,2,3,5,8,13,21,

☯️

Joined: 04 Mar 04
Moves: 2746

23 Apr 15 21:49

Originally posted by sonhouse
http://phys.org/news/2015-04-algorithm-online-trolls.html#ajTabs

Like they are SO hard to identify🙂

That's rather depressing.

googlefudge

Joined: 31 May 06
Moves: 1795

23 Apr 15 23:07


http://phys.org/news/2015-04-algorithm-online-trolls.html#ajTabs

The researchers report that it was relatively easy to spot FBUs and to convert what they had found to something a computer could understand—starting with what they called an Automated Readability Index. After writing their algorithm and working out issues, the team reports that they were able to spot FBUs with an 80 percent accuracy rate after just ten posts. That is not high enough for web sites owners, of course, banning non-trolls by mistake 20 percent of the time could lead to driving away visitors—but it could possibly be used as a way to assist moderators.

They haven't posted the actual calculation, so it's not necessarily clear what they have done.

But this just set off my Bayesian badness detector.

If we read this to mean that the algorithm is 80% reliable at determining whether a post is from a Troll [T]
or a Regular User [RU], then the number of false positives is not 20% [except in a very narrow and improbable
set of circumstances], because the number of false positives depends on the ratio of T to RU.

If we say for the sake of argument that 20% of posters are T, then this algorithm will correctly label 80% of that
20% as T. [0.8*0.2=0.16] [16% of the total]
However the algorithm will also incorrectly label 20% of the 80% of RU as T. [0.2*0.8=0.16] [16% of the total]
So we land up with 32% [16%+16%] of all users labelled as T, and of those only half are actually T.

If we ban all those who the algorithm identifies as T, then we will be incorrectly banning 16% of the users [and failing
to ban 20% of the actual trolls, or 4% of the user base total].

So the "banning non-trolls by mistake" number is 16%.
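
A minimal Python sketch of the arithmetic above, for anyone who wants to plug in other numbers. It assumes the same reading of "80 percent accuracy" as in this post [80% of trolls flagged, 20% of regular users flagged]; the function and parameter names are illustrative only and have nothing to do with the researchers' actual algorithm, which hasn't been published here.

```python
def label_breakdown(troll_share, hit_rate, false_alarm_rate):
    """Split the whole user base by what a troll detector does to it.

    troll_share:      fraction of users who really are trolls (the prior)
    hit_rate:         fraction of trolls the detector labels as trolls
    false_alarm_rate: fraction of regular users the detector labels as trolls
    All results are fractions of the total user base, except the last one.
    """
    regular_share = 1.0 - troll_share
    trolls_flagged = troll_share * hit_rate
    regulars_flagged = regular_share * false_alarm_rate
    flagged_total = trolls_flagged + regulars_flagged
    return {
        "trolls_flagged": trolls_flagged,
        "regulars_flagged": regulars_flagged,
        "trolls_missed": troll_share - trolls_flagged,
        "flagged_total": flagged_total,
        # Fraction of flagged users who really are trolls
        "flagged_that_are_trolls": (
            trolls_flagged / flagged_total if flagged_total else 0.0
        ),
    }


# The case above: 20% trolls, 80% hits, 20% false alarms (assumed reading)
for name, value in label_breakdown(0.20, 0.80, 0.20).items():
    print(f"{name}: {value:.1%}")
```

The hit rate and the false alarm rate are kept as separate arguments so the two don't have to be equal.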

But let's make it so that only 5% of users are trolls...

If we say for the sake of argument that 5% of posters are T, then this algorithm will correctly label 80% of that
5% as T. [0.05*0.8=0.04] [4% of the total]
However the algorithm will also incorrectly label 20% of the 95% of RU as T. [0.95*0.2=0.19] [19% of the total]
So we land up with 23% [4%+19%] of all users labelled as T, and of those only ~17.4% are actually T. [(4/23)*100]

So in this case we have 19% of the total userbase incorrectly labelled as trolls.

The share of users wrongly labelled as trolls will converge on 20% as the proportion of trolls becomes very small.
However I have been assuming that the false positive rate is the same as the false negative rate. They may not be
the same, and we don't know which [if either] was being referred to.
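
To see how much the troll proportion matters, here is a rough Python sketch that sweeps it from 20% down to 0.1%. The 80%/20% rates are the assumed reading used in this post, not figures confirmed by the article, and the function names are made up for illustration. Note that the share of users wrongly flagged depends only on the false positive rate and the troll proportion, while the fraction of flagged users who really are trolls depends on both rates.

```python
def wrongly_flagged_share(troll_share, false_alarm_rate):
    """Fraction of ALL users who are regular users labelled as trolls."""
    return (1.0 - troll_share) * false_alarm_rate


def flagged_that_are_trolls(troll_share, hit_rate, false_alarm_rate):
    """Fraction of flagged users who really are trolls (Bayes' rule)."""
    true_flags = troll_share * hit_rate
    false_flags = wrongly_flagged_share(troll_share, false_alarm_rate)
    total = true_flags + false_flags
    return true_flags / total if total else 0.0


# Assumed symmetric reading: 80% of trolls caught, 20% of regulars flagged
HIT_RATE = 0.80
FALSE_ALARM_RATE = 0.20

for troll_share in (0.20, 0.05, 0.01, 0.001):
    wrong = wrongly_flagged_share(troll_share, FALSE_ALARM_RATE)
    real = flagged_that_are_trolls(troll_share, HIT_RATE, FALSE_ALARM_RATE)
    print(f"trolls {troll_share:.1%}: wrongly flagged {wrong:.2%} of all users, "
          f"{real:.1%} of flagged users are trolls")

# If the two error rates differ, only the second figure changes for a
# given false alarm rate; e.g. try hit_rate=0.95 with the same 0.20.
```

Changing `FALSE_ALARM_RATE` independently of `HIT_RATE` shows why it matters which of the two the "80 percent" actually refers to.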

The prior probability matters.

https://xkcd.com/1132/

wolfgang59

Quiz Master

RHP Arms

Joined: 09 Jun 07
Moves: 48794

28 Apr 15 00:23

Originally posted by sonhouse
I read about him a long time ago, around 1984 I think.

... while holidaying on a farm ....

humy

Joined: 06 Mar 12
Moves: 642

28 Apr 15 07:27

I have an algorithm for trolls:

if his name is Metal Brain or RJHinds, he is a troll, else not so.

This algorithm has several severe limitations.