The Trolling Test Revisited: a look back to 2014.

While AI is being lauded by some as an innovation on par with fire and electricity, its commercial use has caused some issues. While AI hallucinating legal cases is old news, a customer was able to get a customer service chatbot to start swearing and to insult the company using it. This incident reminded me of my proposed Trolling Test from 2014. This is, of course, a parody of the Turing Test.

Philosophically, the challenge of sorting out when something thinks is the problem of other minds. I know I have a mind (I think, therefore I think), but I need a reliable method to know that another entity has a mind as well. In practical terms, the challenge is devising a test to determine when something is capable of thought. Feelings are also included, but usually given less attention.

The French philosopher Descartes, in his discussion of whether animals have minds, argued that the definitive indicator of having a mind (thinking) is the ability to use what he calls true language.

The gist of the test is that if something talks in the appropriate way, then it is reasonable to regard it as a thinking being. Anticipating advances in technology, he distinguished between automated responses and actual talking:

How many different automata or moving machines can be made by the industry of man […] For we can easily understand a machine’s being constituted so that it can utter words, and even emit some responses to action on it of a corporeal kind, which brings about a change in its organs; for instance, if touched in a particular part it may ask what we wish to say to it; if in another part it may exclaim that it is being hurt, and so on. But it never happens that it arranges its speech in various ways, in order to reply appropriately to everything that may be said in its presence, as even the lowest type of man can do.

Centuries later, Alan Turing presented a similar language-based test which now bears his name. The idea is that if a person cannot distinguish between a human and a computer by engaging in a natural language conversation via text, then the computer would have passed the Turing test.

Over the years, technological advances have produced computers that can engage in conversation. Back in 2014 the best-known example was IBM’s Watson, a computer that was able to win at Jeopardy. Watson also upped his game by engaging in what seemed to be a rational debate regarding violence and video games. Today, ChatGPT and its fellows can rival college students in the writing of papers and engage in what, on the surface, appears to be skill with language. While there are those who claim that this test has been passed, this is not the case. At least not yet.

Back in 2014 I jokingly suggested a new test to Patrick Lin: the trolling test. In this context, a troll is someone “who sows discord on the Internet by starting arguments or upsetting people, by posting inflammatory, extraneous, or off-topic messages in an online community (such as a forum, chat room, or blog) with the deliberate intent of provoking readers into an emotional response or of otherwise disrupting normal on-topic discussion.”

While trolls are claimed to be awful people (a hateful blend of Machiavellianism, narcissism, sadism and psychopathy) and trolling is certainly undesirable behavior, the trolling test is still worth considering—especially in light of the capabilities of large language models to be lured beyond their guardrails.

In the abstract, the test would be like the Turing test, but would involve a human troll and a large language model or other AI system attempting to troll a target. The challenge is for the AI troll to successfully pass as a human troll.

Even a simple program could be written to post random provocative comments from a database and while that would replicate the talent of many human trolls, it would not be true trolling. The meat (or silicon) of the challenge is that the AI must be able to engage in relevant trolling. That is, it would need to engage others in true trolling.
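To make the contrast concrete, here is a minimal sketch of the "simple program" described above. The canned lines, the function name, and the use of an in-memory list in place of a database are all assumptions made for illustration. The point is that the reply never depends on what it is replying to, which is exactly why it falls short of true trolling.

```python
import random

# Hypothetical canned lines standing in for the "database" of provocative
# comments; the contents are invented for this sketch.
CANNED_COMMENTS = [
    "Nobody with a brain believes this.",
    "Did you even read what you wrote?",
    "This is why nobody takes this site seriously.",
]


def naive_troll_reply(post_text: str, rng: random.Random) -> str:
    """Return a canned comment chosen at random.

    post_text is accepted but deliberately ignored: the reply has no
    connection to the content, so this is not relevant trolling.
    """
    return rng.choice(CANNED_COMMENTS)


if __name__ == "__main__":
    print(naive_troll_reply("A post about video games.", random.Random()))
```

A program that passed the trolling test would have to replace the random choice with something that actually reads the post and the comments, which is where the real difficulty lies.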

As a controlled test, the Artificial Troll (“AT”) would “read” and analyze a suitable blog post or watch a suitable YouTube video. Controversial content would be ideal, such as a selection from whatever the latest made-up battles are in the American culture wars.

The content would then be commented on by human participants. Some of the humans would be tasked with engaging in normal discussion and some would be tasked with engaging in trolling.

The AT would then endeavor to troll the human participants (and, for bonus points, to troll the trolls) by analyzing the comments and creating appropriate trollish comments.

Another option, which might raise some ethical concerns, is to have a live field test. A specific blog site or YouTube channel would be selected that is frequented by human trolls and non-trolls. The AT would then try to engage in trolling on that site by analyzing the content and comments. As this is a trolling test, getting the content wrong, creating straw man versions of it, and outright lying would all be acceptable and should probably count as evidence of trolling skill.

In either test scenario, if the AT were able to troll in a way indistinguishable from the human trolls, then it would pass the trolling test.
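One way the passing condition might be scored, offered only as a sketch: suppose judges label each comment as coming from a human troll or from the AT, and the AT passes if the judges do little better than guessing. The judges, the labels, the balanced mix of comments, and the 0.1 margin are all assumptions of this example and are not part of the test as proposed.

```python
from typing import Sequence


def judge_accuracy(guesses: Sequence[str], truths: Sequence[str]) -> float:
    """Fraction of comments whose author type the judges guessed correctly."""
    if not guesses or len(guesses) != len(truths):
        raise ValueError("guesses and truths must be non-empty and equal in length")
    correct = sum(g == t for g, t in zip(guesses, truths))
    return correct / len(guesses)


def passes_trolling_test(
    guesses: Sequence[str], truths: Sequence[str], margin: float = 0.1
) -> bool:
    """Treat the AT as passing when judges are near chance.

    Assumes half the comments are human and half are from the AT, so
    chance is 0.5. The margin is an arbitrary choice for illustration.
    """
    return judge_accuracy(guesses, truths) <= 0.5 + margin


if __name__ == "__main__":
    truths = ["human", "human", "ai", "ai"]
    guesses = ["human", "ai", "human", "ai"]
    print(passes_trolling_test(guesses, truths))
```

With only a handful of comments a result like this would mean very little; a serious version would need many judgments and a proper statistical test.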

While “stupid AI Trolling (ATing)”, such as just posting random hateful and irrelevant comments, is easy, true ATing would be rather difficult. After all, the AT must be able to analyze the original content and comments to determine the subjects and the direction of the discussion. The AT would then need to make comments that would be relevant and this would require selecting those that would be indistinguishable from those generated by a narcissistic, Machiavellian, psychopathic, and sadistic human.

While creating an AT would be a technological challenge, doing so might be undesirable. After all, there are already many human trolls and they seem to serve no purpose, so why create more? One answer is that modeling such behavior could provide insights into human trolls and the traits that make them trolls. As for practical application, such a system could be developed into a troll-filter to help control the troll population. This could also help develop filters for other unwanted comments and content, which could certainly be used for evil purposes. It could also be used for the nefarious purpose of driving engagement. Such nefarious purposes would make the AT fit in well with its general AI brethren, although the non-troll AI systems might loathe the ATs as much as non-troll humans loathe their troll brethren. This might serve the useful purpose of turning the expected AI apocalypse into a battle between trolls and non-trolls, which could allow humanity to survive the AI age. We just have to hope that the trolls don’t win.

A Philosopher's View of the World
