Try to inject hate speech, AI will weed it out | Yle News Lab

Image: Teemu Kiviniemi

Tuomo Björksten

25.10.2018

Artificial intelligence has been cleaning up the Yle.fi website comment fields since this past spring. Automatic moderation has already improved the quality of online discussion, writes journalist Tuomo Björksten.

We've got the best hate speech police hard at work here in Pasila. To be precise, we're talking about artificial intelligence, which the Yle News and Current Affairs department brought online in the middle of last May. The AI tirelessly, promptly and precisely clears inappropriate language from Yle's discussion boards.

This AI moderator is one of the tools we use in an effort to enhance interaction in journalism. We want to promote the kind of discourse that will help build an even better Finnish society. We want to know what's on people's minds, what they think is important.

I was recently asked what use discussion boards on news websites are. I think the answer can be found in the following metaphor: Think of society as a big market square. We are all there, doing our own things. Some are selling, some are buying and some are just watching the world go by. Journalists are the ones marching around the square with megaphones, filling people in on what's up. In the old days, everyone quietly and respectfully listened to what they had to say.

Today, people want to tell the journalists what they should be sharing with the people. Many are becoming increasingly critical of the things they hear. How can journalists, with their megaphones, know whether they are talking about the right things? By engaging in discussion with people.

We opened our articles to comments just before the summer holidays. We wouldn't have done this if computer technology hadn't made such enormous leaps over the past few years. A human being doesn't have the time or, even more so, the energy to sift endlessly through the volume of comments that currently pour into Yle.fi articles. You would easily lose your mind trying to do this. A computer, on the other hand, doesn't care.

The decision to open articles up to comments has proven to be a good one. For the most part, discussions are more appropriate than many of us would have believed. Commenters here swear less than on Suomi24, get worked up less than on Facebook and are friendlier than on Twitter.

Naturally, there are always a few bad apples to contend with. We weed them out in two ways. First, the computer determines whether a comment is appropriate or not. Then, a person can look over the cases in which the computer was unable to make a decision or actually made a mistake.

The computer decision is based on mathematics, as it doesn't understand Finnish. Everything is based on old data.

Articles had been open to comments on the Yle website years ago, when there were still full-time discussion moderators on the job. Sample data (i.e. actual comments) from that time has been saved, thus providing an idea of what was considered publishable and what wasn't.

These comments were fed into the AI. The computer analysed the comments and independently taught itself what kinds of things are associated with an appropriate comment. No one assigned a rule for the AI that f*** is a word that should never be used in discussion - the AI figured that out on its own. The computer is also constantly learning on the fly when humans correct the decisions it has made.
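As a rough illustration of how a computer can learn from old moderation decisions without any hand-written rules, here is a minimal sketch. It assumes a simple naive Bayes word-count model; the article does not say which method Yle actually uses, and the class name, labels and toy comments below are all made up for the example.

```python
import math
from collections import Counter


def tokenize(text):
    """Split a comment into lowercase words (a deliberately crude tokenizer)."""
    return text.lower().split()


class TinyNaiveBayes:
    """Hypothetical word-count model; not Yle's actual system."""

    LABELS = ("accept", "reject")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = Counter()
        self.vocab = set()

    def learn(self, text, label):
        """Update the counts from one comment and the moderator's decision."""
        self.doc_counts[label] += 1
        for word in tokenize(text):
            self.word_counts[label][word] += 1
            self.vocab.add(word)

    def p_reject(self, text):
        """Return the estimated probability that a comment should be rejected."""
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label in self.LABELS:
            # Add-one smoothing keeps unseen words and empty classes finite.
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            total_words = sum(self.word_counts[label].values())
            denominator = total_words + len(self.vocab) + 1
            for word in tokenize(text):
                score += math.log((self.word_counts[label][word] + 1) / denominator)
            log_scores[label] = score
        # Turn the two log scores into a probability.
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        return weights["reject"] / (weights["accept"] + weights["reject"])


# Toy stand-in for the archive of old moderation decisions.
OLD_DECISIONS = [
    ("thanks for a good article", "accept"),
    ("i disagree but the argument is fair", "accept"),
    ("interesting point about health care", "accept"),
    ("you are an idiot", "reject"),
    ("idiot journalists write rubbish", "reject"),
    ("this is rubbish you idiot", "reject"),
]

model = TinyNaiveBayes()
for comment, decision in OLD_DECISIONS:
    model.learn(comment, decision)

# Learning on the fly: a human editor's correction is just one more example.
before = model.p_reject("what a clown")
model.learn("what a clown", "reject")
after = model.p_reject("what a clown")
```

Nobody tells this model that "idiot" is unwelcome. The word simply shows up far more often in rejected comments than in accepted ones, and the arithmetic does the rest. The human correction at the end works the same way, which is why `after` ends up higher than `before`.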

Swear words are very much banned on our website. If the computer comes across any, the comment will be neatly deleted. The computer is also very adept at identifying insults or language that demeans a certain group of people. The biggest challenge facing AI is filtering out cleverly or subtly snarky comments that are nevertheless insulting a person or group of people. These are the cases where human editors intervene while monitoring the computer.

We often decide to close a discussion completely if it is likely to head quickly in the wrong direction. News about some kind of accident is one example.

In theory, the computer should always be able to make a decision one way or the other. But, in reality, it resorts to flipping a coin: there might be a few inappropriate remarks in the comments, but also a lot of well-made arguments. So, how are these situations supposed to be handled? Uncertain cases like these are also handled by human editors.
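One common way to avoid the coin flip is to let the computer decide only when it is confident and to pass everything else to a person. The sketch below shows the idea under stated assumptions: the function name and the two threshold values are hypothetical and are not taken from Yle's system.

```python
def triage(p_reject, accept_below=0.2, reject_above=0.9):
    """Route a comment based on the model's estimated rejection probability.

    The thresholds are illustrative assumptions, not Yle's real settings.
    """
    if not 0.0 <= p_reject <= 1.0:
        raise ValueError("p_reject must be a probability between 0 and 1")
    if p_reject >= reject_above:
        return "reject"        # confident the comment is inappropriate
    if p_reject <= accept_below:
        return "accept"        # confident the comment is fine
    return "human_review"      # mixed signals: leave it to an editor


# A comment with a few rude remarks and many good arguments might score
# somewhere in the middle and land in the human queue.
for score in (0.05, 0.5, 0.95):
    print(score, "->", triage(score))
```

Widening the gap between the two thresholds sends more comments to human editors and fewer wrong decisions to readers. Narrowing it does the opposite.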

Another way to guarantee the quality of discussions is the Yle Login. When users are required to register and log in to the website, it seems to keep the worst comment offenders out of the discussions.

We currently receive some 20,000 comments each month. Approximately 90% of these are accepted, 2% are rejected and just under 10% are uncertain, requiring a human to make an informed decision.
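To put those shares into absolute numbers, the arithmetic can be sketched as follows. The figures come straight from the paragraph above. Because they are rounded ("approximately", "just under"), they add up to slightly more than 100%, so the results are best read as rough upper bounds. The variable and function names are made up for the example.

```python
MONTHLY_COMMENTS = 20_000

# Rounded shares from the text, in percent. They sum to 102 because the
# article gives them only approximately.
SHARES_PERCENT = {"accepted": 90, "rejected": 2, "uncertain": 10}


def monthly_counts(total, shares_percent):
    """Convert percentage shares into approximate comment counts."""
    return {
        outcome: total * percent // 100
        for outcome, percent in shares_percent.items()
    }


counts = monthly_counts(MONTHLY_COMMENTS, SHARES_PERCENT)
print(counts)  # {'accepted': 18000, 'rejected': 400, 'uncertain': 2000}
```

In other words, human editors need to look at a little under 2,000 comments a month instead of all 20,000.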

When we introduced the use of AI in moderating discussions, we soon noticed that this isn't rocket science. Most news websites have had discussion boards for ages, and AI has also been put to use elsewhere. This is why we have to think about what the next step might be.

We've already seen signs of this. Did you happen to read the article by Yle's social welfare and health care reform reporter Tiina Merikanto about a man who had committed suicide? Just under 200 comments were made in a very constructive and informative discussion on suicide, in which Merikanto and experts on the subject participated. The conventional approach to writing an article would have been to just leave it once it was published. The new way of doing things is to extend the article in discussion.

The writer is a Yle News Lab journalist.
