Personalized warnings could reduce hate speech on Twitter, researchers say

A set of carefully worded warnings could help reduce hate speech on Twitter. That's the conclusion of new research that examined whether targeted warnings could curb hateful tweets on the platform.

Researchers at the Center for Social Media and Politics at New York University found that sending users personalized warnings about the consequences of their behavior reduced the number of hate-filled tweets they posted. The experiment suggests there may be a path forward for platforms trying to reduce the use of hate speech among their users.

The researchers first identified accounts at risk of being suspended for breaking the platform's rules against hate speech: users who had tweeted at least one word from a hate speech dictionary in the previous week and who also followed at least one account that had recently been suspended.
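The selection criteria described above amount to a simple two-part filter. The sketch below is purely illustrative, not the authors' code; the lexicon entries, tweet lists, and account sets are hypothetical stand-ins for the study's actual data.

```python
# Illustrative sketch of the study's selection criteria (hypothetical data,
# not the researchers' actual code or lexicon).

HATE_LEXICON = {"slur1", "slur2"}  # stand-in for a real hate speech dictionary


def is_at_risk(recent_tweets, followed_accounts, suspended_accounts):
    """Return True if the user tweeted at least one lexicon word in the
    past week AND follows at least one recently suspended account."""
    used_hate_word = any(
        word in HATE_LEXICON
        for tweet in recent_tweets
        for word in tweet.lower().split()
    )
    follows_suspended = bool(followed_accounts & suspended_accounts)
    return used_hate_word and follows_suspended
```

Both conditions must hold: tweeting a flagged word alone, or merely following a suspended account, is not enough to mark a user as at risk under this rule.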

The researchers then created test accounts with personas such as "hate speech warner" and used them to send warnings to those users. Every warning carried the same core message: using hate speech puts you at risk of suspension, and that has already happened to someone you follow.

One sample message shared in the paper told the recipient that an account they follow had been suspended for the language it used, and warned that continuing to use hate speech could lead to a temporary suspension. The account sending the warning identified itself as a professional researcher while letting the recipient know they were at risk of being suspended. Yildirim, one of the paper's authors, tells Engadget that they tried to make the warnings as convincing as possible.

The warnings were effective, at least in the short term. The authors write that a single warning message, sent from an account with no more than 100 followers, can decrease the ratio of a user's tweets containing hateful language. Messages that were more politely phrased led to a decrease of up to 20 percent. Yildirim says they tried to increase politeness by telling recipients that they respected their right to free speech, but that their hate speech might harm others.

The test accounts Yildirim and his co-authors used had only around 100 followers each and weren't associated with any authoritative entity. If the same type of warning came from a more authoritative source, it could be even more effective. Yildirim says the real mechanism at play could be the knowledge that some account is watching and monitoring their behavior: the sense that their use of hate speech could be seen by someone else may itself have contributed to the decrease.