Instagram Unleashes an AI System to Blast Away Nasty Comments


http://ift.tt/2tmpXjE

Every word has at least one meaning when it stands alone. But the meaning can change depending on context, or even over time. A sentence full of neutral words can be hostile (“Only whites should have rights”), and a sentence packed with potentially hostile words (“Fuck what, fuck whatever y'all been wearing”) can be neutral when you recognize it as a Kanye West lyric.

Humans are generally good at this kind of parsing, and machines are generally bad. Last June, however, Facebook announced that it had built a text classification engine to help machines interpret words in context. The system, called DeepText, is based on recent advances in artificial intelligence and a concept called word embeddings, which means it looks at every word in the context of all the words that appear near it. White, for instance, means something completely different when it’s near the words snow, Sox, House, or power. DeepText is designed to operate the way a human thinks, and to improve over time, like a human too.

DeepText was designed as an in-house tool that would let Facebook engineers quickly sort through massive amounts of text, create classification rules, and then build products to help users. If you’re on Facebook talking about the White Sox, you might want a box score. If you’re talking about the White House, you might want to read the news. If you use the word near snow, you might want to buy boots, unless you also use the words seven and dwarfs. If you’re talking about white power, maybe you shouldn’t be on the platform. Getting access to DeepText, as Facebook explains it, is akin to getting a lesson in spear fishing (and a really good spear). Then the developers wade out into the river. About a quarter of the engineering teams at Facebook work with DeepText in one way or another.

Almost immediately after learning about DeepText, executives at Instagram—which Facebook acquired in 2012—saw an opportunity to combat one of the scourges of its platform: spam. People come to Instagram for the photographs, but then they often leave because of the layers of malarkey underneath, where bots (and sometimes humans too) pitch products, ask for follows, or just endlessly repeat the word succ.

Instagram’s first step was to hire a team of men and women to sort through comments on the platform and to classify them as spam or not spam. This kind of job, which is roughly the social media equivalent of being asked to dive onto a grenade, is common in the technology industry. Humans train machines to perform monotonous or even demoralizing tasks, which the machines will ultimately do better. If the humans do the job well, they lose the work. In the meantime, however, everyone else’s feeds get saved.

After the contractors had sorted through massive piles of bilge, buffoonery, and low-grade extortion, four-fifths of the data was fed into DeepText. Then Instagram’s engineers worked to create algorithms to try to classify spam correctly. The system analyzed the semantics of each sentence, and also took the source into account. A note from someone you don’t follow is more likely to be spam than one from someone you do; a comment repeated endlessly on Selena Gomez’s feed probably isn’t being made by a human. The algorithms that resulted were then tested on the one-fifth of the data that hadn’t been given to DeepText, to see how well the machines had matched the humans. Eventually, Instagram became satisfied with the results, and the company quietly launched the product last October. Spam began to vanish as the algorithms did their work, circling like high-IQ Roombas let loose in an apartment overrun with nasty dust bunnies.
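The four-fifths/one-fifth procedure described above is a standard held-out evaluation. Instagram has not published its code, so the following Python is only a minimal sketch under assumptions: the function names, the toy keyword classifier, and the sample comments are all hypothetical stand-ins, not DeepText.

```python
import random

def split_labeled_comments(labeled, train_fraction=0.8, seed=0):
    """Shuffle human-labeled comments and split them into training and held-out sets."""
    shuffled = list(labeled)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def agreement_rate(classifier, held_out):
    """Fraction of held-out comments where the classifier matches the human label."""
    if not held_out:
        return 0.0
    matches = sum(1 for text, label in held_out if classifier(text) == label)
    return matches / len(held_out)

def toy_classifier(text):
    # Hypothetical stand-in for a trained model: flags comments that ask for follows.
    return "spam" if "follow" in text.lower() else "not_spam"

# Made-up labeled data, standing in for the contractors' ratings.
labeled = [(f"follow me for deals {i}", "spam") for i in range(5)]
labeled += [(f"great photo {i}", "not_spam") for i in range(5)]

train, held_out = split_labeled_comments(labeled)
print(len(train), len(held_out), agreement_rate(toy_classifier, held_out))
```

The point of holding back one-fifth is that the score is measured on comments the model never saw during training, so it says something about how the system would treat new comments rather than ones it has memorized.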

Instagram won’t say exactly how much the tool reduced spam, or divulge the inner secrets of how the system works. Reveal your defenses to a spammer and they’ll figure out how to counterpunch. But Kevin Systrom, Instagram’s CEO, was delighted. He was so delighted, in fact, that he decided to try using DeepText on a more complicated problem: eliminating mean comments. Or, more specifically, eliminating comments that violate Instagram’s Community Guidelines, either specifically or, as a spokesman for the company says, “in spirit.” The Guidelines serve as something like a constitution for the social media platform. Instagram publishes a 1,200-word version publicly, asking people to be always respectful and never naked, and has a much longer set that employees use as a guide.

Once again, a team of contractors got to work. A rater looks at a comment and determines whether it is appropriate. If it’s not, the rater sorts it into a category of verboten behavior, like bullying, racism, or sexual harassment. The raters, all of whom are at least bilingual, have analyzed roughly two million comments, and each comment has been rated at least twice. Meanwhile, Instagram employees have been testing the system internally on their own phones, and the company has been adjusting the algorithms: selecting and modifying ones that seem to work and discarding ones that don’t. The machines give each comment a score between 0 and 1, which is a measure of Instagram’s confidence that the comment is offensive or inappropriate. Above a certain threshold, the comment gets zapped. As with spam, the comments are rated based on both a semantic analysis of the text and factors such as the relationship between the commenter and the poster, as well as the commenter’s history. Something typed by someone you’ve never met is more likely to be graded poorly than something typed by a friend.
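To make the scoring step concrete, here is a rough Python sketch of a score between 0 and 1 that combines a text signal with relationship and history signals, then applies a cutoff. Everything specific in it is an assumption for illustration: the field names, the weights (0.1 and 0.05), and the 0.8 threshold are invented, since Instagram does not disclose how its system combines these factors.

```python
from dataclasses import dataclass

@dataclass
class Comment:
    text_score: float            # hypothetical semantic score in [0, 1] from a text model
    commenter_is_followed: bool  # does the poster follow the commenter?
    prior_violations: int        # hypothetical count from the commenter's history

def offensiveness_score(comment):
    """Combine the signals into one confidence value clamped to [0, 1]."""
    score = comment.text_score
    if not comment.commenter_is_followed:
        score += 0.1  # assumed penalty for strangers
    score += 0.05 * min(comment.prior_violations, 4)  # assumed, capped history penalty
    return max(0.0, min(1.0, score))

def should_hide(comment, threshold=0.8):
    """Hide the comment only when the score rises above the threshold."""
    return offensiveness_score(comment) > threshold

friend = Comment(text_score=0.75, commenter_is_followed=True, prior_violations=0)
stranger = Comment(text_score=0.75, commenter_is_followed=False, prior_violations=0)
print(should_hide(friend), should_hide(stranger))
```

In this sketch the same words get through from a friend but are hidden from a stranger, which mirrors the behavior the article describes, though the real system presumably learns such effects from data rather than using hand-set weights.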

This morning, Instagram will announce that the system is going live. Type something mean or hostile or harassing, and, if the system works, it should disappear. (The person who typed it will still see it on their own phone, which is one of the ways Instagram is trying to make the process hard to game.) The technology will be automatically incorporated into people’s feeds, but it will also be easy to turn off: just click the ellipsis in the settings menu and then click Comments. The filter will only be available in English at first, but other languages will follow. Meanwhile, Instagram is also announcing that it is expanding its robot spam filter to work in nine languages: English, Spanish, Portuguese, Arabic, French, German, Russian, Japanese, and Chinese.

Some hateful comments will get through; it’s the internet after all. The new risk, of course, is false positives: innocuous or even helpful comments that the system deletes. Thomas Davidson, who helped build a machine-learning system to identify hate speech on Twitter, points out how hard the problem that Instagram’s trying to solve really is. Machines are smart, but they can be tripped up by words that mean different things in different languages or different contexts. Here are some benign tweets that his system falsely identified as hateful:

“I didnt buy any alcohol this weekend, and only bought 20 fags. Proud that I still have 40 quid tbh”

“Intended to get pics but didn't have time.. Must be a mud race/event here this weekend.. Is like a redneck convoy out there”

“Alabama is overrated this yr the last 2 weeks has shown too many chinks in their armor WV gave them hell too.”

When asked about these particular sentences, Instagram didn’t respond specifically. The company just noted that there would be errors. The system is based on the judgment of the original raters, and all humans make mistakes. Algorithms are flawed too, and they can have biases built in because of the data they were trained on. Furthermore, the system is built to be wrong 1 percent of the time, which isn’t much, but it isn’t zero either. Before the launch, I asked Systrom whether he struggled with the choice between making the system aggressive, which would mean blocking stuff that it shouldn’t, or passive, which would mean the opposite.
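The aggressive-versus-passive choice can be pictured as moving a single cutoff on the confidence score. The Python below is an illustrative sketch only: the scores and labels are made up, and the function name is hypothetical. It counts false positives (harmless comments hidden) and false negatives (bad comments let through) at two cutoffs.

```python
def error_counts(scored, threshold):
    """scored is a list of (score, is_actually_bad) pairs.

    Returns (false_positives, false_negatives) when every comment
    scoring above the threshold is hidden.
    """
    false_positives = sum(1 for s, bad in scored if s > threshold and not bad)
    false_negatives = sum(1 for s, bad in scored if s <= threshold and bad)
    return false_positives, false_negatives

# Made-up scores paired with what a human rater would have decided.
scored = [
    (0.95, True),
    (0.85, True),
    (0.70, False),  # friendly teasing that merely looks hostile
    (0.60, True),   # a genuinely bad comment phrased mildly
    (0.30, False),
    (0.10, False),
]

for threshold in (0.5, 0.8):
    print(threshold, error_counts(scored, threshold))
# 0.5 (1, 0): the aggressive setting hides the friendly teasing
# 0.8 (0, 1): the passive setting lets the mild-sounding bad comment through
```

Lowering the cutoff trades one kind of mistake for the other, which is the margin-of-error question Systrom describes next.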

“It’s the classic problem,” he responded. “If you go for accuracy, you misclassify a bunch of stuff that was actually pretty good. So, you know, if you’re my friend and I’m just joking around with you, Instagram should let that through because you’re just joking around and I’m just giving you a hard time.… The thing we don’t want to do is have any instance where we block something that shouldn’t be blocked. The reality is it’s going to happen, so the question is: Is that margin of error worth it for all the really bad stuff that’s blocked?” He then added, “We’re not here to curb free speech. We’re not here to curb fun conversations between friends. But we are here to make sure we’re attacking the problem of bad comments on Instagram.”

If Systrom’s right, and the system works, Instagram could become one of the friendliest places on the internet. Or maybe it will seem too polished and controlled. Or maybe the system will start deleting friendly banter or political speech. Systrom is eager to find out. “The whole idea of machine learning is that it’s far better about understanding those nuances than any algorithm has in the past, or than any single human being could,” he says. “And I think what we have to do is figure out how to get into those gray areas and judge the performance of this algorithm over time to see if it actually improves things. Because, by the way, if it causes trouble and it doesn’t work, we’ll scrap it and start over with something new.”






via https://www.wired.com

June 28, 2017 at 06:57PM
