For the uninitiated, adversarial data describes a situation in which human users intentionally supply an algorithm with corrupted information. The corrupted data throws off the machine learning process, tricking the algorithm into reaching false conclusions or making incorrect predictions.
A malicious attack of this nature could easily result in a fatal accident. Similarly, compromised algorithms could lead to faulty biomedical research, endangering lives or delaying life-saving innovations.
Adversarial data has only recently begun to be recognized for the threat it is, and it can no longer be overlooked.
How does adversarial data occur?
In the study, researchers separated “robust” and “non-robust” characteristics during AI learning. Robust features are what humans typically perceive, while non-robust features are only detected by AI. An attempt at having an algorithm recognize pictures of cats revealed that the system was looking at real patterns present in the images to draw incorrect conclusions.
The misidentification occurred because the AI was looking at a set of pixels, imperceptible to humans, that led it to improperly identify photos. As a result, the system was inadvertently trained to use misleading patterns in its identification algorithm.
These non-robust characteristics served as a type of interfering “noise” that led to flawed results from the algorithm. As a result, for hackers to interfere with AI, they often simply need to introduce a few non-robust characteristics – things that aren’t easily identified by human eyes, but that can dramatically change AI output.
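To make the point above concrete, here is a minimal sketch of how a change too small to notice per pixel can still flip a classifier's answer. It is a toy example, not the setup used in the study: the linear "cat detector", its weights, the 100-pixel image and the `epsilon` budget are all hypothetical, and the perturbation rule is a simple sign-based nudge in the spirit of the fast gradient sign method.

```python
def sign(value):
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def cat_score(weights, pixels):
    """Toy linear classifier: a positive score means 'cat'."""
    return sum(w * x for w, x in zip(weights, pixels))


def perturb(weights, pixels, epsilon):
    """Nudge every pixel by at most epsilon in the direction that lowers the score."""
    return [x - epsilon * sign(w) for w, x in zip(weights, pixels)]


# Hypothetical weights and a hypothetical 100-pixel "image" (values in [0, 1]).
weights = [0.05 if i % 2 == 0 else -0.05 for i in range(100)]
pixels = [0.51 if i % 2 == 0 else 0.49 for i in range(100)]

# Each pixel moves by no more than 0.02, a change a person would not notice.
adversarial = perturb(weights, pixels, epsilon=0.02)
largest_change = max(abs(a - p) for a, p in zip(adversarial, pixels))

print("clean image classified as cat:", cat_score(weights, pixels) > 0)
print("perturbed image classified as cat:", cat_score(weights, adversarial) > 0)
print("largest single-pixel change:", largest_change)
```

Because the tiny nudges all push the score in the same direction, they add up across the whole image even though no single pixel changes much.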
Potential consequences of adversarial data and Dark AI
In a biomedical setting, the field I am most familiar with, attacks could induce an algorithm to incorrectly label harmful or contaminated samples as clean and benign. This can lead to misguided research results or incorrect medical diagnoses.
Machine learning can be especially efficient at breaking into unsecured IoT devices, giving hackers an easier avenue to stealing confidential data, manipulating company databases and more. Essentially, a “dark AI” tool could be used to infect or manipulate other AI programs with adversarial data. SMBs – small to medium sized businesses – are often at a higher risk of these attacks because they do not have cybersecurity measures that are as advanced.
Despite these issues, adversarial data can also be used for good. Indeed, many developers have begun using adversarial data to uncover system vulnerabilities on their own, allowing them to implement security upgrades before hackers can take advantage of the weakness. Other developers are using machine learning to create AI systems that are more adept at identifying and eliminating potential digital threats.
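As a rough illustration of that kind of self-testing, the sketch below probes a model with worst-case perturbations of increasing size and reports the smallest one that changes its answer. Everything here is an assumption made for illustration: the two-feature linear model, the function names and the candidate budgets are hypothetical, and real audits use far richer models and attacks.

```python
def sign(value):
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def label(weights, bias, features):
    """Toy linear model: label 1 if the score is positive, otherwise 0."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


def worst_case(weights, features, epsilon, current_label):
    """Shift each feature by epsilon in the direction that pushes against current_label."""
    direction = -1 if current_label == 1 else 1
    return [x + direction * epsilon * sign(w) for w, x in zip(weights, features)]


def smallest_flipping_epsilon(weights, bias, features, candidates):
    """Return the smallest candidate budget that flips the label, or None if none does."""
    original = label(weights, bias, features)
    for epsilon in sorted(candidates):
        attacked = worst_case(weights, features, epsilon, original)
        if label(weights, bias, attacked) != original:
            return epsilon
    return None


# Hypothetical model and input.
weights, bias = [2.0, -1.0], 0.0
sample = [1.0, 0.5]

print(smallest_flipping_epsilon(weights, bias, sample, [0.1, 0.3, 0.6, 1.0]))
```

A small result signals an input the model handles fragilely, which is the kind of weakness a developer would want to find before an attacker does.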
He continues, “This approach differs from traditional IT security, which has been focused more on identifying specific files and programs known to bear threats – rather than studying how those files and programs behave.”
Of course, improvements in machine learning algorithms themselves can also help reduce some of the risks presented by adversarial data. What is most important, however, is that these systems aren’t left completely on their own. Manual input and human supervision remain essential for identifying discrepancies between robust and non-robust characteristics to ensure that a corrupted reading doesn’t lead to flawed outcomes. Additional training that utilizes real correlations can further reduce AI’s vulnerability.
It is clear that adversarial data will continue to pose a challenge in the immediate future. But in an era when AI is being used to help us better understand the human brain and solve a variety of world problems, the importance of addressing this data-driven threat cannot be overstated. Dealing with adversarial data and taking steps to counter dark AI should become one of the tech world’s top priorities.
This post is part of our contributor series. The views expressed are the author’s own and not necessarily shared by TNW.
Published June 17, 2019 – 20:30 UTC