{"id":123409,"date":"2023-11-07T15:50:50","date_gmt":"2023-11-07T15:50:50","guid":{"rendered":"https:\/\/www.techopedia.com"},"modified":"2023-11-07T15:50:50","modified_gmt":"2023-11-07T15:50:50","slug":"what-happens-when-ai-is-tricked","status":"publish","type":"post","link":"https:\/\/www.techopedia.com\/what-happens-when-ai-is-tricked","title":{"rendered":"What Happens When AI is Tricked?"},"content":{"rendered":"
In some segments of the popular imagination, artificial intelligence<\/a> (AI) is all-knowing all-powerful and will soon rid the planet of inferior biological intelligence. However, the fact is that AI is not all that smart \u2013 in reality, it can be fooled quite easily.<\/p>\n Sometimes, the results can be amusing, but sometimes not. The key determining factor will be the way we develop and implement AI to prevent it from being tricked and from being used to deceive others.<\/p>\n Already, AI has shown that it is perfectly capable of fooling humans using deception, misdirection, and even outright lies. One of the clearest examples is a model called Cicero, developed by Meta to play a world-conquest game called Diplomacy. As outlined in a recent post on The Conversation, Cicero uses lies and deceit to trick other (human) players into believing it was their ally<\/a> when it was conspiring with their enemies.<\/p>\n Elsewhere, large language models<\/a> (LLMs) like ChatGPT<\/a> have successfully convinced people and bot-checker apps like Captcha that they were real humans, not just through simple mimicry but by intentionally lying about it.<\/p>\n In response to this subterfuge, many organizations are turning to AI to help determine if text, speech, or other content has or has not been generated by AI.<\/p>\n High schools, universities, and other educational institutions, for example, routinely subject written documents like term papers to AI-powered inspection. But even here, the AI-detection models are proving frustratingly easy to fool<\/a>. As TechHQ.com showed recently, many can be defeated simply by making minor changes to the AI-generated text.<\/p>\n The group tested five leading AI content detectors, all performing very well when examining unaltered AI text. However, with some very light editing \u2013 breaking sentences in half, rearranging a few words \u2013 the certainty of their conclusions started diminishing. And when actual typos were introduced, some detectors increased their probability that it was human-generated content.<\/p>\n Sometimes, however, deceiving AI can be seen in a positive light, depending on what a given model is trained to do. A new tool called Nightshade, developed at the University of Chicago, is designed to thwart intelligent programs that scour the web to steal copywritten visual content<\/a>, such as artwork and photographs. It introduces “prompt-specific poisoning attacks” that trick the model into classifying an image as something else. Instead of a building, for example, the image is recorded as an animal or plant.<\/p>\n This effectively destabilizes the model’s training, making it useless when tasked with creating a desired image. Creator Ben Zhao claims only a few hundred false images can permanently disrupt a model, even those built on popular platforms like DALL-E<\/a>, MidJourney<\/a>, and Stable Diffusion<\/a>. Ultimately, the goal is to provide a digital means of protecting intellectual property from those who would use it to create AI-generated content.<\/p>\n Outwitting AI is likely to become a central facet in the ongoing cyberwars as well, and this is where even seemingly innocuous tools can be turned into weapons<\/a>. The University of Sheffield recently conducted several tests on text-to-SQL systems commonly used in large language model training to translate human questions into database queries.<\/p>\n Depending on the wording of the text, these programs showed a propensity to generate code that can steal data, issue malicious code, and even launch Denial-of-Service<\/a> attacks.<\/p>\n In some cases, these results arise without the understanding or even knowledge of the person who made the query. A nurse looking to access clinical records, for example, could alter a database in ways that jam up its management software.<\/p>\nIntelligent Deception<\/span><\/h2>\n
Fool AI Once…<\/span><\/h2>\n
Poison AI and False Imagery<\/span><\/h2>\n
Cyber Trickery<\/span><\/h2>\n