Recently, researchers asked two versions of OpenAI's ChatGPT artificial intelligence chatbot where Massachusetts Institute of Technology professor Tomás Lozano-Pérez was born.
One bot said Spain and the other said Cuba. Once the researchers prompted the two bots to debate their answers, the one that said Spain quickly apologized and agreed with the one that had the correct answer, Cuba.
The finding, in a paper released by a team of MIT researchers last week, is the latest potential breakthrough in helping chatbots arrive at correct answers. The researchers proposed using different chatbots to produce multiple answers to the same question and then letting them debate each other until one answer won out. The researchers found that using this "society of minds" method made the bots' answers more factual.
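The paper's exact implementation isn't reproduced here, but a minimal sketch of such a debate loop might look like the following. The `ask_model` function is a hypothetical stand-in for any chat-model call, and the prompts and round counts are illustrative assumptions, not the researchers' own.

```python
# Minimal sketch of a multi-agent "debate" loop in the spirit of the MIT
# proposal. `ask_model` is a hypothetical stand-in for any chat-model call;
# the prompts and round counts are illustrative, not the paper's own.

def debate(ask_model, question, n_agents=2, n_rounds=2):
    # Round 0: each agent answers independently.
    answers = [
        ask_model(f"Question: {question}\nGive a concise answer.")
        for _ in range(n_agents)
    ]

    # Later rounds: each agent sees the others' answers and may revise its own.
    for _ in range(n_rounds):
        new_answers = []
        for i in range(n_agents):
            others = "\n".join(a for j, a in enumerate(answers) if j != i)
            prompt = (
                f"Question: {question}\n"
                f"Your previous answer: {answers[i]}\n"
                f"Other agents answered:\n{others}\n"
                "Considering their reasoning, give your updated answer."
            )
            new_answers.append(ask_model(prompt))
        answers = new_answers

    # By the final round the agents have often converged; a caller can pick
    # the consensus answer, for example by majority vote.
    return answers
```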
"Language models are trained to predict the next word," said Yilun Du, a researcher at MIT who was previously a research fellow at OpenAI, and one of the paper's authors. "They are not trained to tell people they don't know what they're doing." The result is bots that act like precocious people-pleasers, making up answers instead of admitting they simply don't know.
The researchers' creative approach is just the latest attempt to solve one of the most pressing concerns in the exploding field of AI. Despite the incredible leaps in capabilities that "generative" chatbots like OpenAI's ChatGPT, Microsoft's Bing and Google's Bard have demonstrated in the last six months, they still have a fatal flaw: They make stuff up all the time.
Figuring out how to prevent or fix what the field is calling "hallucinations" has become an obsession among many tech workers, researchers and AI skeptics alike. The issue is mentioned in dozens of academic papers posted to the online database arXiv, and Big Tech CEOs like Google's Sundar Pichai have addressed it repeatedly. As the tech gets pushed out to millions of people and integrated into critical fields including medicine and law, understanding hallucinations and finding ways to mitigate them has become even more crucial.
Most researchers agree the problem is inherent to the "large language models" that power the bots because of the way they're designed. They predict what the most apt thing to say is based on the huge amounts of data they've digested from the internet, but don't have a way to understand what is factual or not.
Still, researchers and companies are throwing themselves at the problem. Some firms are using human trainers to rewrite the bots' answers and feed them back into the machine with the goal of making them smarter. Google and Microsoft have started using their bots to give answers directly in their search engines, but still double check the bots with regular search results. And academics around the world have suggested myriad clever ways to decrease the rates of false answers, like MIT's proposal to get multiple bots to debate each other.
The drive to rein in hallucinations is urgent for a reason.
Already, when Microsoft launched its Bing chatbot, it quickly started making false accusations against some of its users, like telling a German college student that he was a threat to its safety. The bot adopted an alter-ego and started calling itself "Sydney." It was essentially riffing off the student's questions, drawing on all the science fiction it had digested from the internet about out-of-control robots.
Microsoft eventually had to limit the number of back-and-forths a bot could have with a human to keep such episodes from recurring.
In Australia, a government official threatened to sue OpenAI after ChatGPT said he had been convicted of bribery, when in reality he was a whistleblower in a bribery case. And last week, a lawyer admitted to using ChatGPT to generate a legal brief after he was caught: the cases the bot cited so confidently simply did not exist, according to the New York Times.
Even Google and Microsoft, which have pinned their futures on AI and are in a race to integrate the tech into their wide range of products, have missed hallucinations their bots made during key announcements and demos.
None of that is stopping the companies from rushing headlong into the space. Billions of dollars in investment are going into developing smarter and faster chatbots, and companies are beginning to pitch them as replacements or aids for human workers. Earlier this month, OpenAI CEO Sam Altman testified before Congress, saying AI could "cause significant harm to the world" by spreading disinformation and emotionally manipulating humans. Some companies are already saying they want to replace workers with AI, and the tech also presents serious cybersecurity challenges.
On Tuesday, Altman joined hundreds of other AI researchers and executives, including some senior leaders from Google and Microsoft, in signing a statement saying AI poses an existential risk to humanity on par with pandemics and nuclear war.
Hallucinations have also been documented in AI-powered transcription services, which have added words to recordings that were never actually spoken. And Microsoft's and Google's use of bots to answer search queries directly, instead of sending traffic to blogs and news stories, could erode the business model of online publishers and content creators who work to produce trustworthy information for the internet.
"No one in the field has yet solved the hallucination problems. All models do have this as an issue," Pichai said in an April interview with CBS. Whether it's even possible to solve it is a "matter of intense debate," he said.
Depending on how you look at hallucinations, they are both a feature and a bug of large language models. Hallucinations are part of what allows the bots to be creative and generate never-before-seen stories. At the same time they reveal the stark limitations of the tech, undercutting the argument that chatbots are intelligent in a way similar to humans by suggesting that they do not have an internalized understanding of the world around them.
"There is nothing in there that tells the model that whatever it's saying should be actually correct in the world," said Ece Kamar, a senior researcher at Microsoft. The model itself also trains on a set amount of data, so anything that happens after the training is done doesn't factor into its knowledge of the world, Kamar said.
Hallucinations are not new. They've been an inherent problem of large language models since their inception several years ago, but other problems, such as the AIs producing nonsensical or repetitive answers, were seen as bigger issues. Once those were largely solved, hallucinations became a key focus for the AI community.
Potsawee Manakul was playing around with ChatGPT when he asked it for some simple facts about tennis legend Roger Federer. It was a straightforward request, easy for a human to look up on Google or Wikipedia in seconds, but the bot kept giving contradictory answers.
"Sometimes it says he won Wimbledon five times; sometimes it says he won Wimbledon eight times," Manakul, an AI researcher at the University of Cambridge and ardent tennis fan, said in an interview. (The correct answer is eight.)
Manakul and a group of other Cambridge researchers released a paper in March suggesting a system they called "SelfCheckGPT" that would ask the same bot a question multiple times, then tell it to compare the different answers. If the answers were consistent, it was likely the facts were correct, but if they were different, they could be flagged as probably containing made-up information.
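The paper's actual scoring uses stronger comparisons (for example, models that judge whether one passage supports another); the sketch below only illustrates the sample-and-compare structure, with a simple word-overlap score and a hypothetical `ask_model` placeholder standing in for a chat-model call.

```python
# Rough illustration of the sample-and-compare idea behind SelfCheckGPT.
# The paper scores sentences with stronger methods (e.g. entailment models);
# the simple word-overlap check below only shows the overall structure, and
# `ask_model` is a hypothetical placeholder for a chat-model call.

def consistency_score(answer, samples, overlap_threshold=0.5):
    """Fraction of re-sampled answers that substantially overlap with `answer`."""
    answer_words = set(answer.lower().split())
    agree = 0
    for sample in samples:
        overlap = len(answer_words & set(sample.lower().split()))
        if overlap / max(len(answer_words), 1) > overlap_threshold:
            agree += 1
    return agree / max(len(samples), 1)

def self_check(ask_model, question, n_samples=5, min_consistency=0.6):
    answer = ask_model(question)                               # answer to verify
    samples = [ask_model(question) for _ in range(n_samples)]  # ask again
    score = consistency_score(answer, samples)
    # Low agreement across samples suggests the answer may be made up.
    return answer, score, score >= min_consistency
```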
When humans are asked to write a poem, they know it's not necessarily important to be factually correct. But when asking them for biographical details about a real person, they automatically know their answer should be rooted in reality. Because chatbots are simply predicting what word or idea comes next in a string of text, they don't yet have that contextual understanding of the question.
"It doesn't have the concept of whether it should be more creative or if it should be less creative," Manakul said. Using their method, the researchers showed that they could eliminate factually incorrect answers and even rank answers based on how factual they were.
A whole new method of AI learning that hasn't been invented yet will likely be necessary to solve the problem entirely, Manakul said. Until then, it can only really be mitigated by building systems on top of the language model.
"Because it blends information from lots of things it will generate something that looks plausible," he said. "But whether it's factual or not, that's the issue."
That's essentially what the leading companies are already doing. When Google generates search results using its chatbot technology, it also runs a regular search in parallel, then compares whether the bot's answer and the traditional search results match. If they don't, the AI answer won't even show up. The company has tweaked its bot to be less creative, meaning it's not very good at writing poems or having interesting conversations, but is less likely to lie.
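Google's pipeline isn't public, but the corroboration idea it describes can be sketched roughly as below; `ask_model`, `web_search` and the overlap threshold are illustrative placeholders, not the company's actual system.

```python
# Rough sketch of corroborating a bot's draft answer against conventional
# search results before showing it. `ask_model`, `web_search` and the overlap
# threshold are illustrative placeholders, not Google's actual pipeline.

def corroborated_answer(ask_model, web_search, query, min_overlap=0.3):
    draft = ask_model(query)
    snippets = web_search(query)          # list of conventional result snippets

    draft_words = set(draft.lower().split())
    supported = any(
        len(draft_words & set(snippet.lower().split())) / max(len(draft_words), 1)
        >= min_overlap
        for snippet in snippets
    )
    # If no conventional result backs up the draft, withhold the AI answer.
    return draft if supported else None
```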
By limiting its search-bot to corroborating existing search results, the company has been able to cut down on hallucinations and inaccuracies, said Google spokeswoman Jennifer Rodstrom. A spokesperson for OpenAI pointed to a paper the company had produced showing that its latest model, GPT-4, produced fewer hallucinations than previous versions.
Companies are also spending time and money improving their models by testing them with real people. A technique called reinforcement learning from human feedback, in which human testers manually improve a bot's answers and then feed them back into the system, is widely credited with making ChatGPT so much better than the chatbots that came before it. Another popular approach is to connect chatbots to databases of factual or more trustworthy information, such as Wikipedia, Google search or bespoke collections of academic articles or business documents.
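That grounding approach, often called retrieval-augmented generation, can be sketched as follows; `retrieve` and `ask_model` are hypothetical placeholders for a search index over trusted documents and a chat-model call, and the prompt wording is an illustrative assumption.

```python
# Simplified sketch of grounding a chatbot in a trusted document store,
# a technique often called retrieval-augmented generation. `retrieve` and
# `ask_model` are hypothetical placeholders for a search index and a
# chat-model call; the prompt wording is an illustrative assumption.

def grounded_answer(ask_model, retrieve, question, k=3):
    passages = retrieve(question, k=k)    # top-k passages from a trusted source
    context = "\n\n".join(passages)
    prompt = (
        "Answer using only the sources below. If they do not contain the "
        "answer, say you don't know.\n\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}"
    )
    return ask_model(prompt)
```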
Some leading AI researchers say hallucinations should be embraced. After all, humans have bad memories as well and have been shown to fill in the gaps in their own recollections without realizing it.
"We'll improve on it but we'll never get rid of it," Geoffrey Hinton, whose decades of research helped lay the foundation for the current crop of AI chatbots, said of the hallucinations problem. He worked at Google until recently, when he quit to speak more publicly about his concerns that the technology may get out of human control. "We'll always be like that and they'll always be like that."
WASHINGTON POST