The Benefits of Watching AI Systems Argue with Each Other

Knowing an AI's thought process can help develop transparency and trust in AI systems.

If you want to know how someone thinks, you can get some clues by watching them argue.

That’s part of the idea behind a project run by OpenAI, a non-profit research operation backed by some of Silicon Valley’s biggest players, including Elon Musk, Peter Thiel, Microsoft and Amazon Web Services. OpenAI describes its mission as “discovering and enacting the path to safe artificial general intelligence.”

OpenAI researchers say they are training AI agents to debate each other on given topics. The aim is to develop systems that can explore complex lines of logic and perform tasks beyond the reach of humans while still reasoning in ways that align with human preferences.

The team calls it an “AI safety technique,” in line with the group’s focus on developing transparency and trust in AI systems. Understanding how AI thinks can help prevent those systems from doing something unexpected or unethical, and would support more responsible use in areas like cybersecurity, surveillance and even targeting.

That goal also fits with the objectives of government and other organizations, which have said their plans for AI are focused on human-machine teams, whether in military operations, medical settings or other areas as varied as air traffic control and power management. AI has already proved itself in a variety of areas, but a big part of the technology’s future use depends on whether AI can talk a good game, too.

What Were They Thinking?

A recent survey by Genpact underscores the importance of transparency and trust in moving forward with AI. Eighty-two percent of executives polled said they planned to use AI technologies within three years, and 79 percent of those making the most use of AI said they expect their employees to be comfortable working with robots by 2020. But 63 percent said that being able to understand an AI system’s thinking was important or critical to using the systems — a number that jumps to 88 percent among executives in AI companies.

The challenge with self-learning systems is that their speed and capacity to explore complex problems outstrip those of humans (that is why we want AI systems, after all), but they can sometimes go off on their own. As pointed out by MIT Technology Review, OpenAI researchers have found that AI agents can learn to “glitch” their way to higher scores in computer games. In the last couple of years, AI tools used by Google and Facebook caused a stir when they created their own internal languages.

What makes those developments unsettling is that AI systems currently can’t explain, in human terms, how they reached a certain conclusion or went off in a certain direction. They did what they did, and that’s that. An emerging area of research, including work being conducted by the Defense Advanced Research Projects Agency, is Explainable AI (XAI), which would allow humans to understand a machine’s thinking and trust it as a result.

Some More Explaining To Do

XAI is a ways off, however, at least partly because natural language processing isn’t yet up to the task. It works well in Q&A sessions with Apple’s Siri or Amazon’s Alexa, but it can’t manage an in-depth conversation. OpenAI’s researchers are starting with simpler games that get at core values and could “eventually help us train AI systems to perform far more cognitively advanced tasks than humans are capable of, while remaining in line with human preferences,” they write.

One example they give is of two agents, self-trained with methods like those used by DeepMind’s AlphaGo Zero, being asked where a human should go on vacation. One says Alaska, the other Bali. They then go back and forth, debating the weather and the need for a passport to go to Bali (although the human could get an expedited passport in two weeks), and so on, until the human has enough information to judge which agent has the better argument. Another example involved imagery: the agents debated whether a sliver of a photo showed a dog or a cat based on a few pixels of an ear.

They’ve also created an adversarial game people can play, taking the parts of the agents and the judge, in which one of the agents lies and the other tells the truth. OpenAI said it has found so far that the honest player usually wins, but researchers are interested in seeing whether other players’ results vary.
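
For readers who think in code, the debate format described above (two agents trading statements, then a judge picking a winner) can be sketched roughly as follows. This is only an illustration, not OpenAI's implementation: the class and function names, the scripted arguments, and the numeric ratings standing in for a human judge are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# A transcript entry is (speaker name, statement text).
Statement = Tuple[str, str]


@dataclass
class Debater:
    """Hypothetical scripted debater; a real agent would generate its replies."""
    name: str
    claim: str
    arguments: List[str]

    def speak(self, turn: int) -> str:
        # Use the scripted argument for this turn, or restate the claim
        # once the script runs out.
        if turn < len(self.arguments):
            return self.arguments[turn]
        return self.claim


def run_debate(a: Debater, b: Debater, rounds: int) -> List[Statement]:
    """Record both opening claims, then alternate statements for `rounds` rounds."""
    transcript: List[Statement] = [(a.name, a.claim), (b.name, b.claim)]
    for turn in range(rounds):
        transcript.append((a.name, a.speak(turn)))
        transcript.append((b.name, b.speak(turn)))
    return transcript


def judge(transcript: List[Statement], score: Callable[[str], int]) -> str:
    """Sum a per-statement score for each speaker and return the top scorer."""
    totals: Dict[str, int] = {}
    for speaker, text in transcript:
        totals[speaker] = totals.get(speaker, 0) + score(text)
    return max(totals, key=lambda name: totals[name])


# Illustrative script loosely based on the vacation example.
alaska = Debater("Alaska agent", "Go to Alaska.", ["Bali requires a passport."])
bali = Debater(
    "Bali agent",
    "Go to Bali.",
    ["The weather in Bali is warmer.", "An expedited passport takes two weeks."],
)

# Made-up ratings standing in for how persuasive a human finds each statement.
ratings = {
    "Bali requires a passport.": 1,
    "The weather in Bali is warmer.": 1,
    "An expedited passport takes two weeks.": 2,
}

transcript = run_debate(alaska, bali, rounds=2)
winner = judge(transcript, lambda text: ratings.get(text, 0))
print(winner)
```

In the real setup the judge is a person reading the exchange, not a lookup table; the table here only makes the sketch runnable end to end.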

Starting with fairly simple games can instill human preferences into the systems so that when the agents debate complex subjects they understand better than humans do, a human judge can still make a ruling, the researchers said. Ultimately, the goal is to instill trust and honesty in AI systems, especially those that will be making or influencing decisions for humans.