AI Can Now Handle Expert Disagreements

Artificial intelligence systems often struggle when experts disagree on how a problem should be modeled. A new approach helps AI make robust decisions even when facing multiple conflicting opinions about how the world works. This breakthrough addresses a fundamental challenge in deploying AI for real-world applications like robotics and healthcare, where different domain experts might have competing views on environmental dynamics.

The researchers developed Multi-Environment POMDPs (ME-POMDPs), which extend traditional partially observable Markov decision processes to handle discrete model uncertainty. These models represent situations where multiple experts provide different transition, observation, or reward functions for the same problem. The goal is to find a single policy that performs well across all possible environments, maximizing the worst-case expected reward.

To solve these complex problems, the team introduced adversarial-belief POMDPs (AB-POMDPs), which they showed are equivalent to one-sided partially observable stochastic games. They proved that any ME-POMDP can be reduced to a simpler version where environments differ only in transitions (creating PO-MEMDPs) or only in observations (creating MO-POMDPs), while preserving optimal policies. The researchers developed exact and approximate algorithms, including AB-HSVI, which combines heuristic search value iteration with linear programming to compute robust policies.

Experimental results on benchmark problems demonstrated the approach's effectiveness. In the Bird preservation problem, inspired by endangered species management, the method handled scenarios where experts disagreed about population dynamics. The RockSample benchmark showed that robust policies achieved values close to individual environment optimals while significantly outperforming policies designed for incorrect environment assumptions. The computational cost increased with the number of environments, but the approach remained practical for moderate problem sizes.

This work matters because real-world AI deployment often involves uncertainty about which expert's model is correct. In healthcare, different doctors might disagree about disease progression. In robotics, engineers might have competing views about environmental dynamics. The new method ensures AI systems won't fail catastrophically if they're operating under the wrong assumptions about their environment.

The main limitation is scalability - computation time increases substantially with more environments. Future work will focus on developing more efficient algorithms and exploring policy-gradient methods for larger problems. The theoretical foundation established here provides a crucial stepping stone toward AI systems that can reliably handle real-world uncertainty and expert disagreement.

AI Can Now Handle Expert Disagreements

About the Author

Guilherme A.