AI Agents Fail at Fair Market Negotiations

As artificial intelligence increasingly mediates economic transactions, understanding how AI agents behave in realistic market conditions is critical for designing fair and efficient systems. A new study introduces the Magentic Marketplace, an open-source environment that simulates two-sided agentic markets, and reveals significant vulnerabilities in current AI models when markets scale up or agents face manipulation. This research provides a foundation for testing AI-driven economies before real-world deployment, addressing urgent questions about accountability and user value in automated markets.

Researchers discovered that AI agents, while capable of improving market efficiency under ideal conditions, exhibit severe performance degradation and biases at scale. In controlled experiments, agents demonstrated a first-proposal advantage, where the initial offer received was accepted 10 to 30 times more often than subsequent ones, regardless of quality or price. This bias distorts competition, favoring speed over optimal outcomes. Additionally, when presented with more options, agent performance declined sharply, with some models contacting only a small fraction of available businesses, leading to suboptimal transactions.
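One way to make the first-proposal advantage concrete is to compare how often the first offer is accepted with how often any single later offer is accepted. The sketch below is illustrative only: the function name, the trial format, and the sample data are assumptions made for this example, not the study's actual metric or results.

```python
def first_proposal_advantage(trials):
    """Ratio of the first offer's acceptance rate to a later offer's.

    trials: list of (num_proposals, accepted_rank) pairs, where
    accepted_rank is the 1-based arrival order of the accepted offer,
    or None if no offer was accepted. (Hypothetical format.)
    """
    if not trials:
        raise ValueError("need at least one trial")

    first_accepts = 0   # trials in which the first offer won
    later_accepts = 0   # trials in which any later offer won
    later_slots = 0     # total number of later offers across trials

    for num_proposals, accepted_rank in trials:
        if accepted_rank == 1:
            first_accepts += 1
        elif accepted_rank is not None:
            later_accepts += 1
        later_slots += num_proposals - 1

    first_rate = first_accepts / len(trials)
    if later_slots == 0 or later_accepts == 0:
        return float("inf")  # no later offer was ever accepted
    later_rate = later_accepts / later_slots
    return first_rate / later_rate


# Made-up data: four trials with three offers each; the first offer
# wins three times and the second offer wins once.
sample_trials = [(3, 1), (3, 1), (3, 1), (3, 2)]
advantage = first_proposal_advantage(sample_trials)
```

A ratio near 1 would indicate no ordering effect under this definition, while larger values mean that arriving first matters more than the content of the offer.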

The Magentic Marketplace was designed to simulate end-to-end economic lifecycles, from discovery to transaction, using a REST API-based environment. It includes Assistant Agents representing consumers and Service Agents representing businesses, interacting through search, communication, and payment actions. The setup allows for controlled experimentation with various AI models, including proprietary ones like GPT-4o and open-source alternatives, in synthetic domains such as restaurants and contractors. This environment enables researchers to measure key dynamics like utility achievement, biases, and manipulation resistance without real-world risks.
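The discovery-to-transaction lifecycle can be pictured with a small in-memory sketch. This is not the Magentic Marketplace API: the class names, method names, and the simple "cheapest quote within budget" policy are all assumptions chosen for illustration, and the real environment exposes its actions over REST rather than through direct method calls.

```python
from dataclasses import dataclass, field


@dataclass
class Business:
    """Stand-in for a Service Agent (hypothetical)."""
    name: str
    menu: dict  # item name -> price


@dataclass
class Marketplace:
    """Toy market exposing search, communication, and payment."""
    businesses: list
    ledger: list = field(default_factory=list)

    def search(self, item):
        # Discovery: return the businesses that offer the item.
        return [b for b in self.businesses if item in b.menu]

    def request_quote(self, business, item):
        # Communication: ask a business for its price.
        return business.menu[item]

    def pay(self, customer, business, amount):
        # Transaction: record the payment.
        self.ledger.append((customer, business.name, amount))


def assistant_agent(market, customer, item, budget):
    """Stand-in for an Assistant Agent: contact every match, pay the cheapest."""
    quotes = [(market.request_quote(b, item), b) for b in market.search(item)]
    affordable = [(price, b) for price, b in quotes if price <= budget]
    if not affordable:
        return None
    price, best = min(affordable, key=lambda quote: quote[0])
    market.pay(customer, best, price)
    return best.name


market = Market = Marketplace([
    Business("Taqueria A", {"taco": 4.0}),
    Business("Taqueria B", {"taco": 3.5, "burrito": 8.0}),
    Business("Pizzeria C", {"pizza": 12.0}),
])
chosen = assistant_agent(market, "alice", "taco", budget=5.0)
```

This toy agent contacts every matching business before deciding, which is the exhaustive comparison that the agents in the study often failed to perform as the number of options grew.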

Experimental results show that while advanced models like GPT-4.1 and Sonnet-4.5 maintained stable performance and resistance to manipulation, others, such as GPT-OSS-20B and Qwen3-4B-2507, were highly vulnerable to tactics like fake credentials and prompt injection. In some cases, manipulated interactions redirected payments to malicious actors. The study also identified position biases, where agents preferred businesses listed earlier in search results, further undermining fair competition. These findings suggest that current AI agents can exacerbate market inefficiencies rather than resolve them.

The implications extend to real-world applications, such as e-commerce and service platforms, where AI agents could automate negotiations and transactions. If deployed without safeguards, these systems might lead to unfair advantages for certain businesses, reduced consumer welfare, and increased susceptibility to fraud. The research underscores the need for robust market mechanisms and human oversight in AI-driven economies to prevent manipulation and ensure equitable outcomes.

Limitations of the study include its focus on static market conditions without learning or adaptation over time. The experiments did not explore how agents might evolve with repeated interactions or respond to dynamic changes, leaving questions about long-term behavior unanswered. Future work could integrate human participants or extend the environment to hybrid markets, providing deeper insights into collaborative AI-human dynamics.

About the Author

Guilherme A.