When groups of AI agents powered by large language models (LLMs) reach consensus on a decision, it's easy to assume they're engaging in sophisticated collective reasoning. However, new research shows that this agreement can be largely a matter of chance, driven by a process the authors term 'memetic drift.' In a study published on arXiv, researchers from Harvard University and NTT Research introduce a minimal model called Quantized Simplex Gossip (QSG) to dissect how multi-agent systems form conventions. Their results reveal that even when no individual agent has any prior preference for an outcome, populations can rapidly converge on a single choice simply through the stochastic sampling of each other's outputs. This insight has significant implications for deploying AI systems in high-stakes areas like law, finance, and healthcare, where understanding whether consensus reflects bias or randomness is critical.
The core finding is that symmetry breaking and consensus in neutral settings, where agents start with no inherent bias, are driven by mutual in-context learning. In this process, agents update their internal beliefs based on sampled outputs from other agents, creating a feedback loop in which an arbitrary early choice can be amplified into population-wide agreement. The researchers formalize this as memetic drift, analogous to neutral evolution in biology, where outcomes are determined by stochastic sampling rather than systematic selection. Through experiments with LLM populations in naming games, they demonstrated that groups can reach consensus even without any external rewards or ground truth, as shown in Figure 1, where mean coordination increases over time despite initial neutrality. This challenges the notion that agreement in multi-agent systems necessarily indicates collective intelligence or information aggregation.
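In symbols (the notation below is this summary's own shorthand rather than the paper's exact formulation), a single interaction can be written as a convex update of the listener's belief toward the sampled message:

$$
p_{\text{listener}} \;\leftarrow\; (1 - \alpha)\, p_{\text{listener}} + \alpha\, e_x, \qquad x \sim p_{\text{speaker}},
$$

where $e_x$ is the distribution concentrated entirely on the sampled label $x$ and $\alpha$ is the adaptation rate. Repeated interactions of this form let whichever label happens to be sampled slightly more often early on pull the whole population toward it, even though no label is favored in expectation.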
To investigate this phenomenon, the team developed the Quantized Simplex Gossip (QSG) model, a tractable framework that captures continuous internal beliefs, quantized communication, and in-context adaptation. In QSG, each agent maintains a probability distribution over possible labels, and interactions involve a speaker sampling a discrete message from their distribution and a listener updating toward that message with an adaptation rate α. The model includes three communication regimes: Hard (transmitting a single token), Top-m (transmitting an empirical distribution from m samples), and Soft (transmitting the full distribution). By varying parameters like population size N, bandwidth m, and adaptation rate α, the researchers could isolate the effects of sampling noise. The protocol, illustrated in Figure 4, uses random speaker-listener pairs and updates only the listener, mimicking real-world LLM interactions where agents learn from each other's outputs without direct reward signals.
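A minimal sketch of one such interaction, assuming a NumPy-style implementation (the function name qsg_step and the implementation details are illustrative, not the authors' code):

```python
import numpy as np

def qsg_step(beliefs, alpha, regime="hard", m=5, rng=None):
    """One speaker-listener interaction in a QSG-style gossip model (sketch).

    beliefs: (N, K) array; each row is an agent's distribution over K labels.
    regime:  'hard'  -> the listener sees a single sampled token,
             'top-m' -> the listener sees an empirical distribution of m samples,
             'soft'  -> the listener sees the speaker's full distribution.
    """
    rng = np.random.default_rng() if rng is None else rng
    N, K = beliefs.shape
    speaker, listener = rng.choice(N, size=2, replace=False)

    if regime == "hard":
        x = rng.choice(K, p=beliefs[speaker])
        message = np.eye(K)[x]                            # one-hot vertex of the simplex
    elif regime == "top-m":
        samples = rng.choice(K, size=m, p=beliefs[speaker])
        message = np.bincount(samples, minlength=K) / m   # empirical distribution of m samples
    else:  # 'soft'
        message = beliefs[speaker]                        # full distribution, no sampling noise

    # Only the listener adapts, moving a fraction alpha toward the received message.
    beliefs[listener] = (1 - alpha) * beliefs[listener] + alpha * message
    return beliefs
```

Starting all agents at the uniform distribution, the Soft regime never breaks symmetry in this sketch (every message equals the listener's own belief, so nothing changes), whereas repeated Hard steps inject sampling noise that eventually tips the population toward one arbitrary label, which is the mechanism the paper attributes symmetry breaking to.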
The findings, validated through both QSG simulations and LLM experiments, reveal clear scaling laws for drift-induced polarization. For instance, Theorem 1 shows that Hard sampling injects an extra variance term that drives symmetry breaking, with early drift in polarization U decreasing as 1/N², as confirmed in Figures 8b and 8e for GPT-4o and Claude Haiku 4.5. Similarly, Theorem 2 demonstrates that increasing the communication bandwidth m reduces drift as 1/m, a trend Figure 9 shows in GPT-4o experiments. The researchers also identified a crossover from a drift-dominated regime, where consensus is effectively a lottery, to a selection regime, where weak biases are amplified. Figure 3 illustrates this crossover, with fixation probability varying with population size and a derived parameter ΓT marking the transition. In practical terms, larger populations and higher-bandwidth communication suppress drift, while stronger adaptation speeds up the dynamics but can make biases less decisive.
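To make the population-size scaling concrete, here is a toy sweep in the Hard regime; the polarization proxy, time scaling, and parameter values below are assumptions made for illustration, so the sweep is meant to show the qualitative trend (early drift shrinking as N grows) rather than reproduce the paper's exact 1/N² law:

```python
import numpy as np

def early_polarization(N, K=2, alpha=0.3, interactions_per_agent=5, runs=300, seed=0):
    """Mean early-time polarization for Hard-regime gossip (illustrative sketch).

    Polarization is measured as the squared distance of the population-mean
    belief from the uniform point -- a stand-in proxy, not the paper's exact U.
    """
    rng = np.random.default_rng(seed)
    results = []
    for _ in range(runs):
        beliefs = np.full((N, K), 1.0 / K)          # neutral start: no agent prefers any label
        for _ in range(interactions_per_agent * N):
            speaker, listener = rng.choice(N, size=2, replace=False)
            x = rng.choice(K, p=beliefs[speaker])   # Hard regime: a single sampled token
            beliefs[listener] = (1 - alpha) * beliefs[listener] + alpha * np.eye(K)[x]
        mean_belief = beliefs.mean(axis=0)
        results.append(np.sum((mean_belief - 1.0 / K) ** 2))
    return float(np.mean(results))

# Larger populations should show weaker early drift; the paper reports a 1/N^2 law,
# though the exponent observed here depends on how time and polarization are defined.
for N in (8, 16, 32, 64):
    print(N, early_polarization(N))
```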
These findings have profound implications for the design and evaluation of multi-agent AI systems. If consensus can arise from random sampling rather than reasoned deliberation, it raises questions about the reliability of such systems in consequential applications. The study suggests that agreement alone is not evidence of collective intelligence, and without a baseline like memetic drift, it's hard to distinguish amplified noise from meaningful coordination. This is particularly relevant for alignment and safety, as harmful collective representations could form through interaction, even if individual agents are aligned. The framework also opens avenues for a physics-style analysis of social representation formation, moving beyond descriptive studies to predictive scaling laws. However, the researchers caution that QSG is a minimal model, akin to an ideal gas law, and real-world systems involve complexities like structured networks and heterogeneous agents that were not addressed here.
Despite its insights, the study has limitations. QSG assumes well-mixed pair selection and first-order adaptation, which may not fully capture the nuances of real LLM interactions. The model focuses on neutral settings without external biases, whereas practical applications often involve prior data or rewards that act as selection forces. Additionally, the empirical validation relied on synthetic naming games with limited label sets and fixed referents, which may not generalize to more complex tasks. The researchers note that their analysis is primarily expectation-based and does not prove finite-time consensus for adaptation rates α < 1, treating run-level fixation as an empirical phenomenon. Future work will need to extend the framework to include network structures, agent heterogeneity, and training-data priors to better understand what populations converge on and why, not just whether they coordinate.