AI Now Decides When to Add Search Details Itself

TL;DR

A new method teaches AI to expand queries only when needed, improving accuracy and cutting per-query latency by about 42% compared with expanding every query.

In the world of AI-powered search, adding extra details to queries can improve results—but it often slows systems down and sometimes makes them less accurate. Researchers have developed a new approach that teaches AI when to add these details and when to keep queries simple, achieving better performance with significantly faster response times.

The key finding is that M-Solomon, a new AI system, can automatically determine which queries benefit from additional information and generate appropriate augmentations only when needed. This adaptive approach outperformed systems that always add details or never add details across multiple benchmark tests.

Methodology: The researchers divided datasets into two categories—those where queries benefit from augmentation and those where they don't—based on analysis of 20 different test sets. They used a powerful multimodal large language model called Qwen2.5-VL-72B-Instruct to generate answer-style augmentations for queries that need them. The system learns both when to generate augmentations and how to create meaningful representations of the augmented queries.
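To make the training setup concrete, here is a minimal sketch of how dataset-level labels could be turned into per-query training targets. It is an illustration under stated assumptions, not the researchers' code: the marker strings, the dataset names, and the `build_target` helper are all hypothetical, and the augmentation text stands in for what the larger model would generate.

```python
from dataclasses import dataclass

# Hypothetical marker strings; the article does not say how the decision is encoded.
AUGMENT = "<augment>"
NO_AUGMENT = "<embed>"


@dataclass
class Example:
    dataset: str  # name of the dataset the query comes from
    query: str


def build_target(example, helped_by_augmentation, augmentations):
    """Return the text the model would be trained to generate for one query.

    Queries from datasets labeled as benefiting from augmentation get the
    marker followed by an answer-style augmentation; all others get only
    the "no augmentation" marker.
    """
    if example.dataset in helped_by_augmentation:
        return f"{AUGMENT} {augmentations[example.query]}"
    return NO_AUGMENT


# Hypothetical dataset names and a stand-in for the generated augmentation.
helped = {"visual_qa"}
augs = {"What color is the bus?": "The bus is red."}

examples = [
    Example("visual_qa", "What color is the bus?"),
    Example("image_classification", "a photo of a dog"),
]
targets = [build_target(e, helped, augs) for e in examples]
```

In this sketch the label is attached to the whole dataset, so every query from the same dataset gets the same decision during training, while the trained model can still choose per query at inference time.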

Results analysis: M-Solomon achieved an overall Precision@1 score of 67.3 on the MMEB benchmark, surpassing the NoAug baseline (62.4) and the AlwaysAug baseline (64.7). More importantly, it reduced latency to 771 milliseconds per query compared to AlwaysAug's 1320 milliseconds, a reduction of about 42%, or roughly 1.7 times as fast. The system selected augmentation for 51.6% of queries, so it neither defaulted to always augmenting nor to never augmenting. On out-of-distribution tests, M-Solomon scored 67.0 compared to AlwaysAug's 66.0, suggesting better generalization.
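The comparisons can be checked with a few lines of arithmetic. The snippet below uses only the figures quoted above; the variable names are chosen here for illustration.

```python
# Precision@1 scores reported for MMEB (overall).
no_aug, always_aug, m_solomon = 62.4, 64.7, 67.3

# Reported per-query latency in milliseconds.
latency_always_ms, latency_adaptive_ms = 1320, 771

gain_over_no_aug = round(m_solomon - no_aug, 1)      # points gained vs. never augmenting
gain_over_always = round(m_solomon - always_aug, 1)  # points gained vs. always augmenting

# Relative latency reduction and speedup versus always augmenting.
latency_reduction_pct = round(100 * (1 - latency_adaptive_ms / latency_always_ms), 1)
speedup = round(latency_always_ms / latency_adaptive_ms, 2)

print(gain_over_no_aug, gain_over_always, latency_reduction_pct, speedup)
# prints: 4.9 2.6 41.6 1.71
```

So the adaptive system gains 4.9 points over never augmenting and 2.6 points over always augmenting, while taking about 42% less time per query than always augmenting.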

Context: This matters because AI systems that can efficiently retrieve relevant documents, images, or information are crucial for everything from web search to scientific research. The ability to add context only when needed means faster response times for users and more accurate results across different types of queries.

Limitations: The decision about which queries benefit from augmentation is currently labeled at the dataset level during training, rather than determined for each query individually. The paper notes that future work should explore methods to identify augmentation needs at the query level and extend the adaptive approach to reasoning-based query augmentation for more complex tasks.

About the Author

Guilherme A.

Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.
