AI Agents Team Up to Catch Fleeing Targets

New algorithm helps robotic pursuers coordinate more effectively by learning to form dynamic teams, cutting capture time by nearly 40% in simulations

AI Research
November 13, 2025

In robotics and artificial intelligence, getting multiple agents to work together efficiently remains a major challenge, whether for search and rescue, security patrols, or gaming. A new study introduces a method that helps robotic pursuers dynamically organize into teams to capture grouped evaders, cutting the time needed to capture them by nearly 40% in simulation. This approach could enhance real-world applications where coordinated movement is key, from autonomous drones to multi-robot systems.

The researchers developed an algorithm that combines a Self-Organizing Feature Map (SOFM), a type of artificial neural network, with a reinforcement learning technique and an Agent Group Role Membership Function (AGRMF). This system allows pursuers to form and adjust groups based on their proximity to evaders and their own capabilities, rather than operating independently. The key finding is that the combined method reduces the average time needed to capture all evaders by 39.1% compared with using AGRMF alone, which lacks this dynamic grouping (47.64 versus 78.27 iterations).
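To make the clustering idea concrete, here is a minimal sketch of a self-organizing map that groups pursuers by position. It is not the paper's implementation: the input features (plain grid coordinates), the one-dimensional map, the learning-rate and radius schedule, and the names `train_sofm`, `best_unit`, and `group_pursuers` are all assumptions made for illustration.

```python
import random


def best_unit(weights, p):
    """Index of the map unit whose weight vector is closest to point p."""
    return min(
        range(len(weights)),
        key=lambda j: (weights[j][0] - p[0]) ** 2 + (weights[j][1] - p[1]) ** 2,
    )


def train_sofm(points, n_units, epochs=50, lr0=0.5, radius0=1.0, seed=0):
    """Train a tiny 1-D self-organizing map on 2-D points (illustrative only)."""
    rng = random.Random(seed)
    # Assumption: start each unit at a randomly chosen input point.
    weights = [list(rng.choice(points)) for _ in range(n_units)]
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs  # shrinks linearly from 1 towards 0
        lr, radius = lr0 * decay, radius0 * decay
        for p in points:
            winner = best_unit(weights, p)
            for j, w in enumerate(weights):
                # Pull the winning unit, and its map neighbours while the
                # radius is still wide enough, towards the input point.
                if abs(j - winner) <= radius:
                    w[0] += lr * (p[0] - w[0])
                    w[1] += lr * (p[1] - w[1])
    return weights


def group_pursuers(points, weights):
    """Assign each pursuer index to the group of its closest map unit."""
    groups = {}
    for i, p in enumerate(points):
        groups.setdefault(best_unit(weights, p), []).append(i)
    return groups


# Hypothetical pursuer positions on a small grid.
pursuers = [(0, 0), (1, 0), (0, 1), (9, 9), (8, 9), (9, 8)]
weights = train_sofm(pursuers, n_units=2)
groups = group_pursuers(pursuers, weights)
print(groups)
```

In the study the grouping also depends on which evaders the pursuers are targeting and on the AGRMF role assignment, neither of which this sketch models.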

To test their method, the team used a grid-based simulation environment with 100 cells, 33 pursuers, and 9 evaders. Each agent moved at a speed of one grid cell per iteration, with actions including moving up, down, left, or right. The SOFM was applied to cluster pursuers into groups targeting similar evaders, while the AGRMF assigned roles based on factors like distance and agent confidence. Reinforcement learning then optimized each pursuer's path by calculating immediate rewards, encouraging them to surround evaders rather than focus on individual targets. This setup mimics real-world scenarios where agents must adapt to changing conditions.
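The movement rules described above can be sketched in a few lines. This is an illustration under stated assumptions: the 100 cells are assumed to form a 10 x 10 grid, agents are assumed to stay in place when a move would leave the grid, and `greedy_action` is a simple distance-minimizing stand-in, not the reinforcement learning policy used in the study.

```python
GRID_SIZE = 10  # assumption: 100 cells arranged as a 10 x 10 grid

# The four actions named in the article, as (dx, dy) offsets.
MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


def step(pos, action, size=GRID_SIZE):
    """Move one cell in the given direction, clamped to the grid edges."""
    dx, dy = MOVES[action]
    x = min(max(pos[0] + dx, 0), size - 1)
    y = min(max(pos[1] + dy, 0), size - 1)
    return (x, y)


def manhattan(a, b):
    """Grid distance between two cells when only four-way moves are allowed."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def greedy_action(pursuer, evader):
    """Pick the move that brings the pursuer closest to the evader."""
    return min(MOVES, key=lambda a: manhattan(step(pursuer, a), evader))


print(step((3, 4), "right"))
print(greedy_action((0, 0), (5, 0)))
```

A learned policy would replace `greedy_action` with a choice driven by rewards, for example favoring positions that help a group surround an evader over positions that merely shorten one pursuer's distance.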

The results, detailed in the paper's figures, show that the SOFM+AGRMF method needed an average of 47.64 iterations to capture all evaders, compared with 78.27 iterations for AGRMF alone and 137.71 for basic AGR without grouping. The degree of reorganization, a measure of how often pursuers adjust their targets, nearly doubled to 43.54, indicating greater flexibility. As a clustering step, the SOFM also outperformed standard techniques such as KMEANS (61.6 iterations) and DBSCAN (69.98 iterations), a 22.66% improvement over KMEANS. These figures highlight the algorithm's efficiency in both speed and adaptability.
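The percentage improvements follow directly from the iteration counts. The short check below recomputes them from the figures quoted in this article; the helper name `reduction_pct` is ours, not the paper's.

```python
def reduction_pct(baseline, new):
    """Percentage reduction in iterations relative to a baseline."""
    return (baseline - new) / baseline * 100


SOFM_AGRMF = 47.64  # average iterations reported for the combined method

# Average iterations reported for each comparison method.
baselines = {
    "AGRMF alone": 78.27,
    "AGR": 137.71,
    "KMEANS": 61.6,
    "DBSCAN": 69.98,
}

for name, iterations in baselines.items():
    pct = reduction_pct(iterations, SOFM_AGRMF)
    print(f"vs {name}: {pct:.2f}% fewer iterations")
```

The 39.1% headline figure corresponds to the comparison with AGRMF alone, and the 22.66% figure to the comparison with KMEANS. Against basic AGR without grouping, the same calculation gives a reduction of roughly 65%.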

This advancement matters because it addresses a common limitation in multi-agent systems: pursuers often act too independently, slowing down tasks like capturing evaders in crowded or dynamic environments. By enabling real-time team formation and reorganization, the method could improve applications in areas such as disaster response, where robots must quickly coordinate to locate survivors, or in security, where autonomous systems patrol large areas. The use of reward-based learning also makes the approach scalable to various scenarios without requiring extensive pre-programming.

However, the study notes limitations. The algorithm performs well with randomly placed evaders but is less effective when agents are too isolated, such as in corners of the grid, where capture times increase. Future work could tackle this issue and explore incorporating limited-range sensors or variable agent speeds to better mimic real-world constraints, potentially broadening the method's applicability.

Original source: the complete research paper is available on arXiv.

About the Author

Guilherme A.

Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.
