
AI Research
November 14, 2025
3 min read
AI Agents Cut Cloud Costs Automatically

Managing cloud spending has become a critical challenge for modern enterprises, with 63% of organizations now prioritizing cost management, up from 31% the previous year. This surge reflects the complexity of handling diverse data formats, taxonomies, and metrics from multiple providers, which can delay decisions and lead to significant financial impacts. To address this, researchers have developed an autonomous AI system that simulates real-world cost optimization tasks, enabling businesses to reduce expenses without manual intervention.

The key finding is that AI agents can understand, plan, and execute cost-saving actions much as human practitioners do. In experiments, the system used large language models (LLMs) to achieve a 100% plan-completion rate and 100% data-consolidation accuracy, matching expert performance. For instance, when asked to review pending infrastructure recommendations without increasing the budget, the agent identified anomalies, analyzed commitments, and generated optimization strategies.
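The workflow described above — decompose the request, run each step with a tool, and carry results forward — can be sketched as a minimal plan-and-execute loop. The function names, the hard-coded plan, and the mock tool outputs below are illustrative assumptions, not the paper's implementation (a real system would prompt an LLM at each stage):

```python
# Hedged sketch of a plan -> execute loop for the pending-recommendations
# example. All tool outputs are mocked; an LLM would produce the plan.

def plan(task: str) -> list[str]:
    # Stand-in for an LLM decomposing the task into ordered steps.
    return ["detect_anomalies", "analyze_commitments", "generate_strategies"]

def execute(step: str, context: dict) -> dict:
    # Each "tool" reads the shared context and adds its findings to it.
    tools = {
        "detect_anomalies": lambda c: {"anomalies": ["spend spike on vm-1"]},
        "analyze_commitments": lambda c: {"commitments": {"reserved": 0.6}},
        "generate_strategies": lambda c: {
            "strategies": [f"rightsize {a.split()[-1]}" for a in c["anomalies"]]
        },
    }
    context.update(tools[step](context))
    return context

def run_agent(task: str) -> dict:
    context: dict = {"task": task}
    for step in plan(task):  # iterate until every planned step has run
        context = execute(step, context)
    return context

result = run_agent("review pending recommendations without budget increase")
print(result["strategies"])  # -> ['rightsize vm-1']
```

The shared `context` dict is the key design choice: later steps (strategy generation) can reason over earlier findings (anomalies, commitments) without re-querying the data sources.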

Methodologically, the researchers built a multi-agent framework leveraging GraphQL to unify data from disparate sources like Turbonomic and Apptio. This approach abstracts vendor-specific APIs into a single schema, allowing the AI to issue precise queries without over-fetching data. The system employs a reasoning loop where agents interpret natural language, decompose tasks into steps, retrieve relevant information, and synthesize insights. Specialized agents handle planning, retrieval, and analysis, collaborating to address different aspects of cost optimization, such as detecting underutilized resources or anomalous spending spikes.
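The single-schema idea can be illustrated with a small sketch: per-vendor adapters normalize their own payload shapes into one record format, and the agent only ever calls the unified resolver. The adapter classes, field names, and mock payloads here are assumptions for illustration; the paper's actual GraphQL schema for Turbonomic and Apptio is not reproduced here:

```python
# Minimal sketch of a unified cost-data layer. Each mock adapter hides a
# vendor-specific payload behind one common record shape, so the agent
# queries a single entry point instead of two APIs.

class TurbonomicAdapter:
    def fetch_underutilized(self, threshold: float) -> list[dict]:
        raw = [{"uuid": "vm-1", "cpu_util": 0.08, "cost_usd": 212.0}]
        return [{"id": r["uuid"], "provider": "turbonomic",
                 "monthlyCost": r["cost_usd"]}
                for r in raw if r["cpu_util"] < threshold]

class ApptioAdapter:
    def fetch_underutilized(self, threshold: float) -> list[dict]:
        raw = [{"resource_id": "db-7", "usage_pct": 12, "monthly": 540.0}]
        return [{"id": r["resource_id"], "provider": "apptio",
                 "monthlyCost": r["monthly"]}
                for r in raw if r["usage_pct"] / 100 < threshold]

def resolve_resources(threshold: float = 0.2) -> list[dict]:
    """Single resolver the agent queries; vendor differences stay hidden."""
    results: list[dict] = []
    for adapter in (TurbonomicAdapter(), ApptioAdapter()):
        results.extend(adapter.fetch_underutilized(threshold))
    return results

print(resolve_resources())
```

Because the resolver accepts a utilization filter, the agent can ask precisely for underutilized resources rather than pulling full inventories and filtering client-side, which is the over-fetching the GraphQL layer is meant to avoid.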

Results from testing five LLMs show that proprietary models like GPT-4o excelled, with 76% accuracy in plan recognition and 90% in retrieval tasks, completing workflows in just 6-7 iterations. In contrast, open-source models like Llama-3-405B struggled, achieving only 55% plan recognition and requiring up to 18 iterations, highlighting that model performance depends on training and design rather than size alone. The system's efficiency is evident in its ability to recognize tools immediately in some cases, reducing latency and improving decision speed.

Contextually, this innovation matters because cloud costs are a top concern for enterprises, with many spending over $50 million annually. By automating FinOps processes, the system helps organizations make continuous, adaptive changes based on real-time insights, potentially saving millions. It bridges gaps in existing tools that often require manual oversight, offering a scalable solution for industries like banking and finance, where data silos and compliance are critical.

Limitations include the system's reliance on predefined schemas and mock data in experiments, which may not capture all real-world complexities. The paper notes that unexpected events, like traffic spikes, can challenge predictive models, and further testing is needed to ensure reliability in volatile environments. Future work will expand the agent's knowledge and tools to cover more tasks, enhancing its adaptability.

About the Author

Guilherme A.

Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.
