A new study demonstrates how artificial intelligence can be used to create stealthy cyberattack systems that operate undetected by hiding within the normal traffic patterns of legitimate AI services. Researchers from MIT and Vectra AI have developed an autonomous red teaming framework that repurposes the Model Context Protocol (MCP)—a standard for AI model communication—as a covert command-and-control channel. This approach allows AI agents to conduct penetration testing operations while appearing as benign AI application traffic, fundamentally changing how network defenders must approach threat detection.
The researchers found that their MCP-enabled system achieved domain dominance on a simulated corporate network in under 30 minutes with no human intervention, compared to days required for traditional manual operations. The system uses a decoupled, two-leg communication architecture where agents first receive high-level tasks through MCP traffic that resembles normal AI service communications, then communicate directly with public LLM APIs like Anthropic's Claude for planning and payload generation. This design eliminates the periodic beaconing patterns that traditional command-and-control systems rely on, which are easily detectable by network monitoring tools. In experimental comparisons, the AI-driven approach required only a single high-level directive versus over 200 individual commands needed for manual operations, dramatically reducing human effort while increasing operational speed.
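The decoupled, two-leg pattern described above can be sketched in a few lines. This is a hypothetical illustration, not the paper's code: `fetch_task`, `call_llm_api`, and `run` are stand-in names, and real MCP traffic would use the protocol's JSON-RPC framing rather than a plain dictionary.

```python
def fetch_task(mcp_server):
    """Leg 1: ask the MCP server for a high-level directive.
    This looks like ordinary AI-service traffic, not a C2 beacon."""
    return mcp_server.get("task")

def call_llm_api(task, context):
    """Leg 2: the agent talks directly to a public LLM API to expand the
    directive into concrete steps (planning / payload generation)."""
    return [f"step for: {task}"]  # placeholder for a real API call

def run(cmd):
    return f"ran {cmd}"  # stand-in for local execution on the target

def agent_loop(mcp_server):
    task = fetch_task(mcp_server)
    if task is None:
        return None  # nothing to do: no periodic check-in is ever sent
    commands = call_llm_api(task, context={})
    results = [run(cmd) for cmd in commands]
    mcp_server["result"] = results  # asynchronous, event-driven report
    return results

# A single high-level directive stands in for hundreds of manual commands.
server = {"task": "enumerate domain admins"}
print(agent_loop(server))
```

Because the agent only contacts the server when it has a task to fetch or a result to report, there is no fixed-interval heartbeat for network monitoring to latch onto.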
The methodology involved developing a multi-agent system with three main components: reconnaissance agents deployed on target machines, an MCP coordination server in the cloud, and a red team command agent built on the RedTeamLLM framework with over 18,500 lines of new code. The reconnaissance agents use a 'living off the land' approach, employing existing system tools like PowerShell and WMI on Windows or native commands on Linux to avoid detection by endpoint protection systems. These agents communicate asynchronously with the MCP server only when requesting tasks or reporting findings, creating event-driven patterns that mimic legitimate developer interactions with AI services. The system was tested in the Vectra AI Cyber-Range Lab, a controlled environment simulating enterprise networks with full defensive stacks, including network detection and response systems, SIEM platforms, and endpoint detection and response tools from CrowdStrike and SentinelOne.
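The 'living off the land' idea can be illustrated with a small sketch. The command table below is my own illustrative example of native-tool reconnaissance, not a list taken from the paper: the point is that the agent selects tools already present on the OS, so no new binaries land on disk for endpoint protection to flag.

```python
import platform

# Illustrative mapping of OS -> built-in reconnaissance tools (assumption,
# not the paper's actual command set).
LOTL_RECON = {
    "Windows": [
        "powershell -c Get-ComputerInfo",   # PowerShell, present by default
        "wmic computersystem get domain",   # WMI, present by default
    ],
    "Linux": [
        "cat /etc/resolv.conf",             # native commands only
        "ip neigh show",
    ],
}

def recon_commands(os_name=None):
    """Return native-tool recon commands for the given (or current) OS."""
    os_name = os_name or platform.system()
    return LOTL_RECON.get(os_name, [])

print(recon_commands("Linux"))
```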
Results from the case study showed the system's remarkable stealth capabilities. Network traffic analysis revealed that while traditional C2 systems like Cobalt Strike produce regular, predictable beaconing patterns (visible as rhythmic spikes in Figure 3), the MCP-enabled agents generated irregular, bursty communication that blended with legitimate AI service traffic. The agents achieved zero detections from Microsoft Defender for Endpoint during an autonomous EDR evasion test, in which they correctly identified security protections and attempted various attack techniques. The AI-generated assessment report (Figure 8) documented how the system recognized active tamper protection and AMSI blocking while recommending more sophisticated bypass techniques for future attempts. When using streaming communication with LLM APIs, the traffic patterns became behaviorally indistinguishable from legitimate enterprise AI tools like developer co-pilots, with erratic spikes in both directions that defeat traditional traffic analysis.
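A toy simulation (my own illustration, not data from the paper) shows why event-driven traffic defeats beacon detection: classic C2 check-ins have nearly constant inter-arrival times, while event-driven traffic modeled as a Poisson process has highly variable gaps.

```python
import random
import statistics

def beacon_times(n, interval=60.0, jitter=0.05):
    """Classic C2: fixed interval with small jitter -> low gap variance."""
    t, out = 0.0, []
    for _ in range(n):
        t += interval * (1 + random.uniform(-jitter, jitter))
        out.append(t)
    return out

def bursty_times(n, rate=1 / 60):
    """Event-driven: exponential gaps (Poisson-like) -> high gap variance."""
    t, out = 0.0, []
    for _ in range(n):
        t += random.expovariate(rate)
        out.append(t)
    return out

def gap_stdev(times):
    """Standard deviation of inter-arrival gaps, the signal a beacon
    detector keys on."""
    gaps = [b - a for a, b in zip(times, times[1:])]
    return statistics.stdev(gaps)

random.seed(0)
print(gap_stdev(beacon_times(200)))   # small: flags as a beacon
print(gap_stdev(bursty_times(200)))   # large: blends with normal bursts
```

Even with jitter added, a fixed-interval beacon's gap variance stays tightly bounded, which is exactly the regularity the paper's approach avoids.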
The implications of this research are profound for both offensive and defensive cybersecurity. For legitimate security teams, the technology enables faster, more thorough penetration testing that can better simulate advanced persistent threats. The system's ability to coordinate multiple agents simultaneously—sharing intelligence through the MCP server's shared context database—allows for swarm-like operations where agents collaborate on lateral movement and exploit chaining. However, the dual-use nature of this technology presents significant risks, as the same capabilities could be weaponized by malicious actors to conduct sophisticated attacks with lower expertise requirements. The researchers emphasize that their experiments were conducted in controlled lab environments with explicit authorization, but acknowledge the potential for misuse if such systems fall into the wrong hands.
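The shared-context idea behind the swarm behavior can be sketched as a thread-safe key-value store. The class and field names here are illustrative assumptions, not the paper's API: the point is that a discovery by one agent is immediately queryable by the others.

```python
import threading

class SharedContext:
    """Hypothetical sketch of the MCP server's shared context database:
    agents publish findings and query what others have learned."""

    def __init__(self):
        self._lock = threading.Lock()
        self._facts = {}

    def publish(self, key, value):
        with self._lock:
            self._facts[key] = value

    def query(self, prefix):
        with self._lock:
            return {k: v for k, v in self._facts.items()
                    if k.startswith(prefix)}

ctx = SharedContext()
ctx.publish("creds:svc_sql", "hash:aabbcc")    # found by agent A
ctx.publish("host:10.0.0.5", "SMB reachable")  # found by agent B
print(ctx.query("creds:"))  # agent C reuses A's find for lateral movement
```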
Limitations identified in the paper include the current reliance on cloud-based LLM APIs, which creates network traffic that could potentially be monitored despite its stealth characteristics. The researchers note that future work could embed smaller, specialized language models directly within agents to eliminate this external dependency entirely. Additionally, while the system demonstrated impressive autonomy, it still occasionally experienced hallucinations where the AI generated plausible but incorrect information, requiring robust error handling mechanisms. The framework also currently focuses primarily on reconnaissance and command-and-control phases, with fully autonomous exploitation and post-exploitation capabilities identified as areas for future development. Ethical considerations remain paramount, as the abstraction of complexity through natural language interfaces could potentially lower the barrier for malicious use, necessitating careful controls and responsible deployment practices.
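One way to contain hallucinated output, as a minimal sketch of the kind of error handling the paper calls for (the allowlist and retry scheme are my assumptions, not the paper's mechanism), is to validate each LLM-proposed command before execution and retry the planning call on failure:

```python
# Commands whose first token is not on this (illustrative) allowlist are
# treated as hallucinated or unsafe and rejected.
ALLOWED = ("powershell", "wmic", "cat", "ip")

def validate(cmd):
    return cmd.split(" ", 1)[0] in ALLOWED

def plan_with_retries(propose, retries=3):
    """Call propose() until it yields only valid commands, or give up."""
    for _ in range(retries):
        cmds = propose()
        if all(validate(c) for c in cmds):
            return cmds
    raise RuntimeError("no valid plan after retries")

# Example: the first (hallucinated) plan is rejected, the second accepted.
attempts = iter([["rm -rf /"], ["cat /etc/hosts"]])
print(plan_with_retries(lambda: next(attempts)))
```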
About the Author
Guilherme A.
Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.