AI Chat Privacy at Risk from Network Traffic Analysis

TL;DR

Researchers show encrypted AI chat traffic can reveal sensitive topics via side-channel attacks, putting users in restrictive regions at risk.

Large language models (LLMs) are increasingly used for sensitive tasks like healthcare and legal advice, where privacy is critical. A new study reveals that even encrypted communications can leak enough information to expose what users are discussing, threatening confidentiality in everyday AI interactions. This vulnerability affects major AI providers and could enable surveillance by internet service providers or governments.

Researchers discovered that analyzing the size and timing of data packets in encrypted AI chat streams allows an attacker to infer whether a conversation involves a sensitive topic, such as money laundering. The method, called Whisper Leak, achieves near-perfect classification accuracy for many popular LLMs. Precision stays high even when sensitive chats are rare: the attack can pick them out from among 10,000 unrelated conversations with minimal false positives. For instance, it identified money laundering discussions with up to 100% precision at low recall, recovering 5-20% of such conversations in tests.
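To see why high precision among 10,000 unrelated conversations is a demanding target, it helps to work through the arithmetic. The sketch below is illustrative only: the function name and the recall and false-positive-rate values are assumptions chosen for the example, not figures from the study.

```python
def precision_at_imbalance(recall: float,
                           false_positive_rate: float,
                           negatives_per_positive: int) -> float:
    """Expected precision when each sensitive chat is mixed with
    `negatives_per_positive` unrelated chats."""
    # Expected counts per single sensitive conversation.
    true_positives = recall
    false_positives = false_positive_rate * negatives_per_positive
    flagged = true_positives + false_positives
    return true_positives / flagged if flagged else 0.0


if __name__ == "__main__":
    # Hypothetical operating points: 10% recall at two false positive rates.
    print(precision_at_imbalance(0.10, 1e-4, 10_000))
    print(precision_at_imbalance(0.10, 0.0, 10_000))
```

With these assumed numbers, a false positive rate of just 0.01% already pulls precision down to roughly 9% at 10,000 unrelated chats per sensitive one. Near-100% precision therefore requires a classifier that almost never misfires on background traffic, which is why it is reported only at low recall.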

The team collected data from 28 commercial LLMs, including models from OpenAI, Google, and Microsoft, by simulating user queries and capturing the resulting network traffic. They trained machine learning classifiers (LightGBM, LSTM, and BERT-based models) on sequences of packet sizes and inter-arrival times to distinguish target topics from background noise. As shown in the paper's Figure 1, a passive adversary sniffs encrypted traffic without decrypting it and uses these classifiers to flag sensitive content. The approach exploits the fact that LLMs generate responses token by token in streaming mode, which leaks patterns through packet metadata.
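As a rough sketch of the kind of input such classifiers might consume, the example below turns a captured trace into a fixed-length vector of packet sizes and inter-arrival times. The Packet type, the function name, and the truncate-or-zero-pad step are assumptions for illustration; the study's actual preprocessing is not described here.

```python
from typing import List, Tuple

# Hypothetical record: (capture time in seconds, encrypted payload size in bytes).
Packet = Tuple[float, int]


def to_feature_sequence(packets: List[Packet], max_len: int = 8) -> List[float]:
    """Return max_len packet sizes followed by max_len inter-arrival gaps."""
    ordered = sorted(packets)  # sort by capture time
    sizes = [float(size) for _, size in ordered]
    # The first packet has no predecessor, so its gap is set to 0.
    gaps = [0.0] + [later[0] - earlier[0]
                    for earlier, later in zip(ordered, ordered[1:])]
    # Truncate long traces and zero-pad short ones to a fixed length.
    sizes = (sizes + [0.0] * max_len)[:max_len]
    gaps = (gaps + [0.0] * max_len)[:max_len]
    return sizes + gaps


if __name__ == "__main__":
    trace = [(0.00, 120), (0.05, 95), (0.30, 210)]
    print(to_feature_sequence(trace, max_len=4))
```

Note that only sizes and timestamps are used: nothing in the feature vector depends on decrypting the payload, which is what makes this a side channel.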

The results indicate that the attack is highly effective across providers: the median area under the precision-recall curve (AUPRC) exceeded 98% for 17 of the 28 models tested. For example, models such as mistral-large and microsoft-deepseek-r1 achieved AUPRCs above 99%, as detailed in Table 1 of the paper. Figure 4 shows that attack performance improves with more training data, suggesting adversaries could become more effective over time. In real-world terms, an eavesdropper could reliably pinpoint discussions of regulated activities without ever accessing the chat content itself.
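For readers unfamiliar with AUPRC, the sketch below computes average precision, one common estimator of the area under the precision-recall curve. Whether the study used this exact estimator is an assumption, and the labels and scores in the usage lines are made up for illustration.

```python
from typing import Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision: mean of the precision values measured at the rank
    of each true positive, with examples ranked by descending score.
    Tied scores are ordered arbitrarily in this simplified version."""
    total_positives = sum(labels)
    if total_positives == 0:
        return 0.0
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this recall step
    return precision_sum / total_positives


if __name__ == "__main__":
    # 1 = sensitive topic, 0 = background chat (hypothetical data).
    print(average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))
```

Unlike plain accuracy, this metric is driven by how the rare positive class is ranked, which makes it a natural choice when sensitive conversations are a tiny fraction of all traffic.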

This finding matters because it undermines the privacy assurances of encrypted services, posing risks for users in oppressive regimes or those discussing confidential matters. For instance, someone seeking advice on political dissent or health issues could be identified and targeted based on traffic patterns alone. The study highlights a systemic issue in AI deployment, where streaming and autoregressive generation create unavoidable side channels.

The paper acknowledges several limitations. The attack's precision under extreme class imbalance is based on extrapolation and may vary with the heterogeneity of real-world traffic. Additionally, while mitigations such as token batching, packet injection, and padding reduce the attack's effectiveness, none provides complete protection, leaving residual vulnerabilities as shown in Table 3. Future work could explore multi-turn conversations or improved adversarial methods, but the core risk remains because it stems from fundamental architectural choices in LLMs.
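Two of the mitigations named above, padding and token batching, can be sketched in a few lines. This is a simplified illustration under stated assumptions: the function names, the 256-byte bucket, and the batch size of 4 are hypothetical and do not reflect any provider's actual implementation.

```python
import math
from typing import Iterable, Iterator, List


def padded_size(payload_len: int, bucket: int = 256) -> int:
    """Round a payload length up to the next multiple of `bucket` bytes,
    so that many different true lengths map to the same observed size."""
    return max(1, math.ceil(payload_len / bucket)) * bucket


def batch_tokens(tokens: Iterable[str], batch_size: int = 4) -> Iterator[str]:
    """Group streamed tokens so several are sent per network write,
    hiding individual token lengths and per-token timing."""
    batch: List[str] = []
    for token in tokens:
        batch.append(token)
        if len(batch) == batch_size:
            yield "".join(batch)
            batch = []
    if batch:  # flush the final partial batch
        yield "".join(batch)


if __name__ == "__main__":
    print(padded_size(257))
    print(list(batch_tokens(["He", "llo", " wor", "ld", "!"], batch_size=2)))
```

Even in this toy version, the number of chunks sent still grows with the length of the response, which is one plausible reason such defenses reduce leakage without eliminating it.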

About the Author

Guilherme A.

Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.
