AI Hides Secret Messages in Plain Text

A new artificial intelligence system can conceal secret messages within ordinary-looking text, tripling the amount of hidden information while cutting the time needed to embed it by more than half. This advance in linguistic steganography, the art of hiding messages in innocent-looking carriers, addresses a capacity limitation that has long constrained covert communication methods.

The researchers developed RTMStega, a method that integrates rank-based coding with context-aware decompression and normalized entropy. Unlike traditional approaches that either modify existing text or rely on large databases of pre-written messages, this system generates natural-looking text from scratch while embedding secret information. The key innovation lies in using language models to compress messages into ranked sequences, then reconstructing them as coherent text that appears completely ordinary.
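To make the idea of rank-based coding concrete, here is a minimal sketch rather than the researchers' implementation. It assumes a hypothetical `toy_ranking` function standing in for a real language model, an eight-word vocabulary, and a fixed 3-bit code per rank. With a real model, likely words would get small ranks, which is what would make the sequence compressible; the hash-based toy here only demonstrates that the mapping is reversible when both sides share the same model.

```python
import hashlib

# Hypothetical 8-word vocabulary, so every rank fits in 3 bits.
VOCAB = ["the", "cat", "sat", "on", "mat", "a", "dog", "ran"]
RANK_BITS = 3

def toy_ranking(context):
    """Stand-in for a language model (an assumption for illustration).

    Returns VOCAB in a fixed, context-dependent order. A real system
    would order tokens by model probability instead of by a hash.
    """
    def score(tok):
        data = (" ".join(context) + "|" + tok).encode()
        return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")
    return sorted(VOCAB, key=score)

def text_to_ranks(tokens):
    """Replace each token by its rank under the model, given what came before."""
    return [toy_ranking(tokens[:i]).index(tok) for i, tok in enumerate(tokens)]

def ranks_to_text(ranks):
    """Invert text_to_ranks by replaying the same model step by step."""
    tokens = []
    for r in ranks:
        tokens.append(toy_ranking(tokens)[r])
    return tokens

def ranks_to_bits(ranks):
    """Fixed-width binary encoding of the ranks (a simplification)."""
    return "".join(format(r, f"0{RANK_BITS}b") for r in ranks)

def bits_to_ranks(bits):
    """Split the bit string back into fixed-width ranks."""
    return [int(bits[i:i + RANK_BITS], 2) for i in range(0, len(bits), RANK_BITS)]

secret = ["the", "cat", "sat", "on", "a", "mat"]
bits = ranks_to_bits(text_to_ranks(secret))
recovered = ranks_to_text(bits_to_ranks(bits))
print(bits, recovered == secret)
```

The receiver can only undo the ranking because it runs the same model on the same context, which is the dependency the article describes as a limitation.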

RTMStega works by first converting secret messages into compressed binary sequences using a ranking system based on language model predictions. The system then embeds these sequences into generated text by switching between two modes according to the entropy of the model's predictions. When the predictions show sufficient variety (high entropy), the system embeds information by selecting specific words based on their ranking. When the next word is highly predictable (low entropy), it generates text normally to maintain natural flow. This dual approach ensures the final text reads naturally while containing hidden information.
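The entropy-gated embedding step can be sketched as follows. This is an illustration under stated assumptions, not the published algorithm: `toy_probs` is a hypothetical stand-in for a language model, normalized entropy is taken to be Shannon entropy divided by the log of the vocabulary size, low-entropy steps simply pick the most likely word, and the receiver is assumed to know the message length in bits.

```python
import hashlib
import math

# Hypothetical 16-word vocabulary; all names and values here are illustrative.
VOCAB = ["the", "a", "cat", "dog", "sat", "ran", "on", "under",
         "mat", "tree", "and", "then", "slept", "barked", "quietly", "today"]
THRESHOLD = 0.6   # normalized-entropy gate
BITS = 3          # bits hidden per high-entropy step (2**3 = 8 candidates)

def toy_probs(context):
    """Stand-in for a language model: a deterministic distribution over VOCAB."""
    seed = hashlib.sha256(" ".join(context).encode()).digest()
    weights = [1 + seed[i] for i in range(len(VOCAB))]
    if seed[-1] % 2 == 0:
        # Boost one word in about half of the contexts, which usually
        # makes the step predictable (low entropy).
        weights[seed[-2] % len(VOCAB)] *= 200
    total = sum(weights)
    return [w / total for w in weights]

def normalized_entropy(probs):
    """Shannon entropy divided by its maximum, giving a value in [0, 1]."""
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(len(probs))

def ranked_ids(probs):
    """Vocabulary indices from most to least likely (ties keep index order)."""
    return sorted(range(len(probs)), key=lambda i: -probs[i])

def embed(bits, prompt, max_tokens=500):
    """Generate tokens after `prompt` whose ranks carry `bits`."""
    tokens, pos = list(prompt), 0
    while pos < len(bits):
        if len(tokens) - len(prompt) >= max_tokens:
            raise RuntimeError("ran out of tokens before all bits were embedded")
        probs = toy_probs(tokens)
        order = ranked_ids(probs)
        if normalized_entropy(probs) >= THRESHOLD:
            # High entropy: the next BITS bits choose the token's rank.
            chunk = bits[pos:pos + BITS].ljust(BITS, "0")
            tokens.append(VOCAB[order[int(chunk, 2)]])
            pos += BITS
        else:
            # Low entropy: emit the most likely token, carrying no bits.
            tokens.append(VOCAB[order[0]])
    return tokens[len(prompt):]

def extract(stego, prompt, n_bits):
    """Replay the model and read ranks back at high-entropy steps only."""
    tokens, bits = list(prompt), ""
    for tok in stego:
        probs = toy_probs(tokens)
        if normalized_entropy(probs) >= THRESHOLD:
            rank = ranked_ids(probs).index(VOCAB.index(tok))
            bits += format(rank, f"0{BITS}b")
        tokens.append(tok)
    return bits[:n_bits]

message_bits = "101100111010"
stego = embed(message_bits, ["the"])
print(" ".join(stego))
print(extract(stego, ["the"], len(message_bits)) == message_bits)
```

Because the sender and receiver evaluate the same gate on the same context, they agree on which positions carry bits without exchanging any extra signal.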

Experimental results across multiple datasets—including AGNews business articles, IMDb movie reviews, and WikiQA questions—show RTMStega achieves three times the payload capacity of previous methods while reducing embedding time by over 50%. The system maintained text quality comparable to normal AI-generated content, with perplexity scores (a measure of text naturalness) remaining stable. Security testing against three advanced detection tools showed the hidden messages remained effectively undetectable, with detection accuracy hovering around 50%—no better than random guessing.
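For readers unfamiliar with the metrics, the sketch below shows how perplexity is conventionally computed from per-token probabilities, alongside a simple capacity ratio. The function names are hypothetical, and measuring payload in bits per token is an assumption about how capacity is reported, not a detail taken from the study.

```python
import math

def perplexity(token_probs):
    """Exponential of the average negative log-probability per token.

    Lower values mean the model found the text less surprising.
    """
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)

def bits_per_token(n_hidden_bits, n_tokens):
    """Hidden payload divided by the length of the cover text."""
    return n_hidden_bits / n_tokens

# If every token had probability 1/4, perplexity is close to 4: the model
# is as uncertain as if it were choosing among four equally likely words.
print(perplexity([0.25, 0.25, 0.25, 0.25]))
print(bits_per_token(12, 8))
```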

This advancement matters because traditional steganography methods face a fundamental trade-off: either they can hide only small amounts of information, or they produce text that appears unnatural and raises suspicion. Current modification-based approaches, like synonym substitution or spelling changes, often introduce detectable patterns. Retrieval-based methods that pull from large text databases avoid modification but require both parties to share the same database and still struggle with capacity limitations. RTMStega solves both problems by generating fresh, context-appropriate text that naturally contains the hidden message.

The practical implications are significant for secure communications where encryption alone might attract attention. Unlike encrypted messages that appear as scrambled text, steganographic messages blend seamlessly into everyday communications like emails, social media posts, or instant messages. This makes them ideal for situations where the mere existence of secret communication must remain hidden. The method's efficiency also makes it practical for real-world use, as it requires only a single round of communication between parties.

However, the approach does have limitations. Its security guarantees, while strong in practice, have not been mathematically proven, unlike those of some previous methods. The system also depends on both parties using identical language models and contexts, which could pose challenges in some deployment scenarios. Additionally, while the method resists current detection tools, future advances in steganalysis might require further refinements to maintain security.

The researchers used two open-source language models—Qwen2.5 and DeepSeek-R1-Distill-Llama-8B—in their experiments, demonstrating the method works across different AI systems. They optimized the balance between capacity and text quality by setting the entropy threshold at 0.6 and the bit parameter at 3, finding this combination produced text that appeared as natural as randomly sampled content while maximizing hidden information.
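The trade-off behind these two settings can be illustrated with a little arithmetic. The sketch assumes that the "bit parameter" is the number of bits embedded at each step whose normalized entropy clears the threshold, which the article does not spell out, and the entropy values are made up for illustration.

```python
def payload_bits(step_entropies, threshold=0.6, bits_per_step=3):
    """Total hidden bits if every step at or above the threshold carries bits_per_step."""
    return bits_per_step * sum(1 for h in step_entropies if h >= threshold)

# Made-up normalized entropies for six generation steps.
entropies = [0.9, 0.2, 0.7, 0.5, 0.65, 0.1]

for threshold in (0.4, 0.6, 0.8):
    print(threshold, payload_bits(entropies, threshold=threshold))
```

A lower threshold lets more positions carry bits but forces choices at steps where the model is fairly sure of the next word, which is where unnatural text is most likely to appear. A higher threshold protects quality at the cost of capacity.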

This work represents a substantial step forward in making covert communication both practical and secure. By leveraging the natural language generation capabilities of modern AI while addressing the capacity limitations that have constrained previous approaches, RTMStega opens new possibilities for privacy protection in digital communications.

About the Author

Guilherme A.