
AI Learns to Control Its Own Creativity

A new method lets language models regulate their own creativity on the fly, removing the need for constant manual tuning of decoding settings and making generation more adaptable in real time.

AI Research
November 14, 2025
3 min read

Large language models like ChatGPT share an often-overlooked limitation: despite their sophisticated training, they depend on manually tuned parameters to control how creative or predictable their responses are. Human experts must adjust settings like temperature and top-p for each task, a bottleneck that keeps text generation from being truly end-to-end. A new approach called AutoDeco addresses this by enabling models to adjust their own decoding parameters automatically, in real time.
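To see what is being automated, here is a minimal sketch of the manual decoding process the article describes: logits are scaled by a temperature, then nucleus (top-p) sampling keeps only the smallest set of tokens whose probability mass reaches top-p. The function name and toy logits are illustrative, not from the paper.

```python
import numpy as np

def sample_with_temp_topp(logits, temperature=0.7, top_p=0.9, rng=None):
    """Classic manual decoding: temperature scaling + nucleus (top-p) sampling."""
    rng = rng or np.random.default_rng(0)
    scaled = logits / temperature
    probs = np.exp(scaled - np.max(scaled))   # stable softmax
    probs /= probs.sum()
    order = np.argsort(probs)[::-1]           # most-probable tokens first
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, top_p) + 1  # size of the nucleus
    keep = order[:cutoff]
    nucleus = probs[keep] / probs[keep].sum() # renormalize inside the nucleus
    return rng.choice(keep, p=nucleus)

logits = np.array([4.0, 2.0, 1.0, 0.5])       # toy next-token scores
token = sample_with_temp_topp(logits, temperature=0.7, top_p=0.9)
```

A low temperature concentrates probability on the top token (approaching greedy search), while a high temperature and large top-p spread it out; picking those values per task is exactly the manual step AutoDeco removes.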

The key finding from researchers at Tencent Lab and the Chinese University of Hong Kong, Shenzhen, is that language models can learn to predict their own optimal decoding parameters at each step of text generation. AutoDeco adds lightweight components to existing transformer models that predict context-specific temperature and top-p values alongside the next token. This transforms the traditionally manual, static decoding process into a dynamic, self-regulating system.
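The "lightweight components" can be pictured as two small prediction heads reading the transformer's hidden state. The sketch below is a hypothetical illustration: the weight shapes, the sigmoid squashing, and the output ranges are assumptions for clarity, not the paper's exact design.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class AutoDecoHeads:
    """Hypothetical AutoDeco-style heads: two tiny linear layers that map
    the last hidden state to a per-step temperature and top-p value."""
    def __init__(self, hidden_dim, rng=None):
        rng = rng or np.random.default_rng(0)
        self.w_temp = rng.normal(scale=0.02, size=hidden_dim)
        self.w_topp = rng.normal(scale=0.02, size=hidden_dim)

    def __call__(self, hidden_state):
        # Squash raw scores into plausible decoding ranges (assumed bounds).
        temperature = 0.1 + 1.4 * sigmoid(hidden_state @ self.w_temp)  # in (0.1, 1.5)
        top_p = sigmoid(hidden_state @ self.w_topp)                    # in (0, 1)
        return temperature, top_p

heads = AutoDecoHeads(hidden_dim=16)
h = np.ones(16)                  # stand-in for a transformer hidden state
temp, top_p = heads(h)
```

Because the heads share the hidden state the model already computes for next-token prediction, the extra work per step is tiny, which is consistent with the low latency overhead the article reports.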

The methodology involves augmenting standard language models with simple neural network heads that learn to predict appropriate temperature and top-p values. During training, the researchers introduced a differentiable "soft" top-p mechanism that enables gradient flow through the entire system. This allows the model to learn optimal decoding strategies directly from the task objective, without requiring pre-computed labels for what constitutes good parameter settings. The approach adds minimal computational overhead—typically only 1-2% additional latency—making it practical for real-world applications.
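The training trick is making top-p differentiable: a hard nucleus cutoff has no gradient, so a smooth relaxation is needed. The sketch below down-weights tokens with a sigmoid gate on cumulative probability instead of cutting them off. The specific sigmoid form and the sharpness parameter `tau` are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def soft_top_p(probs, top_p, tau=0.05):
    """Illustrative 'soft' nucleus mask: tokens past the top_p cumulative
    mass are smoothly suppressed rather than hard-zeroed, so gradients
    can flow through top_p during training."""
    order = np.argsort(probs)[::-1]
    cum = np.cumsum(probs[order])
    cum_before = cum - probs[order]   # mass before each token; top token always kept
    gate = 1.0 / (1.0 + np.exp((cum_before - top_p) / tau))  # ~1 inside the nucleus
    soft = probs[order] * gate
    out = np.empty_like(probs)
    out[order] = soft / soft.sum()    # renormalize to a valid distribution
    return out

probs = np.array([0.6, 0.25, 0.1, 0.05])
renorm = soft_top_p(probs, top_p=0.7)  # boosts the nucleus, suppresses the tail
```

As `tau` shrinks, the gate approaches the hard top-p cutoff used at inference time, which is the usual pattern for such relaxations.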

Results across multiple benchmarks demonstrate AutoDeco's effectiveness. On mathematical reasoning tasks, AutoDeco consistently outperformed standard decoding methods. For example, with the Llama-Nemotron-8B model, AutoDeco achieved a Pass@1 accuracy of 46.05, representing a 3.5-point improvement over greedy search. More strikingly, the system matched or slightly surpassed performance achieved through exhaustive manual tuning by experts using test-set knowledge—an impractical scenario in real applications where test data is unknown.

The system also showed strong generalization capabilities. Despite being trained primarily on mathematical reasoning data, AutoDeco improved performance on diverse out-of-domain tasks including question answering, code generation, and instruction following. On the R1-Distill-Qwen-7B model, AutoDeco improved average performance across general tasks by 4.4 points compared to default sampling methods.

Perhaps the most significant implication is AutoDeco's emergent capability for instruction-based control. The researchers discovered that when prompted with commands like "generate with low randomness" or "ensure your answers are innovative and diverse," the model automatically adjusts its predicted parameters accordingly. After targeted training, this behavior became highly consistent: when prompted for low-diversity outputs, the model reduced its average temperature from 0.72 to 0.61 with 99% consistency. This represents a fundamental shift from passively generating text to actively interpreting and acting on user intent about desired output style.

Limitations noted in the paper include the preliminary nature of the instruction-based control capability. While the model learns to make directionally correct adjustments, the control isn't always precise. The researchers hypothesize that achieving fine-grained control may require jointly training the entire language model rather than just the added decoding heads. Additionally, comprehensive evaluation on very large-scale models wasn't performed due to computational constraints.

This research challenges the assumption that language models require broad, task-matched supervision for optimal performance. Instead, AutoDeco demonstrates that models can learn universal principles for balancing exploration and exploitation during generation, moving toward more efficient and intuitive human-AI interaction.

About the Author

Guilherme A.


Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.
