AI Translates Without Distorting Facts

A new AI system helps computers answer questions across multiple languages without making up false information, a common problem in pipelines that rely on machine translation. That matters wherever multilingual search has to be accurate, from academic research to customer support, because those uses depend on reliable cross-lingual retrieval.

Researchers developed QTT-RAG, a multilingual retrieval-augmented generation system that evaluates and tags the quality of translated documents instead of altering them. Unlike previous methods that risk distorting content, this approach attaches quality scores for semantic equivalence, grammatical accuracy, and naturalness, allowing the AI to prioritize high-quality information.

The method works in three steps: it retrieves documents, translates those not written in the query language using a neural model such as NLLB-200, and then scores each translation on a 0.0 to 5.0 scale. The scores are attached as metadata, guiding the AI to rely more on accurate translations and to treat lower-quality ones with caution. The system was tested with large language models ranging from 2.4 billion to 14 billion parameters, including Exaone and Llama variants.
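The retrieve, translate, score, and tag flow described above can be sketched roughly as follows. This is a minimal illustration, not the researchers' implementation: the names `Passage`, `QualityScores`, `prepare_passages`, and `render_context` are hypothetical, the tag format is invented for the example, and the translator and scorer are passed in as plain functions because the article does not say how they are called.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class QualityScores:
    """Three quality dimensions named in the article, each on a 0.0-5.0 scale."""
    semantic_equivalence: float
    grammatical_accuracy: float
    naturalness: float

    def __post_init__(self) -> None:
        # Reject scores outside the 0.0-5.0 range the article describes.
        for name, value in vars(self).items():
            if not 0.0 <= value <= 5.0:
                raise ValueError(f"{name} must be in [0.0, 5.0], got {value}")


@dataclass
class Passage:
    text: str
    lang: str
    # None means the passage was already in the query language (no translation).
    scores: Optional[QualityScores] = None


def prepare_passages(
    passages: List[Passage],
    query_lang: str,
    translate: Callable[[str, str, str], str],      # hypothetical: (text, src, dst) -> text
    score: Callable[[str, str], QualityScores],     # hypothetical: (source, translation) -> scores
) -> List[Passage]:
    """Translate passages not in the query language and attach quality scores."""
    prepared = []
    for passage in passages:
        if passage.lang == query_lang:
            prepared.append(passage)  # left untouched
            continue
        translated = translate(passage.text, passage.lang, query_lang)
        prepared.append(
            Passage(translated, query_lang, score(passage.text, translated))
        )
    return prepared


def render_context(passages: List[Passage]) -> str:
    """Build prompt context in which each translated passage carries its scores."""
    blocks = []
    for i, passage in enumerate(passages, start=1):
        if passage.scores is None:
            tag = "[original]"
        else:
            s = passage.scores
            tag = (
                f"[translated | semantic={s.semantic_equivalence:.1f} "
                f"grammar={s.grammatical_accuracy:.1f} "
                f"naturalness={s.naturalness:.1f}]"
            )
        blocks.append(f"Doc {i} {tag}\n{passage.text}")
    return "\n\n".join(blocks)
```

The key design point the sketch tries to capture is that the translated text itself is never rewritten after scoring; only a tag is added, and the language model is left to weigh each passage by it.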

Results from two open-domain question-answering benchmarks, XOR-TyDi and MKQA, show QTT-RAG outperforming baselines, with character-level recall improvements of up to 6.8% for languages like Korean and Finnish. In Korean tests, for instance, it reached 43.8% recall with Aya-Expanse-8B, compared with 40.7% for a baseline, a gain of 3.1 percentage points. The system also reduces errors such as entity hallucinations: previous methods added unsupported details, such as a false death date, in 11.5% of the cases analyzed.
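The article does not define "character-level recall". One common way to compute such a score is character n-gram recall: the share of the reference answer's character n-grams that also appear in the model's output. The sketch below assumes that formulation with n = 3 and simple lowercasing; the function names and the normalization are illustrative choices, not details taken from the research.

```python
from collections import Counter


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count character n-grams after lowercasing and collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    if not normalized:
        return Counter()
    if len(normalized) < n:
        # Very short strings are treated as a single gram.
        return Counter([normalized])
    return Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))


def char_recall(reference: str, prediction: str, n: int = 3) -> float:
    """Fraction of the reference's character n-grams found in the prediction."""
    ref = char_ngrams(reference, n)
    if not ref:
        return 0.0
    pred = char_ngrams(prediction, n)
    overlap = sum(min(count, pred[gram]) for gram, count in ref.items())
    return overlap / sum(ref.values())
```

Under this definition, an answer that contains the reference string verbatim scores 1.0 even if it adds extra words, which is why a recall metric is usually read alongside checks for unsupported additions such as the hallucinated details mentioned above.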

This innovation matters because it enables more trustworthy AI interactions in low-resource languages, supporting applications in education, journalism, and global services where data integrity is vital. By avoiding content rewriting, it preserves original meanings, helping users in diverse linguistic contexts access reliable information without compromising accuracy.

Limitations include reduced effectiveness when most documents match the query language, as in Chinese tests where gains were smaller due to fewer cross-lingual opportunities. The system also depends on the AI's ability to interpret quality tags, which may vary with model capabilities. Future work aims to expand evaluations to more languages and explore hybrid strategies for broader applicability.

About the Author

Guilherme A.