A new artificial intelligence system can answer complex religious questions with high accuracy while safely rejecting queries outside its knowledge domain. FARSIQA, developed by researchers at Sharif University of Technology and Iran University of Science and Technology, addresses a critical challenge in sensitive domains where factual accuracy and reliability are paramount.
The system achieves a 74.3% correctness rate on complex multi-hop questions about Islamic topics, substantially outperforming standard question-answering approaches. Most notably, it improves negative rejection (the ability to correctly refuse to answer questions outside its scope) by 40 points, reaching 97% accuracy compared to 57% for baseline systems. This capability is particularly crucial in religious contexts, where incorrect or unsubstantiated answers could spread misinformation.
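To make the negative rejection figure concrete, here is a minimal sketch of how such a rate could be computed. The function name and the two outcome lists are hypothetical: the lists are toy data constructed only to mirror the reported 57% and 97% rates, not the study's actual evaluation set, and the paper's exact scoring procedure may differ.

```python
from typing import List


def negative_rejection_rate(refused: List[bool]) -> float:
    """Share of out-of-scope questions that the system declined to answer."""
    if not refused:
        raise ValueError("need at least one out-of-scope question")
    return sum(refused) / len(refused)


# Hypothetical outcomes for 100 out-of-scope questions (True = refused).
# These are illustrative values chosen to match the reported percentages.
baseline_outcomes = [True] * 57 + [False] * 43
fairrag_outcomes = [True] * 97 + [False] * 3

baseline_rate = negative_rejection_rate(baseline_outcomes)
fairrag_rate = negative_rejection_rate(fairrag_outcomes)

# The gain is expressed in percentage points, not as a relative change.
gain_points = round((fairrag_rate - baseline_rate) * 100)
print(f"baseline={baseline_rate:.0%} improved={fairrag_rate:.0%} gain={gain_points} points")
```

Note that the 40-point gain is an absolute difference between the two rates; as a relative change over the 57% baseline it would be considerably larger.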
FARSIQA builds on an innovative architecture called FAIR-RAG (Faithful, Adaptive, Iterative Retrieval-Augmented Generation), which moves beyond conventional single-pass approaches. Unlike standard systems that retrieve information once and generate answers, FAIR-RAG employs a dynamic, self-correcting process. The system adaptively decomposes complex queries into simpler sub-questions, critically assesses whether retrieved evidence is sufficient, and enters targeted refinement loops when needed. This iterative approach progressively builds context until comprehensive information is gathered.
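The loop described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: every function name, the toy corpus, the sufficiency rule, and the refusal message are hypothetical stand-ins, and the real system presumably relies on language models and a trained retriever for each step.

```python
from typing import Callable, List

# Hypothetical refusal message; the real system's wording is not given in the article.
REFUSAL = "I cannot answer this from the available sources."


def iterative_rag(
    question: str,
    decompose: Callable[[str], List[str]],
    retrieve: Callable[[str], List[str]],
    is_sufficient: Callable[[str, List[str]], bool],
    refine: Callable[[str, List[str]], List[str]],
    generate: Callable[[str, List[str]], str],
    max_iterations: int = 3,
) -> str:
    """Decompose, retrieve, check sufficiency, and refine until evidence suffices."""
    evidence: List[str] = []
    queries = decompose(question)
    for _ in range(max_iterations):
        for query in queries:
            for chunk in retrieve(query):
                if chunk not in evidence:  # accumulate context across iterations
                    evidence.append(chunk)
        if is_sufficient(question, evidence):
            return generate(question, evidence)
        queries = refine(question, evidence)  # targeted follow-up queries
    # Assumption: refuse when evidence is still insufficient after the last pass.
    return REFUSAL


# Toy stand-ins so the sketch runs without any model or search index.
TOY_CORPUS = {
    "capital of france": ["Paris is the capital of France."],
    "population of paris": ["Paris has about two million residents."],
}


def toy_decompose(question: str) -> List[str]:
    return [part.strip().lower() for part in question.split(" and ")]


def toy_retrieve(query: str) -> List[str]:
    return TOY_CORPUS.get(query, [])


def toy_is_sufficient(question: str, evidence: List[str]) -> bool:
    # Require one piece of evidence per sub-question.
    return len(evidence) >= len(toy_decompose(question))


def toy_refine(question: str, evidence: List[str]) -> List[str]:
    return [question.lower()]


def toy_generate(question: str, evidence: List[str]) -> str:
    return " ".join(evidence)


def ask(question: str) -> str:
    return iterative_rag(
        question, toy_decompose, toy_retrieve,
        toy_is_sufficient, toy_refine, toy_generate,
    )


in_scope = ask("capital of france and population of paris")
out_of_scope = ask("weather on mars")
print(in_scope)
print(out_of_scope)
```

In this sketch the second question never accumulates enough evidence, so the loop runs out of iterations and returns the refusal message. That is one simple way to connect the sufficiency check to the negative rejection behavior, though the article does not say how the real system makes that decision.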
The system operates on a large knowledge base containing approximately 1.7 million text chunks drawn from 11 reputable online encyclopedias and question-answer platforms focused on Islamic topics. The researchers developed a domain-fine-tuned retriever that shows a 16% improvement in recall over baseline models, enabling more effective retrieval of specialized religious content.
Experimental results show robust performance across multiple dimensions. Beyond the correctness and rejection metrics, FARSIQA shows an 11.8% improvement in answer relevance and a 5.4% improvement in context relevance compared to standard approaches. The system also maintains strong performance when faced with noisy contexts containing distracting information, achieving a 16-point improvement in noise robustness.
The iterative refinement process proves crucial to the system's success. Analysis shows that moving from a single-pass approach to three iterations substantially improves quality, with 80.1% of questions receiving definitively better answers through the iterative process. However, the research also identifies diminishing returns beyond three iterations, which makes three iterations the configuration that best balances quality and efficiency.
For real-world applications, the system incorporates important safeguards. It includes explicit ethical boundaries, refusing to process questions that conflict with general ethical guidelines or promote harmful content. When queries touch upon sensitive or disputed theological issues, the system presents different viewpoints neutrally without endorsing any particular perspective. Crucially, the system explicitly avoids issuing religious rulings (fatwas), instead advising users to consult qualified religious authorities for definitive legal opinions.
The research acknowledges several limitations, including the system's stateless nature that prevents understanding conversational context across multiple interactions. The knowledge base, while extensive, does not encompass the entirety of Islamic scholarly texts, and the current corpus predominantly features texts from the Shi'a school of thought. The multi-step process also introduces latency, averaging 22.1 seconds per query in the optimal configuration.
This work represents a significant step toward creating AI systems that can handle high-stakes, nuanced domains where faithfulness and accuracy are critical. Beyond its specific application to Persian Islamic question-answering, the FAIR-RAG framework provides a blueprint for developing responsible AI systems in other sensitive domains like law and medicine, where factual reliability and proper scope handling are equally essential.
About the Author
Guilherme A.
Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.