Voice Assistant Teaches Coding in Any Language

TL;DR

A new AI voice assistant lets beginners ask programming questions in their native language. In a study with 28 beginner programmers, it reached 75% response accuracy and high user satisfaction.

A new AI-powered voice assistant is transforming how students learn to code by allowing them to ask questions in their native languages, rather than relying on English proficiency. This innovation addresses a critical barrier in programming education, particularly in multilingual regions like India, where many students transition from regional language schooling to English-medium computer science curricula. By integrating speech recognition and code-aware processing, the system, called CodeVaani, offers on-demand support that scales to many learners, making coding more inclusive and accessible for beginners who struggle with text-based interfaces.

The researchers found that CodeVaani achieved a 75% response accuracy in a study with 28 beginner programmers, based on a strict criterion where only exact and fully aligned answers were counted as correct. Over 80% of participants rated their experience as satisfactory or above, with 10.7% calling it excellent and 32.1% good. In a survey, 14 out of 26 participants said they would definitely use the system in their programming courses, and 11 said they probably would, indicating strong interest in adopting this voice-based, multilingual tool for educational support. This performance demonstrates the system's potential as a reliable AI-enabled assistant that can handle diverse linguistic needs in real-world learning environments.

To build CodeVaani, the team developed a three-stage architecture that processes student queries through speech recognition, code-aware transcription refinement, and query response generation. They used Automatic Speech Recognition (ASR) models, including Whisper for English and Indic-Conformer from AI4Bharat for Indic languages, to transcribe spoken queries that often mix native language utterances with English code terms. A code-aware transcription refinement stage employed the instruction-tuned gemma-27B model to correct errors, such as misrecognized variable names or symbolic operators, by leveraging programming semantics. Finally, the refined transcription was passed to the Codestral-22B code model to generate relevant responses, delivered in both text and audio through a ReactJS front end and Django backend integrated into the Bodhitree Learning Management System.
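The three stages described above can be sketched in a few lines of Python. This is only an illustration of the data flow, not the researchers' implementation: the function names, the language codes, and the stand-in stages are all hypothetical, and no real model is called. In a real deployment the three callables would wrap the ASR model, the refinement model, and the code model, respectively.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PipelineResult:
    raw_transcript: str      # output of stage 1 (speech recognition)
    refined_transcript: str  # output of stage 2 (code-aware refinement)
    answer: str              # output of stage 3 (response generation)


def pick_asr(language: str) -> str:
    """Route English to Whisper and other languages to Indic-Conformer.

    The language codes ("en", "hi", ...) are an assumption of this sketch.
    """
    return "whisper" if language == "en" else "indic-conformer"


def run_pipeline(
    audio: bytes,
    language: str,
    transcribe: Callable[[bytes, str], str],
    refine: Callable[[str], str],
    respond: Callable[[str], str],
) -> PipelineResult:
    """Chain the three stages; each stage is passed in as a plain callable."""
    asr_name = pick_asr(language)
    raw = transcribe(audio, asr_name)   # stage 1: speech to text
    refined = refine(raw)               # stage 2: repair code terms
    answer = respond(refined)           # stage 3: generate the reply
    return PipelineResult(raw, refined, answer)


# Hypothetical stand-ins so the sketch runs without any model.
def fake_transcribe(audio: bytes, asr_name: str) -> str:
    # A code-mixed Hindi/English query with a spelled-out operator.
    return "for loop mein i plus plus kya karta hai"


def fake_refine(text: str) -> str:
    # Stage 2 would use programming semantics; here it is one fixed rule.
    return text.replace("i plus plus", "i++")


def fake_respond(query: str) -> str:
    return f"[answer to: {query}]"


result = run_pipeline(b"", "hi", fake_transcribe, fake_refine, fake_respond)
print(result.refined_transcript)
```

Passing the stages in as callables keeps the flow testable without model weights, and it makes the refinement step easy to see in isolation: the spoken "i plus plus" becomes the symbolic `i++` before the query reaches the code model.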

The evaluation data show that CodeVaani provided correct responses in 75% of cases, with 72 exact responses out of 96 queries, as documented in participant journals. The system handled queries in multiple Indian languages, including Hindi, Marathi, Gujarati, Tamil, Telugu, Bengali, Malayalam, Kannada, and Odia, reflecting its broad applicability. Compared to traditional classroom assistance, the framework offers advantages like availability beyond class hours, scalability to support many students simultaneously, and multilingual support that bridges gaps for non-English speakers. These results highlight how voice-based interfaces can lower entry barriers, as evidenced by the positive usability survey, in which participants reported high satisfaction and a willingness to integrate the tool into their courses.
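The 75% figure follows directly from the reported counts. The short Python sketch below reproduces the arithmetic under the strict criterion, where only exact answers count as correct. The label names are hypothetical, and the article does not say how the 24 remaining responses were categorized, so they are lumped together here.

```python
def strict_accuracy(labels):
    """Share of responses labeled exact; anything else counts as incorrect."""
    exact = sum(1 for label in labels if label == "exact")
    return exact / len(labels)


# 72 exact responses out of 96 queries, as reported in the article.
# "not_exact" is a placeholder label for the other 24 responses.
labels = ["exact"] * 72 + ["not_exact"] * 24

accuracy = strict_accuracy(labels)
print(f"{accuracy:.0%}")  # prints 75%
```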

This work matters because it addresses a significant equity issue in technology education, where English proficiency and typing skills often exclude learners from multilingual backgrounds. By enabling voice interaction in native languages, CodeVaani makes programming concepts more accessible, potentially increasing participation and success in computer science fields globally. The system's on-demand availability and scalability complement traditional teaching methods, offering a practical solution for regions with limited resources or large class sizes. Future work could include expanded support for more languages and enhanced features, helping to democratize coding education and foster a more diverse tech workforce.

Despite its successes, the study acknowledges limitations, such as the need for further improvements in ASR accuracy for code-mixed speech and the system's current inability to handle multi-turn conversations for interactive dialogue. The researchers note that future work will focus on fine-tuning ASR and transcription models on larger datasets, merging error correction and response generation into a unified pipeline to reduce latency, and adding features like typed input in multiple languages. These steps aim to enhance real-world performance and user experience, ensuring that CodeVaani can evolve to meet the growing demands of inclusive educational technology.

About the Author

Guilherme A.

Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.
