AI Predicts Student Learning From Question Text

TL;DR

A new method uses language models to forecast student performance more accurately than older approaches, even for new students with little data.

Educational technology stands to gain significantly from more accurate prediction of student learning, enabling better personalized instruction. Researchers have developed a novel approach that uses the actual text of questions students answer, rather than just simplified labels, to predict their future performance with greater precision.

The key finding is that incorporating the full text of educational questions into AI models substantially improves their ability to predict whether students will answer future questions correctly. The method, called Next Token Knowledge Tracing (NTKT), reformulates the prediction task as a next-token prediction problem, similar to how large language models generate text.

The methodology converts student interaction histories into natural language sequences, treating each student's question-answer pattern as text that can be processed by language models. Researchers used selective masking during training, focusing the model's learning only on the critical "correct" or "incorrect" tokens while still providing full contextual information from question text and student history.
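To make the idea concrete, here is a minimal sketch of how an interaction history could be turned into a token sequence with a loss mask. It is an illustration only, not the researchers' implementation: the "Question: ... Answer: ..." template, the whitespace tokenizer, and the names serialize_with_mask and masked_nll are all assumptions made for this example. A real system would use a language model's own tokenizer and prompt format.

```python
import math
from typing import List, Tuple

# Hypothetical record type: (question text, whether the student answered correctly).
Interaction = Tuple[str, bool]


def serialize_with_mask(history: List[Interaction]) -> Tuple[List[str], List[int]]:
    """Flatten a history into tokens plus a 0/1 mask marking the outcome tokens.

    The template and whitespace tokenization are assumptions for illustration.
    """
    tokens: List[str] = []
    mask: List[int] = []
    for question, is_correct in history:
        context = f"Question: {question} Answer:".split()
        tokens.extend(context)
        mask.extend([0] * len(context))  # context is visible but not scored
        tokens.append("correct" if is_correct else "incorrect")
        mask.append(1)  # only the outcome token contributes to the loss
    return tokens, mask


def masked_nll(token_probs: List[float], mask: List[int]) -> float:
    """Average negative log-likelihood over masked positions only.

    token_probs[i] is the probability a model assigned to the true token i.
    """
    losses = [-math.log(p) for p, m in zip(token_probs, mask) if m]
    return sum(losses) / len(losses)


# Toy history, invented for this example.
history = [("What is 2 + 3?", True), ("Simplify 4/8.", False)]
tokens, mask = serialize_with_mask(history)
# Pretend the model gave probability 0.5 to every true token.
loss = masked_nll([0.5] * len(tokens), mask)
```

Building the mask while the sequence is assembled, rather than searching for the words afterwards, avoids accidentally scoring a question that happens to contain "correct" in its text.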

Results show NTKT outperformed all established knowledge tracing baselines across multiple metrics. On the Eedi mathematics dataset, which contains nearly 2 million student interactions, the method achieved a 90.20% F1 score, 86.70% accuracy, and 95.72% AUC, exceeding traditional approaches by significant margins. The system demonstrated particular strength in cold-start scenarios, maintaining high performance even for new students with limited interaction history.
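For readers less familiar with these three metrics, the sketch below shows how each is computed for a binary "will the student answer correctly?" prediction. The labels and scores are toy values invented for this example and have nothing to do with the reported results; the function names and the 0.5 decision threshold are likewise assumptions.

```python
from typing import List


def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1(y_true: List[int], y_pred: List[int]) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return 2 * tp / (2 * tp + fp + fn)


def auc(y_true: List[int], scores: List[float]) -> float:
    """Probability that a random positive is scored above a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: 1 = answered correctly, scores = predicted probability of "correct".
y_true = [1, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

print(accuracy(y_true, y_pred), f1(y_true, y_pred), auc(y_true, scores))
```

Accuracy and F1 depend on the chosen threshold, while AUC only looks at how the scores rank students' answers, which is one reason papers often report all three.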

The practical implications are substantial for educational platforms seeking to provide personalized learning experiences. By better predicting student struggles, systems can intervene earlier with targeted support. The approach's strong performance with new students means educational tools could become effective more quickly after deployment.

Limitations include the computational demands of large language models and the current restriction to mathematics content in English. The method assumes high-quality question text is available, which may not be the case in all educational contexts where content is proprietary or poorly structured.

About the Author

Guilherme A.

Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.
