GitHub Artifacts Help AI Understand Why Code Exists

TL;DR

A new system from IBM Research feeds AI models GitHub discussions and commit history so they can explain why code exists, not just what it does, and uses a validator to filter out claims the context does not support.

Understanding complex code is a major challenge for developers maintaining, modernizing, or learning existing systems. While artificial intelligence tools can describe what code does, they often miss the crucial context of why it was written in the first place. A new approach developed by IBM Research addresses this gap by drawing on the natural language artifacts found in GitHub repositories (pull request discussions, commit messages, and issue descriptions) to give AI systems deeper insight into code purpose and history.

The key finding is that combining GitHub's contextual information with large language models (LLMs) produces more meaningful code explanations that help developers understand not just functionality, but the rationale behind implementation decisions. The system successfully generates insights about why certain code exists, what bugs it addresses, and how it fits into the broader application architecture—information typically unavailable from examining code alone.

The methodology involves three integrated components working sequentially. First, a Context Builder extracts and organizes relevant GitHub artifacts for a given code snippet using Git history tracing and GitHub's GraphQL API. It filters out trivial commits (like comment changes or simple renames) and structures the remaining pull requests, issues, and commit messages hierarchically to preserve relationships. Second, a Summarizer LLM uses this organized context to generate high-level explanations of the code's purpose within the application. Third, a novel LLM-as-a-Judge (LaaJ) validator assesses explanation quality, checking for well-formedness and filtering out hallucinated claims not supported by the provided context.
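The three stages above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the researchers' implementation: every name here (Commit, PullRequest, is_trivial, build_context, render_context, explain) is hypothetical, the triviality check covers only blank and comment-only changes (not renames), the summarizer and judge are plain callables standing in for LLM calls, and no Git or GitHub API is actually contacted.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Commit:
    sha: str
    message: str
    diff_lines: List[str]            # changed lines only, without +/- markers
    pr_number: Optional[int] = None  # None when no pull request is linked


@dataclass
class PullRequest:
    number: int
    title: str
    linked_issues: List[str] = field(default_factory=list)
    commits: List[Commit] = field(default_factory=list)


# Assumed heuristic: a line starting with a common comment marker is a comment.
COMMENT_RE = re.compile(r"^\s*(#|//|/\*|\*)")


def is_trivial(commit: Commit) -> bool:
    """Trivial if every changed line is blank or a comment (or nothing changed)."""
    return all(not ln.strip() or COMMENT_RE.match(ln) for ln in commit.diff_lines)


def build_context(commits: List[Commit], prs: List[PullRequest]) -> Dict[str, list]:
    """Stage 1: drop trivial commits and nest the rest under their pull request."""
    by_number = {
        pr.number: PullRequest(pr.number, pr.title, list(pr.linked_issues))
        for pr in prs
    }
    unlinked: List[Commit] = []
    for commit in commits:
        if is_trivial(commit):
            continue
        pr = by_number.get(commit.pr_number)
        (pr.commits if pr else unlinked).append(commit)
    return {
        "pull_requests": [pr for pr in by_number.values() if pr.commits],
        "unlinked_commits": unlinked,
    }


def render_context(context: Dict[str, list]) -> str:
    """Serialize the hierarchy as indented text suitable for a prompt."""
    lines: List[str] = []
    for pr in context["pull_requests"]:
        lines.append(f"PR #{pr.number}: {pr.title}")
        lines.extend(f"  issue {issue}" for issue in pr.linked_issues)
        lines.extend(f"  commit {c.sha}: {c.message}" for c in pr.commits)
    lines.extend(f"commit {c.sha}: {c.message}" for c in context["unlinked_commits"])
    return "\n".join(lines)


def explain(
    snippet: str,
    context: Dict[str, list],
    summarizer: Callable[[str, str], str],
    judge: Callable[[str, str], bool],
) -> str:
    """Stages 2 and 3: draft an explanation, then keep only judged-supported claims."""
    context_text = render_context(context)
    draft = summarizer(snippet, context_text)
    # Naive sentence split on "."; a real system would need something sturdier.
    claims = [s.strip() for s in draft.split(".") if s.strip()]
    kept = [c for c in claims if judge(c, context_text)]
    return ". ".join(kept) + ("." if kept else "")
```

Passing the summarizer and judge in as callables keeps the sketch runnable without any model: in practice each would wrap a prompt to an LLM, while the context-building step stays deterministic and easy to test on its own.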

Results from a user experience evaluation involving six repositories (both open-source projects and proprietary codebases) suggest the system is useful in practice. Developers consistently found that the context-enhanced explanations provided valuable insights unavailable from code-only analysis. In one example involving security group rules, the system correctly identified that the code was added to lint syntactically redundant rules, a rationale that would not be apparent from examining the function alone. Runtime remained practical, with most executions completing in under 20 seconds, even for code with extensive history.

The system's real-world significance lies in addressing critical software engineering challenges. For developers working with unfamiliar legacy code, it acts as a virtual mentor, providing contextualized explanations similar to those from experienced colleagues. During team onboarding, it helps new members quickly grasp complex systems without manually sifting through years of development history. The approach also helps prevent regression errors by preserving understanding of historical architectural motivations, especially important when implementations become non-intuitive after accumulated patches and evolving requirements.

Limitations noted in the research include the inherent subjectivity of evaluating code explanations, since human-authored references may incorporate information not present in the associated artifacts. The system may also occasionally overemphasize aspects of pull requests that are not central to the code's core purpose. Finally, while the approach generally scales, runtime can grow for code with a long history: in one case involving 20 years of history with 56 commits, 38 nontrivial changes, 36 pull requests, and 95 linked issues, runtime exceeded one minute due to less-performant enterprise GitHub servers.

About the Author

Guilherme A.

Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.
