AI Gives Meaning to Machine-Generated Code

Logic programming is a powerful tool for modeling complex systems, but automatically generated logic programs often contain predicates with cryptic, machine-invented names that hinder human understanding and reuse. Researchers have now developed a method that uses large language models (LLMs) to assign meaningful names to these predicates automatically, making the programs more interpretable and accessible.

The key finding is that LLMs can effectively rename machine-invented predicates in logic theories, as demonstrated on hand-crafted examples. For instance, in a family relationship case study, the predicate 'h0' was renamed to 'parent' and 'h1' to 'grandparent', based on the rules that define them. The approach leverages the semantic understanding of LLMs to suggest names that reflect the intended relationships, such as 'coauthors' for a predicate that holds between researchers who authored a paper together.
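
To make the renaming step concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the researchers' implementation: the three Prolog-style rules are a hypothetical toy theory written for this example, and the helper rename_predicates is likewise invented here. Only the pairs h0 to parent and h1 to grandparent come from the study as described above.

```python
import re

# Hypothetical toy theory (not taken from the paper): h0 and h1 stand in
# for machine-invented predicate names.
PROGRAM = """\
h0(X, Y) :- mother(X, Y).
h0(X, Y) :- father(X, Y).
h1(X, Z) :- h0(X, Y), h0(Y, Z).
"""


def rename_predicates(program: str, mapping: dict[str, str]) -> str:
    """Rename predicates wherever a mapped name is directly followed by '('."""
    if not mapping:
        return program
    # Try longer names first so that one name is never matched as a prefix
    # of another.
    names = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, names)) + r")(?=\()")
    return pattern.sub(lambda m: mapping[m.group(1)], program)


renamed = rename_predicates(PROGRAM, {"h0": "parent", "h1": "grandparent"})
print(renamed)
```

Because every occurrence of a name is replaced consistently and nothing else is touched, the renamed theory has the same rules as the original. Only the labels a human reader sees have changed.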

Methodologically, the researchers built a pipeline in which multiple LLMs, including ChatGPT-4o, ChatGPT-o3mini, Gemini, and others, were prompted to suggest names for predicates in various logic programs. They used zero-shot prompting, with carefully crafted instructions both for generating name suggestions and for judging them, ensuring the models did not alter the program's structure. For example, in the math case study, predicates for operations such as greater-than were renamed to 'greater_than' or similarly intuitive terms.
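
The prompting step and the structure-preservation requirement might be sketched as follows. Everything in this block is an assumption made for illustration: the prompt wording, the 'old -> new' reply format, the function names, and the masking check are hypothetical and are not taken from the paper. No real LLM API is called; a hard-coded string stands in for the model's reply.

```python
import re


def build_prompt(program: str, targets: list[str]) -> str:
    """Assemble a zero-shot instruction (illustrative wording only)."""
    return (
        "Suggest a meaningful name for each of these predicates: "
        + ", ".join(targets)
        + ".\nDo not change the rules, arguments, or clause order.\n"
        "Answer with one 'old_name -> new_name' pair per line.\n\n"
        + program
    )


def parse_suggestions(reply: str) -> dict[str, str]:
    """Read 'old -> new' lines from a reply, ignoring any other text."""
    pairs = re.findall(r"^\s*(\w+)\s*->\s*(\w+)\s*$", reply, flags=re.MULTILINE)
    return dict(pairs)


def mask(program: str, names: list[str]) -> str:
    """Replace the i-th name with the placeholder _i wherever it heads an atom."""
    if not names:
        return program
    slots = {name: f"_{i}" for i, name in enumerate(names)}
    ordered = sorted(names, key=len, reverse=True)
    pattern = r"\b(" + "|".join(map(re.escape, ordered)) + r")(?=\()"
    return re.sub(pattern, lambda m: slots[m.group(1)], program)


def structure_preserved(original: str, renamed: str, mapping: dict[str, str]) -> bool:
    """True if the two programs differ only in the mapped predicate names."""
    return mask(original, list(mapping)) == mask(renamed, list(mapping.values()))


program = "h0(X, Y) :- mother(X, Y).\nh1(X, Z) :- h0(X, Y), h0(Y, Z).\n"
prompt = build_prompt(program, ["h0", "h1"])

# Stand-in for a model reply; a real pipeline would send `prompt` to an LLM.
reply = "h0 -> parent\nh1 -> grandparent\n"
mapping = parse_suggestions(reply)

good = "parent(X, Y) :- mother(X, Y).\ngrandparent(X, Z) :- parent(X, Y), parent(Y, Z).\n"
bad = "parent(X, Y) :- mother(X, Y).\ngrandparent(X, Z) :- parent(X, Z).\n"
print(structure_preserved(program, good, mapping))
print(structure_preserved(program, bad, mapping))
```

The check accepts the first rewritten program, which differs from the original only in the two renamed predicates, and rejects the second, in which a rule body was altered. A simple guard of this kind is one way to confirm that a model changed names and nothing else.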

The evaluation shows that models such as ChatGPT-o3mini performed best, correctly renaming most predicates in case studies such as family relationships and mathematical operations. In the family study, over 70% of predicates were accurately named, while in simpler cases such as coauthors, success was nearly universal. Challenges arose with more complex real-world datasets such as Mutagenesis, where gaps in domain-specific knowledge limited performance. The study also found that iterative prompting and human judgment improved outcomes, and that LLM-as-a-judge strategies aligned well with expert assessments in many scenarios.

This work matters because it addresses a critical bottleneck in declarative programming, where invented predicates can obscure program logic, complicating debugging and reuse. By automating renaming, the method enhances the practicality of logic programming in fields like bioinformatics and explainable AI, making it easier for scientists and developers to work with automated rule generation. It builds on advancements in generative AI, applying them to a niche but impactful problem in computer science.

Limitations include the reliance on examples that may not cover all real-world complexities, and the need for domain-specific fine-tuning in specialized applications. The paper notes that while LLMs show promise, further research is needed to handle ambiguous cases and improve consistency across different programming languages.

About the Author

Guilherme A.