Large language models (LLMs) like GPT and Mistral power everything from chatbots to search engines, but they often contain outdated or incorrect facts. Fixing these errors typically requires expensive, time-consuming retraining, which limits how current, accurate, and safe the models can be kept. A new study introduces a way to edit individual facts inside these models directly, making updates more precise and reliable for everyday applications.
The key finding is that LLMs store factual knowledge not only in Multi-Layer Perceptron (MLP) modules, as previously thought, but also in attention (Attn) modules, which model relationships between tokens. By editing both types of modules together, the researchers achieved higher success rates in correcting facts and better generalization to new contexts. This approach, called IntAttn-Edit, reduces "knowledge residuals" (leftover errors from incomplete updates), leading to more trustworthy AI systems.
The methodology relied on causal tracing experiments on models like Qwen2.5-7B to identify where knowledge is stored. The researchers measured how much each module contributes to factual recall by comparing runs on clean inputs with runs on corrupted inputs. They found that Attn modules play a significant role in early processing stages, acting as semantic filters. IntAttn-Edit then uses a balancing strategy that allocates update magnitudes between MLP and Attn modules according to their measured contributions, so that neither is overlooked.
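To make the balancing idea concrete, here is a minimal Python sketch. It is not the paper's implementation: the function names `indirect_effect` and `allocate_update`, the toy probabilities, and the simple proportional split are all assumptions made for illustration. It scores each module by how much of the clean-versus-corrupted probability gap is recovered when that module's clean activations are restored, then divides a fixed update budget in proportion to those scores.

```python
from typing import Dict


def indirect_effect(p_clean: float, p_corrupted: float, p_restored: float) -> float:
    """Fraction of the clean-vs-corrupted gap recovered by restoring one module.

    p_clean:     probability of the correct answer on the clean run
    p_corrupted: probability of the correct answer on the corrupted run
    p_restored:  probability on the corrupted run with this module's
                 clean activations patched back in
    """
    gap = p_clean - p_corrupted
    if gap <= 0:
        return 0.0  # corruption did not hurt, so there is nothing to attribute
    return max(0.0, min(1.0, (p_restored - p_corrupted) / gap))


def allocate_update(contributions: Dict[str, float], total_magnitude: float) -> Dict[str, float]:
    """Split an update budget across modules in proportion to their contributions.

    Hypothetical rule: the paper's exact balancing formula is not reproduced here.
    """
    total = sum(contributions.values())
    if total <= 0:
        # No measurable contribution anywhere: fall back to an even split.
        share = total_magnitude / len(contributions)
        return {name: share for name in contributions}
    return {name: total_magnitude * c / total for name, c in contributions.items()}


# Toy numbers, invented for illustration only.
p_clean, p_corrupted = 0.90, 0.10
contrib = {
    "mlp": indirect_effect(p_clean, p_corrupted, p_restored=0.58),
    "attn": indirect_effect(p_clean, p_corrupted, p_restored=0.34),
}
budget = allocate_update(contrib, total_magnitude=1.0)
print(budget)
```

With these made-up inputs the MLP module receives roughly two thirds of the budget and the Attn module roughly one third. In practice the split would be a tunable ratio rather than a fixed rule.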
Results on benchmarks like ZsRE and WikiData Counterfact show IntAttn-Edit outperforming existing methods. For example, on ZsRE with 100 edits, it achieved up to 96.87% edit success, compared to 95.43% for the next best method, while maintaining high portability (up to 56.32%) and locality (up to 34.89%), meaning it corrects facts accurately without harming unrelated knowledge. Figure 4 in the paper illustrates that moderate balancing ratios give the best performance, while over-relying on either module causes key metrics to decline.
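For readers unfamiliar with these three metrics, the sketch below shows one common way they are computed in knowledge-editing evaluations. It is an illustration under stated assumptions, not the benchmark's official scoring code: the `score` helper, the exact-match criterion, and the fictional "Atlantis" facts are all hypothetical. Edit success checks the edited prompt itself, portability checks related prompts that depend on the new fact, and locality checks that unrelated prompts still give their pre-edit answers.

```python
from typing import Callable, List, Tuple

Case = Tuple[str, str]  # (prompt, expected answer)


def score(model: Callable[[str], str], cases: List[Case]) -> float:
    """Percentage of prompts whose answer exactly matches the expected string."""
    if not cases:
        return 0.0
    hits = sum(
        model(prompt).strip().lower() == expected.strip().lower()
        for prompt, expected in cases
    )
    return 100.0 * hits / len(cases)


# A stand-in for an edited model: a lookup table over fictional facts.
answers = {
    "The capital of Atlantis is": "Poseidonia",         # the edited fact
    "Which city is Atlantis's capital?": "Poseidonia",  # rephrasing, edit carried over
    "Atlantis's government sits in": "Old City",        # related prompt, edit did not carry over
    "The capital of Lemuria is": "Mu City",             # unrelated fact, unchanged
}


def edited_model(prompt: str) -> str:
    return answers.get(prompt, "")


edit_cases = [("The capital of Atlantis is", "Poseidonia")]
portability_cases = [
    ("Which city is Atlantis's capital?", "Poseidonia"),
    ("Atlantis's government sits in", "Poseidonia"),
]
locality_cases = [("The capital of Lemuria is", "Mu City")]  # expected = pre-edit answer

edit_success = score(edited_model, edit_cases)
portability = score(edited_model, portability_cases)
locality = score(edited_model, locality_cases)
print(edit_success, portability, locality)
```

In this toy case the edit itself succeeds and the unrelated fact is preserved, but only one of the two related prompts reflects the new fact, so portability is lower than edit success.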
In practical terms, this means AI models can be updated more efficiently to reflect new information, such as current events or corrected data, without the high costs of retraining. This could improve applications in education, customer service, and research, where accuracy is critical. However, the study notes limitations, including the challenge of handling multi-edit conflicts and the need for further testing on larger-scale models and diverse datasets to ensure broad applicability.
About the Author
Guilherme A.
Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.