AIResearch

New Software Tools Make AI Systems More Trustworthy

Researchers develop practical methods to ensure artificial intelligence behaves ethically and reliably—addressing bias, privacy, and safety concerns that affect everyday users.

AI Research
November 14, 2025
3 min read

As artificial intelligence systems increasingly influence healthcare, hiring, and daily decisions, ensuring they operate fairly and safely has become critical. A team of researchers has created a comprehensive software toolbox that applies established engineering practices to evaluate whether AI systems meet key trustworthiness guidelines set by the European Commission. This approach bridges the gap between high-level ethical principles and practical implementation, offering developers concrete methods to assess and improve AI behavior.

The researchers identified that trustworthy AI must fulfill seven requirements: human oversight, technical robustness and safety, privacy and data governance, transparency, diversity and non-discrimination, societal and environmental well-being, and accountability. They mapped each requirement to existing software engineering techniques, demonstrating how tools from testing and data management can be adapted to evaluate AI systems. For instance, differential testing compares multiple AI models performing the same task—if their outputs diverge significantly, it may indicate faulty behavior. This method, explored in tools like DeepXplore, helps identify inconsistencies without needing precise expected outcomes.
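The idea behind differential testing can be sketched in a few lines. This is a minimal illustration, not the DeepXplore implementation: the two "models" and the disagreement threshold are hypothetical stand-ins for real trained systems.

```python
# Differential testing sketch: run two independently built models on the
# same inputs and flag every input where their outputs diverge beyond a
# tolerance. No ground-truth labels are needed, only disagreement.

def differential_test(model_a, model_b, inputs, tolerance=0.1):
    """Return (input, output_a, output_b) triples where the models diverge."""
    suspicious = []
    for x in inputs:
        out_a = model_a(x)
        out_b = model_b(x)
        if abs(out_a - out_b) > tolerance:
            suspicious.append((x, out_a, out_b))
    return suspicious

# Toy "models": two approximations of the same scoring function that
# silently diverge on large inputs.
model_a = lambda x: x * 0.5
model_b = lambda x: x * 0.5 if x < 8 else x * 0.9

disagreements = differential_test(model_a, model_b, inputs=range(10))
for x, a, b in disagreements:
    print(f"input {x}: model_a={a:.2f}, model_b={b:.2f}")
```

Flagged inputs are candidates for manual review; agreement does not prove correctness, but divergence localizes where at least one model is likely wrong.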

To assess technical robustness, the team highlighted metamorphic testing, which checks how AI outputs change when inputs are modified in specific ways. In one example, an AI resume-ranking system should improve a candidate's score when relevant keywords are added, even if the exact ranking isn't known beforehand. For privacy, tools like ARX anonymize datasets using methods such as k-anonymization (where personal data is generalized until each row matches at least k-1 others) and fuzzification, which perturbs quasi-identifying attributes like birth dates to minimize re-identification risks. In a cervical cancer screening study, fuzzifying exam dates by up to six months preserved data utility while protecting privacy.
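The resume-ranking example can be expressed as a metamorphic test. The scorer below is a deliberately trivial toy, and the keyword set is invented for illustration; the point is the metamorphic relation itself: adding a relevant keyword must not lower the score, even though no "correct" score is known.

```python
# Metamorphic testing sketch: we cannot say what score a resume *should*
# get, but we can assert a relation between the original input and a
# modified follow-up input.

RELEVANT_KEYWORDS = {"python", "testing", "machine learning"}

def score_resume(text: str) -> float:
    """Toy scorer: fraction of relevant keywords present in the resume."""
    lowered = text.lower()
    hits = sum(1 for kw in RELEVANT_KEYWORDS if kw in lowered)
    return hits / len(RELEVANT_KEYWORDS)

def check_metamorphic_relation(resume: str, keyword: str) -> bool:
    """Follow-up input = resume plus a relevant keyword.

    The metamorphic relation: the score must not decrease.
    """
    base = score_resume(resume)
    followup = score_resume(resume + " " + keyword)
    return followup >= base

resume = "Experienced developer with a background in testing."
assert check_metamorphic_relation(resume, "python")
```

The same pattern applies to the fuzzification example: a privacy test could assert that model outputs stay within an acceptable band when exam dates are perturbed by up to six months.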

The results show that these techniques can effectively address trustworthiness gaps. For diversity and non-discrimination, combinatorial interaction testing was used to verify that datasets include adequate representation across attributes like ethnicity and gender; small test sets covering pairwise interactions caught 95% of faults. The researchers also emphasized transparency through model cards—documentation similar to pharmaceutical package inserts—that explain an AI system's capabilities and limitations to developers and users. For societal well-being, evidence-based methods from epidemiology, such as randomized controlled trials, were proposed to evaluate AI's broader impact, like encouraging healthier choices or environmental consciousness.
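A pairwise-coverage check of a dataset can be sketched as follows. The attribute names and values are illustrative, not taken from the paper; the check reports every combination of values, for every pair of sensitive attributes, that is absent from the data.

```python
# Pairwise (2-way) combinatorial coverage sketch: for each pair of
# attributes, verify the dataset contains every combination of the
# values observed in those attributes' domains.
from itertools import combinations, product

def missing_pairwise_combinations(rows, attributes):
    """Return ((attr_a, attr_b), (val_a, val_b)) combos absent from rows."""
    domains = {a: {row[a] for row in rows} for a in attributes}
    missing = []
    for a, b in combinations(attributes, 2):
        seen = {(row[a], row[b]) for row in rows}
        for combo in product(domains[a], domains[b]):
            if combo not in seen:
                missing.append(((a, b), combo))
    return missing

rows = [
    {"gender": "F", "ethnicity": "A", "age_group": "young"},
    {"gender": "M", "ethnicity": "B", "age_group": "old"},
    {"gender": "F", "ethnicity": "B", "age_group": "old"},
]
gaps = missing_pairwise_combinations(rows, ["gender", "ethnicity", "age_group"])
```

Here the check reports, among other gaps, that no row combines `gender="M"` with `ethnicity="A"`, the kind of representation hole the paper's approach is meant to surface.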

This work matters because it provides actionable steps to prevent real-world harms. Historical biases, such as spirometers that underestimated lung capacity for Black individuals due to flawed training data, underscore the need for rigorous testing. The toolbox helps ensure AI systems in healthcare, finance, and other domains avoid perpetuating discrimination and protect user privacy. By integrating these methods into development cycles, organizations can build AI that aligns with ethical standards and gains public trust.

However, the researchers note limitations. Not all trustworthiness aspects can be fully automated—some require manual intervention or qualitative assessment. Challenges remain in making evaluation accessible to non-specialists and presenting results clearly. Future work must address how to converge on accepted trustworthiness standards and develop dedicated tools for unique AI characteristics. As AI's role expands, this software engineering foundation offers a crucial starting point for creating systems that are not only intelligent but also responsible and reliable.

Original Source

Read the complete research paper

View on arXiv

About the Author

Guilherme A.


Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.

Connect on LinkedIn