
AI Fixes Unfair Classifiers for Better Decisions

A new AI method reduces performance gaps in machine learning models, making them more reliable for critical applications like medical diagnosis and reducing bias.

AI Research
November 14, 2025
4 min read

Machine learning models often fail when it matters most, delivering inconsistent results for specific subgroups within a class. For example, a skin cancer classifier might be accurate on images without bandages but perform poorly on those with bandages, leading to unreliable diagnoses. This happens because models can latch onto spurious features—like the presence of a bandage—instead of the actual content, such as the lesion itself. To address this, researchers have developed Model Patching, a two-stage framework that improves robustness by encouraging invariance to subgroup-specific differences, so the model relies only on shared, class-relevant information. This is crucial for real-world applications where fairness and accuracy are paramount, such as healthcare or autonomous systems, ensuring that AI tools work reliably for everyone, not just on average.

The key finding from the research is that Model Patching, implemented as CAMEL (CycleGAN Augmented Model Patching), significantly reduces performance gaps between subgroups. In benchmarks, it cut error rates by up to 33% compared to the best existing methods. For instance, on a skin cancer dataset, CAMEL improved accuracy on malignant images with spurious features by 11.7%, shifting the model's focus from irrelevant bandages to the actual skin lesions. This means AI systems can make more consistent predictions across different scenarios, reducing the risk of biased outcomes in critical areas like medical imaging or facial recognition.

The methodology involves two main stages. First, the framework learns transformations that change subgroup-specific features—like altering the appearance of a bandage in an image—without affecting the class label, such as whether a lesion is benign or malignant. This is done using models like CycleGAN to generate augmented examples that simulate different subgroups. Second, it trains a classifier using these augmentations, combined with a robust objective and a consistency regularizer, to ensure the model becomes invariant to the manipulated features. Essentially, it teaches the AI to ignore distractions and focus on what truly matters for accurate classification, using a theoretically motivated approach that balances subgroup performance.
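The second stage described above can be sketched in a few lines. The following is a minimal NumPy sketch of the combined objective, not the paper's implementation: it assumes the CycleGAN augmentations and any group-robust weighting are handled elsewhere, and the function names and the weight `lam` are illustrative. The consistency term penalizes each "view" of an example (original vs. augmented) for diverging from the average prediction, which is what pushes the classifier toward invariance.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(probs, labels):
    # Mean negative log-likelihood of the true class.
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12))

def kl(p, q):
    # Per-example KL divergence between two categorical distributions.
    return np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=-1)

def consistency_loss(prob_sets):
    # prob_sets: list of (batch, classes) prediction arrays, one per
    # augmented "subgroup view" of the same underlying examples.
    # Penalize each view's divergence from the average prediction.
    avg = np.mean(prob_sets, axis=0)
    return np.mean([np.mean(kl(p, avg)) for p in prob_sets])

def patched_objective(logits_orig, logits_aug, labels, lam=1.0):
    # Sketch of the stage-2 objective: task loss on both views plus a
    # consistency regularizer weighted by lam (illustrative value).
    p_orig, p_aug = softmax(logits_orig), softmax(logits_aug)
    task = 0.5 * (cross_entropy(p_orig, labels) + cross_entropy(p_aug, labels))
    return task + lam * consistency_loss([p_orig, p_aug])
```

When the model's predictions on an image and its augmented counterpart agree, the consistency term is zero and only the ordinary classification loss remains; the more the predictions diverge, the larger the penalty.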

Results from the paper show clear improvements. On controlled datasets like MNIST-Correlation, CAMEL reduced the performance gap—the difference between the best and worst subgroup accuracies—to as low as 0.17%, compared to 13.45% for standard methods. In real-world tests on the ISIC skin cancer dataset, it boosted robust accuracy by 11.7% and improved performance on malignant cases from 65.59% to 78.86%. The analysis, including metrics like mutual information estimates, confirmed that CAMEL learns more invariant representations, meaning the model depends less on spurious features and more on shared class information. For example, in the Waterbirds dataset, it cut the gap from 22.24% to 1.04%, demonstrating its effectiveness across various domains.
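The gap and robust-accuracy numbers quoted above follow directly from per-subgroup accuracies. A minimal sketch of how such metrics are computed (the function name `subgroup_metrics` is illustrative, not from the paper):

```python
import numpy as np

def subgroup_metrics(preds, labels, groups):
    # Per-subgroup accuracy, worst-case ("robust") accuracy, and the
    # best-minus-worst performance gap reported in the benchmarks.
    accs = {}
    for g in np.unique(groups):
        mask = groups == g
        accs[g] = np.mean(preds[mask] == labels[mask])
    vals = list(accs.values())
    return accs, min(vals), max(vals) - min(vals)
```

For example, a classifier that is perfect on one subgroup but only 50% accurate on another has a robust accuracy of 0.5 and a gap of 0.5, even though its average accuracy looks respectable; closing that gap is exactly what CAMEL targets.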

In practical terms, this breakthrough matters because it makes AI systems fairer and more dependable. In healthcare, it could prevent misdiagnoses in underrepresented groups, such as ensuring skin cancer detectors work equally well regardless of skin tone or accessories like bandages. For everyday users, it means AI in apps or services will be less likely to make errors based on irrelevant details, enhancing trust in technologies from recommendation engines to security systems. By closing subgroup gaps, Model Patching addresses ethical concerns like discrimination, paving the way for AI that serves diverse populations reliably.

However, the approach has limitations. The paper notes that it relies on the availability of subgroup labels during training, which may not always be practical. Additionally, while it reduces dependence on spurious features, it does not eliminate all sources of bias, and its performance can vary with the quality of the learned transformations. Future work is needed to adapt it to scenarios with unknown subgroups or more complex data, ensuring it can be applied broadly without extensive manual input.

Original Source

Read the complete research paper

View on arXiv

About the Author

Guilherme A.

Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.

Connect on LinkedIn