AI Learns to Handle Missing Data Privately

A new federated learning method lets AI handle missing data without exposing raw user data, improving accuracy by up to 36.45% in applications such as health monitoring. It is designed to keep predictions reliable when sensors fail or data is incomplete.

AI Research
November 14, 2025

Devices such as smartphones and health monitors collect many types of data, from audio to physiological signals, but often lack some of them because of sensor failures or privacy constraints. This mismatch in available data types hampers collaborative AI training, in which multiple devices train a shared model without sharing raw data. A new study introduces a method that lets AI systems adapt to these gaps, improving accuracy by up to 36.45% in scenarios with severe data missingness while preserving user privacy. This matters for applications like wearable health monitoring and environmental sensing, where reliable predictions depend on securely combining incomplete data sources.

The researchers developed a framework called PEPSY that allows AI models to reconfigure their internal representations to handle missing data. Instead of assuming all devices have access to the same data types, PEPSY learns a profile for each client that captures its unique data patterns, such as which modalities are available or missing. These profiles act as instructions to adjust the global AI model locally, ensuring it works effectively even when parts of the data are absent. For example, if a health monitor lacks audio data but has heart rate signals, PEPSY guides the model to focus on the available information without compromising performance.
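To make the idea of a per-client profile concrete, here is a minimal sketch in Python. It is not PEPSY's actual mechanism, which the article does not spell out: the profile is reduced to a plain availability mask, and the names `ClientProfile`, `MODALITIES`, and `fuse` are hypothetical. The sketch only shows how a record of which modalities exist can steer a model toward the information that is present.

```python
from dataclasses import dataclass

# Hypothetical modality names, chosen only for illustration.
MODALITIES = ["audio", "heart_rate", "accelerometer"]


@dataclass
class ClientProfile:
    """Assumed stand-in for a learned profile: just an availability mask."""
    available: dict  # modality name -> bool

    def mask(self):
        return [1.0 if self.available.get(m, False) else 0.0 for m in MODALITIES]


def fuse(embeddings, profile):
    """Average per-modality embeddings, skipping modalities marked as missing."""
    mask = profile.mask()
    present = sum(mask)
    if present == 0:
        raise ValueError("client has no available modalities")
    dim = len(next(iter(embeddings.values())))
    fused = [0.0] * dim
    for name, weight in zip(MODALITIES, mask):
        if weight == 0.0:
            continue  # missing modality: contributes nothing
        for i, value in enumerate(embeddings[name]):
            fused[i] += value / present
    return fused


# A health monitor with no audio but with heart rate and motion data.
profile = ClientProfile(available={"audio": False, "heart_rate": True, "accelerometer": True})
embeddings = {"heart_rate": [1.0, 3.0], "accelerometer": [3.0, 5.0]}
print(fuse(embeddings, profile))  # [2.0, 4.0]
```

In a real system the profile would presumably be learned rather than hand-written, and it would adjust the model's internal representations rather than simply average embeddings.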

To achieve this, the method uses client-side controls that encode data-missing patterns, which are shared with a central server. The server aggregates these controls using a probabilistic clustering approach, grouping similar profiles to enhance robustness. This process avoids sharing raw data, aligning with privacy requirements. The framework was tested on datasets like PTBXL (with 12 modalities, such as electrocardiograms) and Sleep-EDF (with 5 modalities, including sleep stages), simulating real-world conditions where data is incomplete and distributed unevenly across clients.
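The article describes the server-side step only as "probabilistic clustering", so the following Python sketch rests on an assumption: it uses a softmax over negative squared distances to give each client profile a probability of belonging to each cluster, then recomputes each cluster centre as a weighted mean. The function names `soft_assign` and `update_centroids`, the `temperature` parameter, and the toy profiles are all hypothetical. Note that only profile vectors reach the server, never raw data.

```python
import math


def soft_assign(profile, centroids, temperature=1.0):
    """Probability of each cluster, via softmax over negative squared distances."""
    scores = [
        -sum((p - c) ** 2 for p, c in zip(profile, centroid)) / temperature
        for centroid in centroids
    ]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def update_centroids(profiles, centroids, temperature=1.0):
    """One soft-clustering step: each centroid becomes a weighted mean of profiles."""
    resp = [soft_assign(p, centroids, temperature) for p in profiles]
    dim = len(centroids[0])
    new_centroids = []
    for k, old in enumerate(centroids):
        weight = sum(r[k] for r in resp)
        if weight == 0.0:
            new_centroids.append(list(old))  # no client supports this cluster
            continue
        new_centroids.append([
            sum(r[k] * p[d] for r, p in zip(resp, profiles)) / weight
            for d in range(dim)
        ])
    return new_centroids, resp


# Toy profiles: two clients share a missingness pattern, one differs.
profiles = [[1.0, 1.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
centroids = [[1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
centroids, resp = update_centroids(profiles, centroids)
for client, r in enumerate(resp):
    print(client, [round(x, 3) for x in r])
```

Soft assignments let a client with an unusual pattern still borrow strength from nearby groups, which is one plausible reading of how grouping similar profiles could improve robustness.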

Results show that PEPSY consistently outperforms existing methods, especially when data is distributed non-identically and clients hold different data types. In the reported tests, it maintained high accuracy even when up to 80% of modalities were missing, whereas traditional approaches degraded rapidly. On the PTBXL dataset, for instance, PEPSY achieved up to a 15.83% improvement over the next best method in challenging scenarios. The study also includes a theoretical analysis indicating that the model's predictions remain stable despite missing inputs, supported by empirical evidence from multiple benchmarks.
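To give a sense of what an 80% missing rate means in practice, the Python sketch below generates per-client availability masks at a chosen rate. The study's exact simulation protocol is not described in this article, so this is an assumption: every client drops the same number of modalities, chosen at random, and always keeps at least one. The function name `simulate_missing` and its parameters are hypothetical.

```python
import random


def simulate_missing(num_clients, num_modalities, missing_rate, seed=0):
    """Return one 0/1 availability mask per client at roughly the given missing rate."""
    rng = random.Random(seed)
    # Always keep at least one modality so every client can still train.
    num_missing = min(num_modalities - 1, round(num_modalities * missing_rate))
    masks = []
    for _ in range(num_clients):
        dropped = set(rng.sample(range(num_modalities), num_missing))
        masks.append([0 if m in dropped else 1 for m in range(num_modalities)])
    return masks


# 12 modalities at an 80% missing rate leaves each client with 2 of them.
for mask in simulate_missing(num_clients=4, num_modalities=12, missing_rate=0.8):
    print(mask, "available:", sum(mask))
```

Because each client drops a different random subset, two clients rarely see the same inputs, which is the kind of uneven distribution the evaluation is said to target.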

This work matters because it addresses a common problem in federated learning systems used in healthcare, smart infrastructure, and IoT devices. By enabling AI to function reliably with fragmented data, it supports applications where privacy is paramount, such as analyzing medical records without exposing sensitive information. The approach could lead to more resilient distributed AI systems that adapt to real-world imperfections, reducing the need for data centralization and enhancing trust in collaborative technologies.

Limitations include potential challenges when applied to domains with vastly different data characteristics, as the method may require adjustments for optimal performance. The paper notes that future research could explore integrating pre-trained models to improve efficiency, but current evaluations rely on training from scratch. Overall, this method provides a flexible solution for handling data heterogeneity in privacy-preserving AI, paving the way for more robust and accessible intelligent systems.

About the Author

Guilherme A.

Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.
