As generative AI models like Midjourney and Stable Diffusion flood the internet with increasingly realistic synthetic images, the ability to distinguish these fakes from authentic content has become a critical capability for digital media integrity. This problem is exacerbated by the rapid release of new generative models, which quickly outpace traditional detection systems that rely on periodic retraining. A recent study from Universidad Politécnica de Madrid introduces a novel two-stage framework that combines supervised contrastive learning with few-shot classification to address this generalization gap, achieving state-of-the-art performance in detecting and attributing AI-generated images without exhaustive retraining. This advancement is crucial for combating misinformation and ensuring trust in visual media, as synthetic images can be used to create deceptive content that influences public opinion and security.
The proposed framework employs a MambaVision-L3-256-21K model trained with a supervised contrastive loss to extract discriminative embeddings from images, pulling same-class samples closer together in the latent space while pushing dissimilar ones apart. The model was trained on strategically partitioned subsets of generators, with some architectures withheld to test cross-generator generalization. In the second stage, a k-nearest neighbors (k-NN) classifier operates in a few-shot learning paradigm, using only 150 images per class from unseen generators to make predictions. Experiments utilized datasets like GenImage and ForenSynths, with models evaluated on both seen and unseen generators to simulate real-world conditions where new models emerge frequently. Training used large batch sizes of up to 6,000 to strengthen contrastive learning, and all images were resized to 256x256 pixels for consistency, with preprocessing limited to pixel normalization to avoid introducing biases.
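To make the first stage concrete, here is a minimal NumPy sketch of a supervised contrastive (SupCon-style) loss of the kind described above: each anchor's same-class samples are treated as positives and all other samples as negatives in a temperature-scaled softmax. The function name, temperature value, and batch layout are illustrative assumptions, not details from the paper.

```python
import numpy as np

def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss over a batch (illustrative sketch).

    embeddings: (N, D) array; labels: (N,) integer class labels.
    """
    # L2-normalize so dot products are cosine similarities
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / temperature                      # (N, N) similarity logits
    n = len(labels)
    logits_mask = ~np.eye(n, dtype=bool)             # exclude self-similarity
    pos_mask = (labels[:, None] == labels[None, :]) & logits_mask

    # log-softmax over all other samples (numerically stabilized)
    sim_max = sim.max(axis=1, keepdims=True)
    exp_sim = np.exp(sim - sim_max) * logits_mask
    log_prob = (sim - sim_max) - np.log(exp_sim.sum(axis=1, keepdims=True))

    # average log-probability over each anchor's positives
    pos_counts = pos_mask.sum(axis=1)
    valid = pos_counts > 0                           # anchors with >=1 positive
    mean_log_prob_pos = (pos_mask * log_prob).sum(axis=1)[valid] / pos_counts[valid]
    return -mean_log_prob_pos.mean()
```

Minimizing this loss is what produces the clustered embedding space that the second-stage k-NN classifier relies on: batches where classes form tight, well-separated clusters yield a much lower loss than batches where classes are mixed.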
The results demonstrate that the framework achieves an average accuracy of 91.3% for fake image detection, outperforming previous state-of-the-art methods by 5.2 percentage points and showing superior stability across diverse generators such as ADM, BigGAN, and Midjourney. For source attribution, models like ES1 and ES4 achieved F1-scores of up to 97.3% on seen generators and around 52% on unseen ones, highlighting the importance of generator selection in training. In open-set attribution, the framework improved AUC by 14.70% and OSCR by 4.27% compared to existing approaches, indicating robust generalization. A sensitivity analysis revealed that performance gains plateau beyond 150 few-shot instances per class, making this a practical choice for real-world deployment where data collection is limited.
The implications of this research are significant for cybersecurity, privacy, and digital forensics, as it provides a scalable solution to the evolving threat of AI-generated imagery. By reducing the need for frequent retraining, the framework lowers computational costs and operational barriers, enabling faster adaptation to new generative models. This could aid in detecting deepfakes on social media, verifying content in journalism, and protecting against fraud in sectors like finance and law enforcement. The study's focus on explainability, using tools like LIME, also helps build trust in AI systems by revealing which image features influence detection, potentially guiding future regulatory and ethical standards for synthetic media.
Despite its strengths, the framework has limitations, such as reduced performance on certain diffusion-based generators like Stable Diffusion 1.4 and 1.5, which produce artifacts so similar that they are difficult to discriminate from one another. The reliance on specific training generators means that model effectiveness can vary with the diversity of the training set, and the current embedding dimensionality and batch size constraints may limit further scaling. Future work could explore hard negative mining to enhance contrastive learning or extend the approach to other media types such as video and audio. Overall, this study marks a pivotal step toward resilient forensic systems capable of keeping pace with generative AI advancements, with potential applications in education, entertainment, and security.
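The hard-negative-mining extension suggested above could, for instance, rank each anchor's different-class samples by embedding similarity and oversample the closest ones when forming contrastive batches. A minimal sketch of that selection step follows; the function and its parameters are hypothetical illustrations, not part of the published method:

```python
import numpy as np

def mine_hard_negatives(embeddings, labels, n_hard=2):
    """Return, for each anchor, indices of its most similar
    different-class samples (the 'hard' negatives). Illustrative sketch."""
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T                                   # cosine similarities
    neg_mask = labels[:, None] != labels[None, :]   # True where classes differ
    sim_neg = np.where(neg_mask, sim, -np.inf)      # rank only negatives
    return np.argsort(-sim_neg, axis=1)[:, :n_hard]
```

Feeding such near-duplicate cross-class pairs (for example, Stable Diffusion 1.4 versus 1.5 images) back into the contrastive loss would force the encoder to sharpen exactly the boundaries where the current framework struggles.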
Original Source
Read the complete research paper
About the Author
Guilherme A.
Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.