AI Generates Realistic Medical Videos Without Patient Data

A new AI method creates synthetic ultrasound videos so accurate that clinicians can't tell them apart from real ones, potentially transforming how medical datasets are built and shared while preserving privacy.

AI Research
November 06, 2025

Medical imaging is crucial for diagnosing diseases, but creating diverse and balanced datasets for training AI models is often hindered by missing or imbalanced data types. This challenge is particularly acute in ultrasound, where different imaging sub-modalities like color flow Doppler (CFD) and standard greyscale (B-mode) are not always available together. A new study introduces an AI-driven video translation method that generates realistic B-mode ultrasound videos from CFD inputs, effectively balancing datasets without discarding valuable information. This breakthrough could enhance AI development in medicine by improving data quality and accessibility, all while maintaining patient privacy.

The researchers developed a video-to-video translation approach that converts CFD ultrasound videos into synthetic B-mode videos while preserving anatomical structures and realism. Trained on 54,975 videos and tested on 8,368, the method achieved an average structural similarity index (SSIM) of 0.91±0.04 against real B-mode videos, indicating high visual fidelity. Both AI models and human experts struggled to distinguish synthetic from real videos: clinicians achieved only 54±6% accuracy, no better than chance. Taken together, these results suggest the synthetic videos are perceptually, and for the tasks tested functionally, very close to real ones.
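To make the SSIM figure more concrete, here is a minimal sketch of how a structural similarity score can be computed between a real and a synthetic greyscale frame. It is not the study's evaluation code: it uses a single global window rather than the sliding local windows of standard SSIM implementations, and the function names and the 8-bit intensity range are assumptions made for illustration.

```python
def global_ssim(x, y, data_range=255.0):
    """Single-window SSIM between two equally sized greyscale frames.

    x, y: 2D lists of pixel intensities (assumed 8-bit by default).
    Standard SSIM averages this quantity over local windows; using the
    whole frame as one window is a simplification for illustration.
    """
    a = [float(p) for row in x for p in row]
    b = [float(p) for row in y for p in row]
    if not a or len(a) != len(b):
        raise ValueError("frames must be non-empty and the same size")

    n = len(a)
    mu_a = sum(a) / n
    mu_b = sum(b) / n
    var_a = sum((p - mu_a) ** 2 for p in a) / n
    var_b = sum((q - mu_b) ** 2 for q in b) / n
    cov = sum((p - mu_a) * (q - mu_b) for p, q in zip(a, b)) / n

    # Stabilising constants from the usual SSIM definition.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2

    numerator = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    denominator = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return numerator / denominator


def video_ssim(real_frames, synthetic_frames):
    """Mean per-frame SSIM over two videos with the same number of frames."""
    scores = [global_ssim(r, s) for r, s in zip(real_frames, synthetic_frames)]
    return sum(scores) / len(scores)
```

A score of 1 means the two frames are identical under this measure, so an average of 0.91 indicates that synthetic frames share most of their luminance, contrast, and structure with the real ones.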

The methodology employs a two-step pipeline: a coarse network first reconstructs the basic structures from CFD inputs, followed by a refinement network that enhances textures using adversarial and perceptual losses. This approach uses gated convolutions to handle freeform regions and adaptively focus on areas needing reconstruction, without requiring paired data during inference. The model was trained on adult and fetal cardiac ultrasounds, with preprocessing steps to estimate masks for CFD regions and apply augmentations like rotation and noise to improve robustness.
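The gating idea can be illustrated with a toy example. The sketch below is a one-dimensional, single-channel gated convolution in plain Python, not the paper's network: real implementations work on 2D or 3D feature maps with learned kernels, and the kernel values, the tanh activation, and the function names here are assumptions chosen for readability.

```python
import math


def conv1d(signal, kernel, bias=0.0):
    """Same-size 1D convolution (cross-correlation) with zero padding.

    Assumes an odd kernel length so the output aligns with the input.
    """
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(w * padded[i + j] for j, w in enumerate(kernel)) + bias
        for i in range(len(signal))
    ]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gated_conv1d(signal, feature_kernel, gate_kernel, gate_bias=0.0):
    """Gated convolution: activated features scaled by a soft gate in (0, 1).

    One convolution produces features, a second produces a per-position
    gate. Their elementwise product lets the layer pass or suppress each
    position, rather than treating every position identically.
    """
    features = [math.tanh(v) for v in conv1d(signal, feature_kernel)]
    gates = [sigmoid(v) for v in conv1d(signal, gate_kernel, gate_bias)]
    outputs = [f * g for f, g in zip(features, gates)]
    return outputs, gates


# Hypothetical kernels; in a trained network these would be learned.
outputs, gates = gated_conv1d(
    [0.0, 1.0, 2.0, 1.0, 0.0], [0.25, 0.5, 0.25], [1.0, 1.0, 1.0]
)
print(outputs)
print(gates)
```

Because the gate is computed from the input itself, a trained layer can learn to open the gate where content must be reconstructed (for example under the colour overlay) and close it elsewhere, which is the adaptive behaviour the paragraph above describes.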

Results show that synthetic videos performed on par with real ones in benchmark tasks. For view classification in adult ultrasounds, F1 scores were 0.9 for real and 0.89 for synthetic videos, with no significant difference. In segmentation tasks, Dice scores averaged 0.97±0.03, closely matching real videos. The method also generalized to anatomical structures it had not seen in training, such as obstetric and other organ ultrasounds, achieving an average SSIM of 0.91±0.05 in zero-shot tests. This suggests the model transfers beyond the cardiac data it was trained on.
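For readers unfamiliar with the Dice score used in the segmentation comparison, the following minimal sketch computes it for two binary masks. The function name and the convention of returning 1.0 when both masks are empty are assumptions for illustration, not details taken from the study.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as 2D lists.

    Dice = 2 * |A intersect B| / (|A| + |B|), ranging from 0 (no overlap)
    to 1 (identical masks). Two empty masks are treated as a perfect
    match here, which is a convention rather than part of the definition.
    """
    p = [bool(v) for row in pred for v in row]
    t = [bool(v) for row in target for v in row]
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")

    intersection = sum(1 for a, b in zip(p, t) if a and b)
    total = sum(p) + sum(t)
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical masks: one produced from a real video, one from a synthetic one.
mask_real = [[1, 1, 0], [0, 0, 0]]
mask_synthetic = [[1, 0, 0], [0, 0, 0]]
print(dice_score(mask_real, mask_synthetic))
```

On binary masks, Dice is the same quantity as the pixel-level F1 score, so the segmentation and classification numbers above measure agreement in closely related ways.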

The implications are significant for medical AI, as this method can balance datasets by generating synthetic data, increasing representation of underrepresented classes. For example, in fetal ultrasound datasets, it boosted the clip-level diversity of underrepresented sites from 7.1% to 19.8%. This could accelerate AI development in resource-limited settings by enabling the use of retrospective data without compromising patient confidentiality. Moreover, it addresses ethical concerns by reducing the need for extensive data collection, potentially lowering costs and improving equity in healthcare.
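The arithmetic behind this kind of rebalancing is simple to sketch. The example below assumes, as a simplification, that synthetic clips are added only to the underrepresented group and that nothing else in the dataset changes. The clip counts are hypothetical, since the article reports only the percentages, and the function names are illustrative.

```python
import math


def share_after(group_clips, total_clips, added):
    """Share of the dataset held by a group after adding `added` clips to it."""
    return (group_clips + added) / (total_clips + added)


def synthetic_clips_needed(group_clips, total_clips, target_share):
    """Smallest number of added clips that lifts a group to `target_share`.

    Solves (group + k) / (total + k) >= target for k.
    """
    if not 0.0 <= target_share < 1.0:
        raise ValueError("target_share must be in [0, 1)")
    needed = (target_share * total_clips - group_clips) / (1.0 - target_share)
    return max(0, math.ceil(needed))


# Hypothetical dataset: 71 of 1,000 clips (7.1%) come from the
# underrepresented sites; the target share is 19.8%.
k = synthetic_clips_needed(71, 1000, 0.198)
print(k, share_after(71, 1000, k))
```

Under these assumed counts, a few hundred translated clips at most would be enough to move the share by the reported amount, which is why recovering otherwise unused CFD recordings can shift a dataset's balance noticeably.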

Limitations include occasional blurriness or artifacts in outputs, particularly in fetal ultrasounds where small, unpredictable motions pose challenges. The method also requires preprocessing to generate masks, and its performance slightly decreases with higher CFD coverage. Future work could explore diffusion-based models for improved detail and end-to-end pipelines to eliminate preprocessing steps. Despite these constraints, the study provides a robust framework for generative AI in medical imaging, emphasizing the importance of rigorous evaluation in high-stakes applications.

Original source: the complete research paper is available on arXiv.

About the Author

Guilherme A.

Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.