Underwater imaging is critical for marine exploration, robotics, and infrastructure inspection, but it faces a persistent challenge: light attenuation and scattering in water severely degrade image quality, producing dominant blue-green tones and blurry details. Traditional enhancement methods often rely on complex physics-based models or heavy deep learning architectures that struggle with real-time deployment on resource-limited platforms. Now, researchers from National Cheng Kung University and National Yang Ming Chiao Tung University have developed WWE-UIE, a novel framework that combines interpretable domain priors with efficient neural design to achieve state-of-the-art restoration while remaining compact enough for practical applications. This breakthrough addresses the fundamental trade-off between enhancement quality and computational efficiency that has long plagued underwater computer vision.
At its core, WWE-UIE integrates three carefully designed components that work synergistically. First, an adaptive white balance module corrects the severe color imbalance caused by wavelength-dependent attenuation, where red light is absorbed much faster than blue and green. Unlike previous approaches that rely on fragile transmission estimation, this module uses a learnable fusion of the original image and its white-balanced counterpart, regulated by channel-wise parameters constrained to [0, 1]. This design directly addresses color distortion without unstable intermediate estimations, enhancing robustness across diverse underwater conditions with minimal computational overhead. The researchers drew inspiration from the Gray-World assumption, which has proven effective in low-light and tone-mapping tasks, adapting it specifically for underwater environments where color casts are particularly pronounced.
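To make the idea concrete, here is a minimal numpy sketch of Gray-World white balancing fused with the original image. In the paper the per-channel fusion weights are learnable and constrained to [0, 1]; here they are fixed constants, and the helper names are illustrative, not taken from the authors' code.

```python
import numpy as np

def gray_world_white_balance(img):
    """Scale each channel so its mean matches the global mean (Gray-World assumption)."""
    channel_means = img.reshape(-1, 3).mean(axis=0)   # per-channel means
    global_mean = channel_means.mean()                # target "gray" level
    gains = global_mean / (channel_means + 1e-6)      # per-channel correction gains
    return np.clip(img * gains, 0.0, 1.0)

def fused_white_balance(img, alpha):
    """Blend the original image with its white-balanced counterpart.

    alpha: per-channel weights in [0, 1]; learned in the paper, fixed here.
    """
    alpha = np.clip(np.asarray(alpha, dtype=img.dtype), 0.0, 1.0)
    wb = gray_world_white_balance(img)
    return alpha * wb + (1.0 - alpha) * img

# Synthetic blue-green-cast image: the red channel is heavily attenuated,
# mimicking wavelength-dependent absorption underwater.
rng = np.random.default_rng(0)
img = rng.uniform(0.0, 1.0, (64, 64, 3))
img[..., 0] *= 0.2

out = fused_white_balance(img, alpha=[0.9, 0.5, 0.5])
print(out.shape)
```

Because the blend stays a convex combination of two valid images, the output never leaves the displayable range, which is one reason this design avoids the instability of transmission-map estimation.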
The second key innovation is the Wavelet-based Enhancement Block (WEB), which performs efficient multi-band decomposition using Haar wavelet filters. Unlike Fourier transforms that provide global frequency analysis but lose spatial localization, the Discrete Wavelet Transform enables targeted enhancement of degraded regions while maintaining global structural consistency. The WEB decomposes input feature maps into four subbands: one approximation component (LL) and three directional detail components (LH, HL, HH). These are then concatenated and compressed by a 1×1 convolution before being refined through a Depthwise-Separable Half Instance Normalization Block. This approach allows the network to capture both global color consistency and local fine textures—critical for underwater restoration where color attenuation mainly affects low-frequency structures while detail loss occurs in high-frequency textures.
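A one-level Haar decomposition like the one the WEB builds on can be written in a few lines of numpy. This is a generic textbook Haar transform, not the authors' implementation; it shows how a feature map splits into the LL approximation and the LH/HL/HH detail subbands at half resolution, and how the four subbands jointly preserve the original signal.

```python
import numpy as np

def haar_dwt2(x):
    """One-level 2D Haar transform: returns (LL, LH, HL, HH) subbands at half resolution."""
    # Pairwise averages (low-pass) and differences (high-pass) along rows...
    a = (x[0::2, :] + x[1::2, :]) / 2.0
    d = (x[0::2, :] - x[1::2, :]) / 2.0
    # ...then along columns, yielding the four subbands.
    LL = (a[:, 0::2] + a[:, 1::2]) / 2.0   # coarse structure / color
    LH = (a[:, 0::2] - a[:, 1::2]) / 2.0   # horizontal detail
    HL = (d[:, 0::2] + d[:, 1::2]) / 2.0   # vertical detail
    HH = (d[:, 0::2] - d[:, 1::2]) / 2.0   # diagonal detail
    return LL, LH, HL, HH

x = np.arange(64, dtype=float).reshape(8, 8)   # smooth ramp: detail energy is constant/zero
LL, LH, HL, HH = haar_dwt2(x)
print(LL.shape)
```

Unlike a global Fourier transform, each subband coefficient depends only on a 2x2 neighborhood, so enhancement applied to one subband stays spatially localized; summing the four coefficients at a given position recovers the corresponding original pixel.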
To address scattering-induced blurring, WWE-UIE incorporates a Sobel Gradient Fusion Block (SGFB) that explicitly preserves edge structures degraded by suspended particles. The module leverages Sobel operators to generate robust first-order directional gradients, which are then compressed through depthwise-separable convolution with sigmoid activation to produce a fine-grained gating map. This map provides pixel-level modulation highlighting edge regions, while a learnable scalar serves as a coarse-level controller balancing gradient-enhanced and baseline features. Compared to previous approaches like ReX-Net that integrate spatial-channel attention implicitly or SFGNet that uses multiple convolutional layers with additional parameters, SGFB delivers clearer edge enhancement with minimal overhead, producing sharper restoration under challenging scattering conditions.
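The gating mechanism can be sketched as follows: Sobel gradients yield an edge magnitude, a sigmoid turns it into a pixel-wise gate, and a scalar balances the gated and baseline paths. This is a simplified single-channel sketch; the paper applies depthwise-separable convolutions and learns the scalar, whereas here the convolution is naive and beta is fixed.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def conv2d(x, k):
    """Naive 3x3 cross-correlation with zero padding (for illustration only)."""
    p = np.pad(x, 1)
    out = np.zeros_like(x)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(p[i:i + 3, j:j + 3] * k)
    return out

def sobel_gate(feat, beta=0.5):
    """Edge-aware gating: a sigmoid of the Sobel gradient magnitude modulates the
    feature map, and a scalar beta (learnable in the paper, fixed here) balances
    the gradient-enhanced and baseline paths."""
    gx, gy = conv2d(feat, SOBEL_X), conv2d(feat, SOBEL_Y)
    mag = np.sqrt(gx ** 2 + gy ** 2)
    gate = 1.0 / (1.0 + np.exp(-mag))            # pixel-level gating map
    return beta * (gate * feat) + (1.0 - beta) * feat

# A vertical step edge: gating should boost pixels near the edge.
feat = np.zeros((8, 8))
feat[:, 4:] = 1.0
out = sobel_gate(feat, beta=0.5)
print(out.shape)
```

In flat regions the gradient magnitude is zero, so the gate settles to a constant and the feature passes through nearly unchanged; only edge pixels receive extra weight, which is what keeps the overhead and the risk of amplifying noise low.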
The researchers trained WWE-UIE using a composite loss function specifically designed for underwater imagery's heterogeneous degradations. Beyond standard Charbonnier and SSIM losses, they incorporated perceptual loss via VGG feature distance, edge-aware loss using Sobel gradient differences, and a novel HVI loss that improves color fidelity in a more stable color space. The HVI space decouples chromaticity and intensity for robust supervision, avoiding issues with HSV's red-hue discontinuity and CIELab's nonlinear hue distortions. This comprehensive supervision strategy, with carefully tuned weights (λ1=1, λ2=0.1, λ3=0.1, λ4=0.4, λ5=0.5), enables the model to address structural fidelity, perceptual quality, and color restoration simultaneously.
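As a rough illustration of how such a composite objective is assembled, the sketch below implements the Charbonnier term and a simple finite-difference stand-in for the edge-aware term, combined with the paper's reported weights (assuming the weights map to the losses in the order listed). The SSIM, VGG perceptual, and HVI terms are omitted and would be added to the sum analogously.

```python
import numpy as np

def charbonnier(pred, target, eps=1e-3):
    """Smooth L1 surrogate: sqrt((x - y)^2 + eps^2), averaged over pixels."""
    return np.mean(np.sqrt((pred - target) ** 2 + eps ** 2))

def edge_loss(pred, target):
    """L1 distance between horizontal/vertical finite differences; a simplified
    stand-in for the paper's Sobel-gradient edge-aware term."""
    dx = np.abs(np.diff(pred, axis=1) - np.diff(target, axis=1)).mean()
    dy = np.abs(np.diff(pred, axis=0) - np.diff(target, axis=0)).mean()
    return dx + dy

# Weighted composite: lambda1 = 1 (Charbonnier), lambda4 = 0.4 (edge-aware).
w_charb, w_edge = 1.0, 0.4
rng = np.random.default_rng(1)
pred = rng.uniform(size=(32, 32))
target = rng.uniform(size=(32, 32))
total = w_charb * charbonnier(pred, target) + w_edge * edge_loss(pred, target)
print(total > 0.0)
```

Note that the Charbonnier term bottoms out at eps rather than zero for a perfect prediction, which is exactly what gives it stable gradients near zero error.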
Extensive experiments on multiple benchmark datasets demonstrate WWE-UIE's impressive performance. On full-reference datasets including UIEB, LSUI, UFO-120, and EUVP, the model achieved PSNR scores up to 24.32 and SSIM up to 0.9196—competitive with or superior to nine state-of-the-art methods while using substantially fewer parameters (0.734M vs. competitors' 1.779M to 15.902M) and FLOPs (7.602G vs. 13.041G to 57.457G). A reduced variant (Ours-E) with lower embedding dimensions further decreased parameters to 0.471M and FLOPs to 3.603G with only marginal performance drops. On the real-world non-reference datasets Challenging-60 and U45, WWE-UIE achieved UCIQE scores up to 0.6110 and superior color correction, with CIEDE2000 scores as low as 10.0979 on the Color-Check7 dataset, confirming its advantage in restoring perceptually accurate colors.
Visual comparisons reveal WWE-UIE's practical superiority: the model restores more natural tones and sharper structures than competing methods under extreme conditions including dark, yellow, green, and blue color biases. Analysis in CIE 1931 xyY color space shows that WWE-UIE's distribution closely matches ground truth in both chromaticity and luminance, demonstrating effective correction of color casts and illumination imbalances. The framework also shows promising cross-domain potential, with exploratory experiments on low-light and foggy scenes yielding clearer textures and improved contrast, suggesting the prior-guided design may extend beyond underwater enhancement to broader restoration tasks.
Despite its strengths, WWE-UIE has limitations worth noting. The model's performance, while competitive, doesn't always achieve the absolute highest scores across all metrics—PhaseFormer, for instance, achieved slightly better PSNR on some datasets. The approach also relies on synthetic training data, which may not fully capture the complexity of real-world underwater environments. Additionally, while efficient, the model still requires GPU acceleration for real-time performance, potentially limiting deployment on extremely resource-constrained edge devices. Future work could explore further compression techniques, adaptation to more diverse underwater conditions, and integration with downstream vision tasks like object detection and segmentation.
The implications of this research extend beyond academic benchmarks to practical marine applications. By enabling high-quality underwater image enhancement with minimal computational cost, WWE-UIE could facilitate real-time autonomous navigation for underwater vehicles, improve marine ecological monitoring, and enhance infrastructure inspection capabilities. The framework's interpretable design—combining white balance correction, wavelet decomposition, and gradient-aware refinement—also provides valuable insights for the broader computer vision community working on image restoration in challenging environments. As underwater exploration and robotics continue to advance, efficient enhancement algorithms like WWE-UIE will play a crucial role in unlocking the visual potential of our oceans.
Reference: Ching-Heng Cheng, Jen-Wei Lee, Chia-Ming Lee, Chih-Chung Hsu, 2025, arXiv:2511.16321v1 [cs.CV]
About the Author
Guilherme A.
Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.