Robust Evaluation of Neural Networks Trained on the OpenContrails Dataset

mariaper873877d087

2 months ago

Our latest study, “Robust Evaluation of Neural Networks Trained on the OpenContrails Dataset,” has been published in IEEE Transactions on Geoscience and Remote Sensing.

In this work, Irene Ortiz Abuja examines the limitations of current contrail detection models trained on geostationary satellite data and discusses the implications for real-world use and future dataset development.

Alongside this research, Irene is developing ContrAI, an open-source library that aims to bring together essential tools for processing contrails in satellite data. An early version of the library is already available, with more functionality coming soon.

Robust Evaluation of Neural Networks Trained on the OpenContrails Dataset. Irene Ortiz, Javier García-Heras, Amin Jafarimoghaddam, Manuel Soler. IEEE Transactions on Geoscience and Remote Sensing, Volume 63, November 2025, DOI: 10.1109/TGRS.2025.3629628

Abstract

Aviation contributes significantly to global warming through both CO2 and non-CO2 emissions, with persistent contrails and aviation-induced cloudiness acting as major drivers of radiative forcing. Accurate identification of these phenomena is therefore critical for assessing aviation’s environmental impact. In this work, we train six neural network architectures on the leading OpenContrails dataset and systematically evaluate their ensemble performance across both segmentation and detection tasks. For segmentation, we address the sensitivity to pixel-level annotation noise by introducing the boundary soft (BS γβ) evaluation framework, which incorporates spatial tolerance via smoothing parameters γ and β. With γ=β=1, our ensemble achieves a global dice score (GDS) of 81.25%, increasing to 87.26% under a more relaxed setting. We also estimate a theoretical GDS upper bound of 88%, based on interannotator disagreement, indicating that current models are nearing the dataset’s performance ceiling. This insight helps explain the plateau in segmentation performance observed in recent literature and underscores the need to prioritize enhancements in data quality, annotation consistency, and evaluation methodologies over further architectural refinements. In terms of detection, our ensemble identifies 93% of target contrail features with a false positive (FP) rate below 3%. Key challenges arise in scenes containing thick ice clouds, small and isolated contrail segments, and aged, diffused contrails. To address the latter, we propose the optical flow correction (OFC) algorithm, a postprocessing step for the detection of aged contrails and their distinction from natural cirrus. We also outline potential strategies to mitigate the other observed limitations. Overall, this study offers a solid foundation for identifying current challenges and advancing the development of more effective contrail detection methods.
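For readers unfamiliar with the global dice score (GDS) quoted in the abstract, here is a minimal sketch of how such a score is usually computed. It assumes the common definition in which pixels from all images are pooled before a single Dice ratio is taken. It is not the paper's BS γβ framework, which adds spatial tolerance and is not reproduced here. The function name `global_dice` and the toy masks are made up for illustration.

```python
from typing import Iterable, Sequence

Mask = Sequence[Sequence[int]]  # 2D binary mask, 1 = contrail pixel


def global_dice(preds: Iterable[Mask], targets: Iterable[Mask]) -> float:
    """Dice score computed over the pooled pixels of all image pairs."""
    intersection = 0
    pred_total = 0
    target_total = 0
    for pred, target in zip(preds, targets):
        for pred_row, target_row in zip(pred, target):
            for p, t in zip(pred_row, target_row):
                intersection += p & t  # pixel marked in both masks
                pred_total += p
                target_total += t
    denominator = pred_total + target_total
    # Assumed convention: if nothing is predicted and nothing is labeled,
    # the masks agree perfectly.
    return 1.0 if denominator == 0 else 2.0 * intersection / denominator


# Toy example: two 2x3 images with hypothetical predictions and labels.
preds = [
    [[1, 1, 0], [0, 0, 0]],
    [[0, 0, 0], [0, 1, 0]],
]
targets = [
    [[1, 0, 0], [0, 0, 0]],
    [[0, 0, 0], [0, 1, 1]],
]
score = global_dice(preds, targets)
print(f"{score:.3f}")
```

Because the counts are pooled before dividing, images with no contrails do not each contribute their own score. This matters for a dataset in which many scenes contain no contrail at all.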
