Colombo, P., Miller, C., Yang, X., O'Donnell, R., Maranzano, P. (2025). Warped multifidelity Gaussian processes for data fusion of skewed environmental data. Journal of the Royal Statistical Society Series C: Applied Statistics [10.1093/jrsssc/qlaf003].
Warped multifidelity Gaussian processes for data fusion of skewed environmental data
Maranzano, Paolo
2025
Abstract
Understanding the dynamics of climate variables is critical for sectors such as energy and environmental monitoring. This study addresses the pressing need for accurate mapping of environmental variables in national or regional monitoring networks, a challenge exacerbated by skewed data and large gaps. Although it may not be immediately apparent, managing skewness across multiple data sources introduces additional complexity, because conventional transformation methods often fail to normalize the data or to preserve inter-dataset relationships. Furthermore, the literature highlights that interpolation uncertainty is closely linked to interpolation distance, which makes large gaps particularly difficult to handle. To tackle these challenges, we propose a novel data fusion approach: the warped multifidelity Gaussian process (WMFGP). This method predicts time-series data from multiple sources with varying reliability and resolution, while addressing skewness and demonstrating partial independence from interpolation distance. Through extensive simulation experiments, we explore both the strengths and the limitations of the method. Additionally, as a case study, we apply WMFGP to wind speed data from the network of the Agenzia regionale per la protezione ambientale (ARPA) Lombardia, a regional environmental agency in Italy. Our results demonstrate the efficacy of WMFGP in filling large gaps in wind speed data, providing more accurate predictions that are essential for air quality forecasting and network maintenance.

File | Size | Format |
---|---|---|
Colombo-2025-JRSSC-VoR.pdf (open access; attachment type: Publisher's Version (Version of Record, VoR); license: Creative Commons) | 1.22 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.