Decomposing Satellite-Based Classification Uncertainties in Large Earth Science Datasets
preprintposted on 2022-02-01, 03:09 authored by Pedro OrtizPedro Ortiz, Marko OrescaninMarko Orescanin, Scott Powell, Veljko Petkovic, Benjamin Marsh
Collection of increasingly voluminous multi-spectral data from multiple instruments with high spatial resolution has posed both an opportunity and a challenge for maximizing their utilization, analysis, and impact. Obtaining accurate estimates of precipitation globally with high temporal resolution is crucial for assessing multi-scale hydrologic impacts and providing a constraint for development of numerical models of the atmosphere that provide weather and climate predictions. Precipitation type classification plays an important role in constraining both the inverse problem in satellite precipitation retrievals and latent heat transfer within weather prediction simulations. Precipitation type, however, is often reported deterministically, without uncertainty attached to an estimate. Machine learning techniques are capable of extracting content of interest from large datasets and accurately retrieving discrete and continuous properties of physical systems, but with limited insights to the retrieval components–such as errors and the physical relationship between the observed and retrieved properties. To address this shortcoming, we perform precipitation type classification to introduce a novel tool for decomposing errors of satellite-retrieved products. We use Bayesian neural networks to map Global Precipitation Measurement mission Microwave Imager observations to Dual-frequency Precipitation Radar-derived precipitation type, which perform comparably to deterministic models, but with the added benefit of providing well calibrated uncertainties. Through uncertainty decomposition, we demonstrate well calibrated uncertainties as useful for making decisions concerning high uncertainty predictions, model selection, targeted data analysis, and data collection and processing. Additionally, our Bayesian models enable mathematical confirmation of a data distribution change as the cause for an unacceptable decline in model accuracy.