
Auditing geospatial datasets for biases: using global building datasets for disaster risk management
  • Caroline Gevaert,
  • Thomas Buunk,
  • Marc J C Van Den Homberg
Corresponding author: Caroline Gevaert


Biases have been demonstrated in a wide range of machine learning applications, yet such analysis is not yet widespread for geospatial datasets. This manuscript illustrates the importance of auditing geospatial datasets for biases, with a particular focus on disaster risk management applications, where a lack of local data may lead humanitarian actors to rely on global building datasets to estimate damage and plan the distribution of aid. It is important to ensure that these datasets do not under-represent vulnerable populations, so that they are not missed in the distribution of aid. This manuscript audits four global building datasets (Google Open Buildings, Microsoft Bing Maps Building Footprints, Overture Maps Foundation, and OpenStreetMap) for biases with regard to Relative Wealth Index (RWI), population density, urban/rural proportions, and building size in Tanzania and the Philippines. Dataset accuracies for these two countries are lower than expected. Google Open Buildings (with a confidence above 0.7) and OpenStreetMap demonstrated the best combinations of False Negative and False Discovery rates, though Google Open Buildings was more consistent across tiles. The equality of opportunity was lowest for the urban/rural proportions, whereas OpenStreetMap and Overture Maps Foundation displayed particularly low equality of opportunity for population density and RWI in Tanzania. These results demonstrate that there are biases in these geospatial datasets. The types of biases are not consistent across datasets or across the two study areas, which emphasizes the importance of auditing these datasets for biases before using them in new applications and study areas.
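As a rough illustration of the metrics named in the abstract, the sketch below computes a False Negative rate, a False Discovery rate, and an equality-of-opportunity score from building-match counts. The counts, the group names, and the `Counts` helper are hypothetical, and the ratio-based equality-of-opportunity score (lowest true positive rate divided by highest) is an assumed formulation that may differ from the one used in the manuscript.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    tp: int  # reference buildings that the dataset also contains
    fp: int  # dataset buildings with no match in the reference
    fn: int  # reference buildings missing from the dataset


def false_negative_rate(c: Counts) -> float:
    """Share of reference buildings that the dataset misses."""
    denom = c.tp + c.fn
    return c.fn / denom if denom else float("nan")


def false_discovery_rate(c: Counts) -> float:
    """Share of dataset buildings that do not exist in the reference."""
    denom = c.tp + c.fp
    return c.fp / denom if denom else float("nan")


def equality_of_opportunity(groups: dict) -> float:
    """Assumed formulation: lowest group TPR divided by highest group TPR.

    A value of 1.0 means every group is detected equally well.
    """
    tprs = [1.0 - false_negative_rate(c) for c in groups.values()]
    highest = max(tprs)
    return min(tprs) / highest if highest else float("nan")


# Hypothetical counts, for illustration only (not results from the manuscript).
groups = {
    "urban": Counts(tp=90, fp=10, fn=10),
    "rural": Counts(tp=60, fp=20, fn=40),
}

for name, counts in groups.items():
    print(name, false_negative_rate(counts), false_discovery_rate(counts))
print("equality of opportunity:", equality_of_opportunity(groups))
```

With these made-up counts, the rural group has a much higher False Negative rate than the urban group, which is the kind of disparity an audit of this sort is meant to surface.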
This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
11 Jan 2024 Submitted to TechRxiv
22 Jan 2024 Published in TechRxiv