🏢Data Mastery Challenge: Open-Source Urban Open Space Mapping
Click here for reproducible code of GEE data processing
Click here for workflow manual in GitHub Repository
🚨Task Force
Data handling skills are indispensable for quantitative geospatial analyses in a data-rich environment, where heterogeneous datasets require integration. When access to full research documentation, data sources, or code is limited, reproducing and replicating quantitative (geographic information) science becomes a typical challenge for scientists and analysts. This 'challenge' proposes to reproduce research on mapping spatial patterns of open spaces for the city of Kampala, Uganda (see https://doi.org/10.3390/rs12071144).
🎯Aims
Create a replicable and open-source workflow to map urban open spaces using freely available data.
To develop a conceptual analysis workflow for retrieving data sources and assessing the interoperability of datasets;
To set up an appropriate data and code sharing mechanism;
To address SDG indicators related to urban dynamics, in particular open spaces;
To quantify urban dynamics by integrating datasets for one year, including pre-processing, filtering and quality check;
To include some new and interesting data as well as recent algorithms for land cover classification.
🛠️ Approach
Conceptual workflow for (1) collecting OSM samples for classification training and validation, and (2) mapping urban areas using Google Earth Engine and Sentinel-2 imagery.
Data management and sharing plan for project collaboration on GitHub, guiding users on how to use, understand and reproduce the workflow.
Evaluation of project implementation, from OSM data acquisition and pre-processing to Sentinel-2 processing and image classification.
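As a minimal sketch of step (1), the snippet below builds an Overpass QL query for candidate training polygons. The tag-to-class mapping, the bounding box, and the function name are illustrative assumptions, not the project's actual sampling scheme.

```python
# Hypothetical mapping from land cover classes to OSM tag filters
# (an assumption for illustration, not the project's actual scheme).
CLASS_TAGS = {
    "green_space": [("leisure", "park"), ("landuse", "forest")],
    "water_bodies": [("natural", "water")],
}

# Rough bounding box around Kampala as (south, west, north, east); illustrative only.
KAMPALA_BBOX = (0.20, 32.50, 0.42, 32.70)


def build_overpass_query(tags, bbox, timeout=60):
    """Return an Overpass QL query selecting ways that match any (key, value) tag."""
    south, west, north, east = bbox
    area = f"({south},{west},{north},{east})"
    filters = "\n".join(f'  way["{key}"="{value}"]{area};' for key, value in tags)
    return f"[out:json][timeout:{timeout}];\n(\n{filters}\n);\nout geom;"


if __name__ == "__main__":
    print(build_overpass_query(CLASS_TAGS["green_space"], KAMPALA_BBOX))
```

The resulting string could be sent to an Overpass API endpoint. The returned geometries would still need pre-processing and quality checks before being used as training or validation samples.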
✅Success
Random forest classification for quantifying open spaces, including Gray Spaces and Green Spaces.
Acceptable land cover classification accuracy for training (96%) and validation (74%).
Water Bodies has the highest classification accuracy (96%), while Gray Space has the lowest (60%).
Computation of the BuiltOpen index (2.57), the ratio of the total built-up area to the open-space area.
Open, replicable and reproducible workflow based on Python, GEE, QGIS, GitHub, OSM Overpass API and Sentinel-2 imagery.
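The accuracy figures and the BuiltOpen index above can be illustrated with a short sketch. It assumes that per-class accuracy means producer's accuracy (correct samples divided by reference samples of that class), which the summary does not state. The function names and the toy numbers are made up for illustration and are not project results.

```python
def producers_accuracy(confusion, class_index):
    """Share of reference samples of one class that were classified correctly.

    confusion[i][j] counts samples of reference class i predicted as class j.
    """
    row_total = sum(confusion[class_index])
    return confusion[class_index][class_index] / row_total if row_total else 0.0


def overall_accuracy(confusion):
    """Correctly classified samples divided by all samples."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total if total else 0.0


def built_open_index(built_up_area, open_space_area):
    """Ratio of total built-up area to open-space area (same units for both)."""
    if open_space_area <= 0:
        raise ValueError("open_space_area must be positive")
    return built_up_area / open_space_area


if __name__ == "__main__":
    # Made-up 3-class confusion matrix (rows: reference, columns: predicted).
    toy = [[8, 1, 1], [2, 6, 2], [0, 1, 9]]
    print(overall_accuracy(toy))
    print(producers_accuracy(toy, 1))
    # Made-up areas in square kilometres.
    print(built_open_index(30.0, 12.0))
```

An index above 1 means that built-up land covers more area than open space. The reported value of 2.57 corresponds to roughly two and a half times as much built-up area as open space.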
🪞 Reflection
Google Earth Engine provides high-performing cloud processing of satellite imagery.
OSM provides easy access to land cover-land use training data.
The open-source workflow is easily reproducible, from training sample acquisition to image classification.
Crowd-sourced OSM data is subject to human errors.
Cloud cover hinders the global application of Sentinel-2 imagery to land cover classification.
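To illustrate the cloud-cover limitation, the sketch below decodes the Sentinel-2 QA60 quality band, in which bit 10 flags opaque clouds and bit 11 flags cirrus. This summary does not describe the project's actual cloud handling, so a QA60-based mask is shown only as one common option. The function names are hypothetical.

```python
# Sentinel-2 QA60 band: bit 10 flags opaque clouds, bit 11 flags cirrus.
OPAQUE_CLOUD_BIT = 1 << 10
CIRRUS_BIT = 1 << 11


def is_clear(qa60_value):
    """True when neither the opaque-cloud nor the cirrus bit is set."""
    return (qa60_value & (OPAQUE_CLOUD_BIT | CIRRUS_BIT)) == 0


def clear_fraction(qa60_values):
    """Fraction of pixels flagged as cloud-free in a flat sequence of QA60 values."""
    values = list(qa60_values)
    if not values:
        return 0.0
    return sum(is_clear(v) for v in values) / len(values)


if __name__ == "__main__":
    # Made-up QA60 values for four pixels: clear, opaque cloud, cirrus, clear.
    print(clear_fraction([0, 1024, 2048, 0]))
```

A low clear fraction over the study period leaves few usable pixels for compositing, which is why persistently cloudy regions are harder to classify from optical imagery alone.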