About Geospatial Data
One of the GDL's most recent projects, still in progress, is the development of a geospatial indicators dataset. The goal is to create indicators on topics related to climate, environment, and population that are spatially linked to the 1,803 GDL sub-national regions. The raw data for these indicators are extracted from freely available global datasets and made accessible to users through the Geospatial Data dashboard on the GDL website. Users can also access the data through the R library gdldata, provided by our team.
These indicators, commonly referred to as covariates in empirical research, present new opportunities for research, education, and international cooperation. They offer valuable insights and enrich our understanding of geospatial measures of weather and climate change worldwide. Demand for such data is steadily increasing among students and the international research community. By developing spatial indicators and integrating them into the GDL infrastructure, we aim to meet this demand. Beyond research and education, these indicators enable individuals, including those with limited GIS expertise, to engage effectively in geospatial analyses.
Background
As a first step, we created and mapped annual temperature, precipitation, and relative humidity data for the GDL sub-national regions for the period 1990 to 2022[1]. These variables were constructed from ERA5 raster datasets produced by the European Centre for Medium-Range Weather Forecasts (ECMWF). ERA5 is an atmospheric reanalysis, a process that integrates satellite-based measurements, ground-based observations, and modeling. It combines these inputs with the laws of physics through data assimilation, similar to the methodology used by numerical weather prediction centers. ERA5 reanalysis data are widely recognized for their high quality, fine spatial and temporal resolution, and wide range of atmospheric variables.
As the next step, we expanded our dataset to include annual total CO2[2] and PM2.5[3] emissions, constructed using raster datasets from the Emissions Database for Global Atmospheric Research (EDGAR). EDGAR is a multipurpose, independent, global database of anthropogenic emissions of greenhouse gases and air pollutants. Its emission estimates are based on international statistics and a consistent IPCC methodology.
To ensure consistency, we processed the weather and emissions data in a similar way. We obtained monthly ERA5 estimates for surface temperature, dew point temperature, and total precipitation at a spatial resolution of 0.25° x 0.25° (roughly 28 x 28 km at the equator). We then used surface temperature and dew point temperature to calculate relative humidity. Annual variables were constructed by aggregating cell values from the monthly raster datasets. For emissions data, we used EDGAR annual emissions grid maps at a resolution of 0.1° x 0.1° (roughly 11 x 11 km at the equator). The resulting annual datasets were resampled to a spatial resolution of 2 x 2 km, re-projected to a uniform coordinate reference system, linked to the GDL shapefile of sub-national regions, and then aggregated to the GDL regional level using scripts written in the R programming language.
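The arithmetic behind these steps can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the R code used to build the GDL indicators. Several things are assumptions rather than details from our pipeline: the Magnus approximation and its coefficients for relative humidity, temperatures given in kelvin, a simple mean over twelve months, and latitude (cosine) weighting of grid cells as a stand-in for the resampling and re-projection described above. All function names are hypothetical.

```python
import math

# Magnus coefficients: an assumption, the text does not name the formula used.
MAGNUS_A = 17.625
MAGNUS_B = 243.04  # degrees Celsius


def saturation_vapour_pressure(temp_c):
    """Approximate saturation vapour pressure (hPa) at temp_c degrees Celsius."""
    return 6.1094 * math.exp(MAGNUS_A * temp_c / (MAGNUS_B + temp_c))


def relative_humidity(temp_k, dewpoint_k):
    """Relative humidity (%) from temperature and dew point, both in kelvin."""
    t_c = temp_k - 273.15
    td_c = dewpoint_k - 273.15
    ratio = saturation_vapour_pressure(td_c) / saturation_vapour_pressure(t_c)
    return min(100.0 * ratio, 100.0)  # cap at saturation


def annual_mean(monthly_values):
    """Unweighted mean of twelve monthly values for one grid cell."""
    if len(monthly_values) != 12:
        raise ValueError("expected 12 monthly values")
    return sum(monthly_values) / 12


def regional_mean(cells):
    """Latitude-weighted mean of (latitude_deg, value) pairs inside one region.

    Cells on a regular lat/lon grid shrink towards the poles, so each cell is
    weighted by cos(latitude). This replaces the resample-and-reproject step
    with a simpler approximation.
    """
    weights = [math.cos(math.radians(lat)) for lat, _ in cells]
    total = sum(w * value for w, (_, value) in zip(weights, cells))
    return total / sum(weights)


if __name__ == "__main__":
    # Hypothetical cell: 20 degrees C air temperature, 10 degrees C dew point.
    print(round(relative_humidity(293.15, 283.15), 1))
    # Hypothetical region with two cells at different latitudes.
    print(round(regional_mean([(0.0, 10.0), (60.0, 20.0)]), 2))
```

In this sketch, a cell at 60° latitude counts about half as much as a cell at the equator, which is why a weighted rather than a simple mean is used when aggregating on an unprojected grid.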
Future work
The combination of GDL and spatial indicators has proven fruitful, particularly in education and climate research. For example, our BA and MA students have combined the GDL area database with the geospatial indicators to address important research questions, including estimating the impacts of climate change on food production, inequality, and health outcomes in Africa. This diversity of real-world applications highlights the breadth of challenges we are addressing and underscores the relevance of our work for the Global South. We are currently exploring further extensions to our spatial indicators. We plan to incorporate a broader range of covariates and to expand the temporal scope of our data, offering annual, monthly, or even daily intervals, tailored to specific research demands. We are also investigating further spatial matches, including nightlight luminosity, 3G network expansion, Google Trends, PRIO conflict data, the EM-DAT disaster database, and firm-level data. We anticipate that these extensions will significantly strengthen our educational, research, and internationalization efforts.
References
[1] Hersbach, H., Bell, B., Berrisford, P., Biavati, G., Horányi, A., Muñoz Sabater, J., Nicolas, J., Peubey, C., Radu, R., Rozum, I., Schepers, D., Simmons, A., Soci, C., Dee, D., Thépaut, J-N. (2023): ERA5 monthly averaged data on single levels from 1940 to present. Copernicus Climate Change Service (C3S) Climate Data Store (CDS), DOI: 10.24381/cds.f17050d7 (Accessed on 08-11-2023)
[2] Crippa, M., Guizzardi, D., Pagani, F., Banja, M., Muntean, M., Schaaf, E., Becker, W., Monforti-Ferrario, F., Quadrelli, R., Risquez Martin, A., Taghavi-Moharamli, P., Köykkä, J., Grassi, G., Rossi, S., Brandao De Melo, J., Oom, D., Branco, A., San-Miguel, J., Vignati, E. (2023): GHG emissions of all world countries. Publications Office of the European Union, Luxembourg, DOI: 10.2760/953322, JRC134504.