Datacubes in GIS

Mark Altaweel

Datacubes (or data cubes) are data structures in which data are stored in multidimensional arrays (n-D arrays) that have one or more spatial or temporal dimensions. Datacubes provide an effective way to analyse spatiotemporal data, where the data can incorporate both raster and vector data along with potentially other information.

One way to think of a datacube is as data that represent specific regions or areas, with timestamps added and different data formats combined. The data do not need to follow a fixed set of dimensions; the number, shape, and arrangement of dimensions can be designed for a given project or application.[1]

Datacubes typically have dimensions with properties common to different formats of spatial data. Typical properties include: name, axis/number, type of data, extent or nominal dimension labels, reference system or projection, and resolution.
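The dimension properties listed above can be sketched in a few lines of Python. This is only an illustration: the `Dimension` class, its field names, and the sample labels are assumptions made for this example and do not follow any particular datacube library.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Union

Label = Union[int, float, str]


@dataclass(frozen=True)
class Dimension:
    """One axis of a datacube; field names are illustrative, not a standard."""
    name: str                                # e.g. "x", "y", "t", "bands"
    axis: int                                # position of the axis in the array
    kind: str                                # "spatial", "temporal", "bands" or "other"
    labels: Sequence[Label]                  # extent or nominal labels along the axis
    reference_system: Optional[str] = None   # e.g. an EPSG code or a calendar
    resolution: Optional[float] = None       # step between labels, if regular


def cube_shape(dimensions):
    """Return the array shape implied by the dimensions, ordered by axis."""
    ordered = sorted(dimensions, key=lambda d: d.axis)
    return tuple(len(d.labels) for d in ordered)


# Hypothetical cube: three monthly time steps over a 2 x 4 grid.
dims = [
    Dimension("t", 0, "temporal", ["2020-01-01", "2020-02-01", "2020-03-01"], "Gregorian"),
    Dimension("y", 1, "spatial", [50.5, 50.0], "EPSG:4326", 0.5),
    Dimension("x", 2, "spatial", [4.0, 4.5, 5.0, 5.5], "EPSG:4326", 0.5),
]
shape = cube_shape(dims)
print(shape)
```

Keeping the labels and reference system next to each axis is what lets software interpret an array index as a place and a date, not just a position.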

Benefits of Datacubes for Geospatial Analysis

Datacubes provide many benefits for a variety of analyses. The dimension properties are stored as either numbers or strings (i.e., text). Additional information, such as the CRS or sensor details, can be turned into a dimension attribute. Data can be stored as scalar values as well as along dimensions.

Earth observation data cube and its dimensional axes. Figure: Kopp, S., Becker, P., Doshi, A., Wright, D. J., Zhang, K., & Xu, H. (2019). Achieving the full vision of earth observation data cubes. Data, 4(3), 94. CC BY 4.0.

Increasingly, Earth observation (EO) data are stored as datacubes, which assists the analytical process. In particular, as satellite data often have temporal attributes, datacubes are an ideal way to store and process data for different analyses, including machine learning algorithms and statistical procedures that rely on matrix algebra.
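As a rough illustration of why the temporal dimension matters for analysis, the sketch below collapses the time axis of a tiny cube into a single grid. The nested-list layout `cube[t][y][x]`, the function name `reduce_time`, and the values are all assumptions made for this example; real EO workflows would normally use an array library instead.

```python
from statistics import mean

# A tiny cube indexed as cube[t][y][x]; the values are invented for illustration.
cube = [
    [[10, 12], [14, 16]],   # t = 0
    [[12, 14], [16, 18]],   # t = 1
    [[14, 16], [18, 20]],   # t = 2
]


def reduce_time(cube, reducer=mean):
    """Collapse the time axis, returning one 2-D grid of reduced values."""
    n_rows = len(cube[0])
    n_cols = len(cube[0][0])
    return [
        [reducer([time_slice[y][x] for time_slice in cube]) for x in range(n_cols)]
        for y in range(n_rows)
    ]


temporal_mean = reduce_time(cube)               # per-cell mean over time
temporal_max = reduce_time(cube, reducer=max)   # per-cell maximum over time
print(temporal_mean)
print(temporal_max)
```

The same pattern applies to any reducer that takes the list of values a cell holds through time, such as a median or a count of valid observations.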




Multiple satellite data types as well as temporal datasets can be stored within datacubes for given regions. Datacubes are also increasingly applied in cloud-based systems, where combining spatial and temporal information makes the storage, distribution, and association of related data easier. This has the benefit of grouping analyses together, with data bundled and processed in distributed calculations for larger datasets.[2]
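The idea of bundling data for distributed calculation can be sketched as follows. This is a minimal example under stated assumptions: the cube is a nested list `cube[t][y][x]`, it is split along the time axis only, and a local thread pool stands in for the workers of a cloud system. The function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_time_axis(cube, chunk_size):
    """Split cube[t][y][x] into consecutive blocks of time slices."""
    return [cube[i:i + chunk_size] for i in range(0, len(cube), chunk_size)]


def partial_sum_and_count(chunk):
    """Per-chunk work: sum every cell value and count the cells."""
    values = [v for time_slice in chunk for row in time_slice for v in row]
    return sum(values), len(values)


def chunked_mean(cube, chunk_size=2, max_workers=2):
    """Compute the overall mean by combining per-chunk partial results."""
    chunks = chunk_time_axis(cube, chunk_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = list(pool.map(partial_sum_and_count, chunks))
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


# Invented cube with 5 time steps on a 2 x 3 grid.
cube = [[[t + y + x for x in range(3)] for y in range(2)] for t in range(5)]
overall_mean = chunked_mean(cube)
print(overall_mean)
```

Because each chunk returns only a sum and a count, the workers never need to see the whole cube, which is the property that makes this kind of calculation practical for large datasets.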

Datacubes Enable Easier Management of Geospatial Data

A benefit of datacubes is that they enable easier updates and modifications to existing regional data. Increasingly, programmes based on spatial online analytical processing (SOLAP) are being developed to process and deploy datacubes for dynamic maps and even charts and tables containing useful information about given imagery.

Additionally, SOLAP-based programmes can be created to handle data at a meta-level, where data storage as well as the handling of datasets are controlled by the software.[3] In effect, this makes datacubes efficient both as a data model for analysis and as a storage tool that facilitates the display of information.
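A typical operation in online analytical processing is a roll-up, where values are aggregated to a coarser level of a dimension. The sketch below rolls monthly values up to years, with or without the region dimension. The records, the `roll_up` function, and its parameters are invented for illustration and are not taken from any specific tool.

```python
from collections import defaultdict

# Hypothetical facts: (month label, region, snow-covered area in km^2).
facts = [
    ("2016-01", "north", 120), ("2016-02", "north", 100),
    ("2016-01", "south", 80), ("2017-01", "north", 90),
    ("2017-02", "south", 60), ("2017-03", "south", 40),
]


def roll_up(facts, time_level="year", keep_region=True):
    """Sum values after coarsening the time labels and optionally dropping region."""
    totals = defaultdict(int)
    for month, region, value in facts:
        # "YYYY-MM" becomes "YYYY" when rolling up to the year level.
        time_key = month[:4] if time_level == "year" else month
        key = (time_key, region) if keep_region else (time_key,)
        totals[key] += value
    return dict(totals)


by_year_region = roll_up(facts)
by_year = roll_up(facts, keep_region=False)
print(by_year_region)
print(by_year)
```

A dynamic map or chart can then be driven by whichever level of aggregation the user selects, without changing the underlying data.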

The Swiss Data Cube; viewer showing snow cover change over Switzerland between 1995 and 2017. Figure: Giuliani, G., Masó, J., Mazzetti, P., Nativi, S., & Zabala, A. (2019). Paving the way to increased interoperability of earth observations data cubes. Data, 4(3), 113. CC BY 4.0

Datacubes are also useful because they can combine many types of information, bringing together spatial and temporal data from different domains. For instance, if data are collected about habitats and different biological species, such information can be wrapped along with the spatial and temporal dimensions. This provides automated and analytical ways to update maps of interest; habitat change, for example, can be mapped over time.[4]
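Mapping habitat change over time can be sketched by comparing two time slices of a classified cube. The habitat classes, the years, and the `habitat_transitions` function below are assumptions made for this example, not data or methods from the cited study.

```python
from collections import Counter

# Hypothetical habitat classes on a 2 x 3 grid, keyed by year.
habitat_cube = {
    2015: [["forest", "forest", "wetland"], ["grass", "forest", "wetland"]],
    2020: [["forest", "grass", "wetland"], ["grass", "grass", "urban"]],
}


def habitat_transitions(cube, start, end):
    """Count (from_class, to_class) pairs for cells whose class changed."""
    changes = Counter()
    for row_start, row_end in zip(cube[start], cube[end]):
        for before, after in zip(row_start, row_end):
            if before != after:
                changes[(before, after)] += 1
    return changes


transitions = habitat_transitions(habitat_cube, 2015, 2020)
print(transitions)
```

Because every year shares the same grid, adding a new time slice to the cube is enough to extend the comparison to a new period.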

The intent of datacubes is to facilitate data transfer, analysis, and storage. GIS software is taking advantage of datacube formats, and datacubes are now applied in a variety of tools, with both open and proprietary formats available.[5]

Datacubes not only make dynamic mapping easier but could also be a key route to data interoperability. For instance, storing information from different satellites in datacube dimensions could make it easier to share information containing different EO data. Particularly for EO, datacubes have become a key way in which users interact with large datasets containing varied imagery.[6]
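One simple way to picture this kind of interoperability is to place data from two sensors along a shared time dimension, keeping only the dates both sensors cover. The sketch below assumes each sensor's data are stored as a dictionary from date label to grid; the sensor names, dates, and values are invented, and real systems would also need to reconcile projection and resolution.

```python
# Hypothetical per-date grids from two sensors; labels and values are invented.
sensor_a = {"2021-06-01": [[1, 2], [3, 4]], "2021-06-11": [[2, 3], [4, 5]]}
sensor_b = {"2021-06-11": [[9, 8], [7, 6]], "2021-06-21": [[8, 7], [6, 5]]}


def stack_on_shared_dates(cubes):
    """Combine named cubes into cube[date][sensor], keeping shared dates only."""
    shared = set.intersection(*(set(c) for c in cubes.values()))
    return {
        date: {name: cube[date] for name, cube in cubes.items()}
        for date in sorted(shared)
    }


combined = stack_on_shared_dates({"sensor_a": sensor_a, "sensor_b": sensor_b})
print(list(combined))
```

The sensor name acts here as an extra dimension, so a user can request one date and receive every available observation for it.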

Furthermore, datacubes are applicable to all types of information, enabling non-spatial data to be integrated as part of data sharing and analysis. With varied data sources, including satellite, UAV, and other spatial data, increasingly made available, we may expect datacubes to become among the most common ways to integrate and utilise such data together in large analytical projects, particularly where large datasets are needed for dynamic mapping and analysis.

References

[1]    For general background on datacubes, see: https://openeo.org/documentation/1.0/datacubes.html#apply.

[2]    For more on how datacubes could be applied, particularly to satellite data, see: Appel, M., & Pebesma, E. (2019). On-Demand Processing of Data Cubes from Satellite Image Collections with the gdalcubes Library. Data, 4(3), 92. https://doi.org/10.3390/data4030092.

[3]    For more on SOLAP-type work, see: Kasprzyk, J.-P., & Devillet, G. (2021). A Data Cube Metamodel for Geographic Analysis Involving Heterogeneous Dimensions. ISPRS International Journal of Geo-Information, 10(2), 87. https://doi.org/10.3390/ijgi10020087.

[4]    For more on using datacubes for habitat mapping, see: Agrillo, E., Filipponi, F., Pezzarossa, A., Casella, L., Smiraglia, D., Orasi, A., et al. (2021). Earth Observation and Biodiversity Big Data for Forest Habitat Types Classification and Mapping. Remote Sensing, 13(7), 1231. https://doi.org/10.3390/rs13071231.

[5]    For an open format for datacubes, see: https://datacube-core.readthedocs.io/en/latest/.

[6]    For more on the importance of datacubes for EO datasets, see: Giuliani, G., Masó, J., Mazzetti, P., Nativi, S., & Zabala, A. (2019). Paving the Way to Increased Interoperability of Earth Observations Data Cubes. Data, 4(3), 113. https://doi.org/10.3390/data4030113.

About the author
Mark Altaweel
Mark Altaweel is a Reader in Near Eastern Archaeology at the Institute of Archaeology, University College London, having held previous appointments and joint appointments at the University of Chicago, University of Alaska, and Argonne National Laboratory. Mark has an undergraduate degree in Anthropology and Masters and PhD degrees from the University of Chicago’s Department of Near Eastern Languages and Civilizations.