GIS and Semantics: Enabling the Discoverability of Data

Mark Altaweel

The semantic web is an approach that facilitates the search and application of data based on meaning, across different websites and applications, using common standards. Similar techniques have been developed for GIS, where finding and applying geospatial data may require protocols and standards that make the large variety of data easier to search.

With the plethora of GIS data now available, one can conceptualize GIS as a consolidated information infrastructure. The scale, complexity, and interoperability demands of these data, which are often created and structured in very different ways, require semantic ontologies that facilitate data transfer and use as demand for spatial services grows.

Semantic Models in GIS

Semantic ontologies have been designed to give a clearer understanding of the spatial bounds and context of geospatial data.[1] This includes defining space not only by its physical bounds but potentially also in more abstract terms.

Different data models are increasingly employed, which means data must be viewed and understood in multiple ways and across multiple dimensions.

Semantic models require GIS tools that can manage these variations. Semantic similarity measurements can be used to find relevant geographies, for example by applying SIM-DL, which compares the similarity between stored concepts based on geographic feature types.[2]

One current example is the OSM Semantic Network, which is used to find similar or comparable places in OpenStreetMap.[3]
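
To make the idea of semantic similarity between feature types more concrete, here is a minimal sketch in Python. It is not SIM-DL and not the OSM Semantic Network: it uses a plain Jaccard overlap between sets of descriptive properties, and the feature types, property names, and function names are all hypothetical values chosen for illustration.

```python
# Hypothetical feature types, each described by a set of properties.
# Real systems derive such descriptions from an ontology or tag vocabulary.
FEATURE_TYPES = {
    "river": {"waterbody", "flowing", "natural", "linear"},
    "canal": {"waterbody", "flowing", "artificial", "linear"},
    "lake": {"waterbody", "standing", "natural", "areal"},
    "road": {"transport", "artificial", "linear"},
}


def jaccard(a, b):
    """Share of properties two feature types have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_similar(query, catalog=FEATURE_TYPES):
    """Return the other feature types, most similar to `query` first."""
    query_props = catalog[query]
    scores = [
        (name, jaccard(query_props, props))
        for name, props in catalog.items()
        if name != query
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank_similar("river"):
        print(f"{name}: {score:.2f}")
```

With these illustrative property sets, a search for "river" ranks "canal" ahead of "lake" and "road", because it shares the most properties with the query. A description-logic approach such as SIM-DL compares much richer concept definitions, but the ranking step works on the same principle.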

Eight different scenarios of change on a feature. From: Harbelot, Arenas, and Cruz, 2013.

While defined and ill-defined geographies can both pose a challenge, spatial-temporal data in GIS are another, and they are particularly difficult for traditional databases such as relational databases.

So-called “continuum” models are one way in which parent-child relationships can be assigned for spatial-temporal data, such that maps can be dynamically updated based on a scalable search that potentially uses different data models.[4]
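
The parent-child idea can be sketched with a small data structure. The following Python example is an assumption-laden illustration, not the schema of the Continuum model itself: the class name, field names, and the land-parcel example are hypothetical. Each record is one time-stamped state of a feature, and its `parents` list records which earlier states it derives from, for instance after a split or a merge.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FeatureVersion:
    """One time-stamped state of a spatial feature (hypothetical structure)."""
    version_id: str
    valid_from: int            # year this state begins
    valid_to: Optional[int]    # year this state ends; None if still current
    area_km2: float
    parents: List[str] = field(default_factory=list)  # ids it derives from


def versions_at(versions, year):
    """States that are valid in the given year, i.e. the map for that year."""
    return [
        v for v in versions
        if v.valid_from <= year and (v.valid_to is None or year < v.valid_to)
    ]


def lineage(versions, version_id):
    """All ancestor ids of a state, found by following parent links."""
    by_id = {v.version_id: v for v in versions}
    seen, stack = [], list(by_id[version_id].parents)
    while stack:
        parent_id = stack.pop()
        if parent_id not in seen:
            seen.append(parent_id)
            stack.extend(by_id[parent_id].parents)
    return seen


# Parcel A splits into B and C in 2010; C and D merge into E in 2015.
PARCELS = [
    FeatureVersion("A", 2000, 2010, 10.0),
    FeatureVersion("B", 2010, None, 4.0, parents=["A"]),
    FeatureVersion("C", 2010, 2015, 6.0, parents=["A"]),
    FeatureVersion("D", 2005, 2015, 3.0),
    FeatureVersion("E", 2015, None, 9.0, parents=["C", "D"]),
]

if __name__ == "__main__":
    print([v.version_id for v in versions_at(PARCELS, 2012)])
    print(lineage(PARCELS, "E"))
```

Filtering by year gives the set of features to draw for that moment, while following parent links answers questions such as which earlier parcels a current parcel came from. A relational table can store the same records, but the recursive lineage query is where such databases tend to become awkward.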

Semantic Kriging Techniques for Poor Quality GIS Data

However, one of the bigger challenges that current research focuses on is geospatial data that have inherent errors or missing information. A search may yield a useful result, but the data returned may have structural or quality problems.

One way to address data quality is semantic kriging, which combines semantic association with ordinary kriging to interpolate missing values in a semantically identified dataset.[5]
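
The general idea can be sketched as follows. This Python example solves a standard ordinary kriging system, with one simplifying assumption added for illustration: the spatial covariance between two locations is scaled by a similarity score between their land-cover classes. It is not the published semantic kriging formulation in [5]. The similarity scores, the exponential covariance model and its parameters, the sample data, and all function names are hypothetical.

```python
import math

# Hypothetical similarity scores between land-cover classes (illustrative only).
PAIR_SIMILARITY = {
    frozenset(["forest", "grassland"]): 0.6,
    frozenset(["forest", "water"]): 0.2,
    frozenset(["grassland", "water"]): 0.3,
}


def class_similarity(a, b):
    return 1.0 if a == b else PAIR_SIMILARITY[frozenset([a, b])]


def covariance(h, sill=1.0, range_m=500.0):
    """Assumed exponential covariance model; parameters are not fitted."""
    return sill * math.exp(-h / range_m)


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def semantic_krige(samples, target_xy, target_class):
    """Estimate a value at target_xy from samples of (x, y, class, value)."""
    n = len(samples)

    def cov(p, q_xy, q_class):
        # Spatial covariance, scaled by class similarity (the assumption here).
        return class_similarity(p[2], q_class) * covariance(math.dist(p[:2], q_xy))

    # Ordinary kriging system: covariances plus an unbiasedness constraint row.
    a = [[cov(p, q[:2], q[2]) for q in samples] + [1.0] for p in samples]
    a.append([1.0] * n + [0.0])
    b = [cov(p, target_xy, target_class) for p in samples] + [1.0]
    weights = solve(a, b)[:n]
    estimate = sum(w * p[3] for w, p in zip(weights, samples))
    return estimate, weights


SAMPLES = [
    (0.0, 0.0, "forest", 12.0),
    (400.0, 0.0, "forest", 14.0),
    (0.0, 300.0, "grassland", 9.0),
    (500.0, 500.0, "water", 3.0),
]

if __name__ == "__main__":
    estimate, weights = semantic_krige(SAMPLES, (200.0, 150.0), "forest")
    print(round(estimate, 2), [round(w, 3) for w in weights])
```

The weights always sum to one, as in ordinary kriging, and the semantic scaling lowers the covariance of samples whose class differs from the target's class, so they tend to contribute less than they would on distance alone. In practice the covariance model would be fitted to the data and the similarity scores derived from an ontology rather than typed in by hand.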

Such technologies are likely to become more common as the scale of data rapidly grows.

New Guide for Improving How Geospatial Data is Discovered With Search Engines

Making GIS data easily discoverable is essential for connecting potential users with available geospatial datasets. Most users turn to common search engines like Google and Bing to locate GIS data they can download. In 2018 Google launched Dataset Search (which recently came out of beta: https://blog.google/products/search/discovering-millions-datasets-web/) as one of its strategies for making geographic data more easily discoverable using its search engine. Other organizations use federated geoportals to promote the discoverability of GIS data.

The United Kingdom’s Geospatial Commission has published a guidebook for GIS data publishers to consult in order to make metadata and the associated geospatial datasets easier to classify and find using common search engines.

The guidebook (in PDF format) is based on research performed by the Geo6 (British Geological Survey, The Coal Authority, HM Land Registry, Ordnance Survey, UK Hydrographic Office, and the Valuation Office Agency).

The resulting guidelines are based on search engine optimization (SEO) best practices. Contained within the guidebook are 12 focus areas to consider when crafting metadata, along with tools for understanding how users search for and find data.

By implementing these recommendations, data publishers stand to benefit from increased visibility of their data among a wider audience, while people wanting to use geospatial data are more likely to find what they are looking for.

For example, by using this guide the Scottish Government has already identified that its webpages are not being optimally indexed by search engines because they are not tagged using the industry-standard markup language Schema.org.

Shona Nicol, Head of Data Standards, Scottish Government
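
As an illustration of what Schema.org tagging can look like for a geospatial dataset, here is a minimal sketch in Python that builds a `Dataset` description as JSON-LD and wraps it in a script tag for a dataset's landing page. It is not taken from the guidebook. The property names (`Dataset`, `spatialCoverage`, `GeoShape`, `DataDownload`) come from the Schema.org vocabulary, while the dataset name, the example.org URLs, the bounding box, and the function names are hypothetical.

```python
import json


def dataset_jsonld(name, description, url, keywords, bbox, download_url,
                   encoding_format):
    """Build a Schema.org Dataset record as a plain dictionary.

    bbox is (south, west, north, east) in decimal degrees.
    """
    south, west, north, east = bbox
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "keywords": keywords,
        "spatialCoverage": {
            "@type": "Place",
            "geo": {
                "@type": "GeoShape",
                # Lower corner first, then upper corner, space separated.
                "box": f"{south} {west} {north} {east}",
            },
        },
        "distribution": [
            {
                "@type": "DataDownload",
                "encodingFormat": encoding_format,
                "contentUrl": download_url,
            }
        ],
    }


def script_tag(record):
    """Wrap the record in the script element that carries JSON-LD in a page."""
    body = json.dumps(record, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


if __name__ == "__main__":
    record = dataset_jsonld(
        name="Example flood risk zones",  # hypothetical dataset
        description="Illustrative polygon dataset of flood risk zones.",
        url="https://example.org/datasets/flood-risk-zones",
        keywords=["flood risk", "hydrology", "GIS"],
        bbox=(54.6, -8.7, 60.9, -0.7),
        download_url="https://example.org/downloads/flood-risk-zones.gpkg",
        encoding_format="application/geopackage+sqlite3",
    )
    print(script_tag(record))
```

Placing the resulting script element in the HTML of a dataset's landing page gives crawlers a structured description of what the dataset is and where it applies, rather than leaving them to infer this from free text.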

Visit: Search engine optimisation for publishers: Best practice guide

References

[1] For a more developed approach to semantic GIS, see: Cai, G. (2007). Contextualization of geospatial database semantics for human-GIS interaction. Geoinformatica, 11, 217–237.

[2] For more on utilizing SIM-DL, see: Janowicz, K., Schade, S., Bröring, A., Keßler, C., Maué, P., & Stasch, C. (2010). Semantic Enablement for Spatial Data Infrastructures. Transactions in GIS, 14(2), 111–129.

[3] For OpenStreetMap, see: https://www.openstreetmap.org.

[4] For a relatively recent approach, see: Harbelot, B., Arenas, H., & Cruz, C. (2013). Continuum: a spatiotemporal data model to represent and qualify filiation relationships (pp. 76–85). ACM Press.

[5] For more on semantic kriging, see: Bhattacharjee, S., Mitra, P., & Ghosh, S. K. (2014). Spatial Interpolation to Predict Missing Attributes in GIS Using Semantic Kriging. IEEE Transactions on Geoscience and Remote Sensing, 52(8), 4771–4780.

About the author
Mark Altaweel
Mark Altaweel is a Reader in Near Eastern Archaeology at the Institute of Archaeology, University College London, having held previous appointments and joint appointments at the University of Chicago, University of Alaska, and Argonne National Laboratory. Mark has an undergraduate degree in Anthropology and Masters and PhD degrees from the University of Chicago’s Department of Near Eastern Languages and Civilizations.