GIS and Natural Language Processing

Mark Altaweel


Natural language processing (NLP) is a growing area of unstructured data analysis and computational methodology that draws on texts from a variety of sources. It has been one way scholars have approached big data, in particular large datasets that cannot easily be parsed or processed using standard data retrieval methods.[1] Within GIS, NLP can be used to understand spatially how events, places, or people relate to a given phenomenon. Typically, NLP has been used to derive meaning from large text corpora in an automated fashion, often using statistical or artificial intelligence techniques, with the data obtained through web scrapes or document searches.

The wider research literature now includes methods and techniques for understanding a variety of topics within a spatial analytical framework and for gathering spatial data. For instance, natural conversation may reveal patterns in the places people talk about or are interested in, whether in everyday speech or in recorded data such as tweets, blogs, or websites. These data can now be processed to provide new knowledge about where event patterns are occurring, how they connect to other events such as natural disasters, and what may happen after given events. NLP can be used to recognize parts of speech, sentence structure patterns, word frequencies, or even local dialects and slang terms. Text may also refer to geography vaguely (e.g., "the city" rather than a named city), in which case machine learning techniques can be used to estimate which city or area the vague reference most likely means.
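As a rough illustration of the word-frequency idea applied to places, the sketch below counts how often known place names appear in a handful of short texts. It is only a minimal example under stated assumptions: the tiny gazetteer, its approximate coordinates, and the sample sentences are all made up for illustration, and real systems typically rely on much larger gazetteers and trained named-entity recognizers rather than simple pattern matching.

```python
import re
from collections import Counter

# Hypothetical mini-gazetteer (assumption for illustration only):
# place name -> approximate (longitude, latitude).
GAZETTEER = {
    "New York": (-74.006, 40.713),
    "Atlantic City": (-74.423, 39.364),
    "Hoboken": (-74.032, 40.744),
}

# Invented sample sentences, not real news or social media data.
SAMPLE_TEXTS = [
    "Flooding reported in Hoboken and across New York tonight.",
    "Atlantic City boardwalk damaged; New York subways closed.",
]


def count_place_mentions(texts, gazetteer=GAZETTEER):
    """Count case-insensitive, whole-word mentions of each gazetteer name."""
    counts = Counter()
    for name in gazetteer:
        # \b keeps "York" inside another word from matching by accident.
        pattern = re.compile(r"\b" + re.escape(name) + r"\b", re.IGNORECASE)
        for text in texts:
            counts[name] += len(pattern.findall(text))
    return counts


if __name__ == "__main__":
    for place, n in count_place_mentions(SAMPLE_TEXTS).most_common():
        print(place, n, GAZETTEER[place])
```

Exact string matching of this kind cannot resolve vague references such as "the city"; that step would need additional context or a statistical model, which this sketch does not attempt.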

Extracted events relating to Hurricane Sandy from 50 CNN news reports for the period Oct 24–Nov 04, 2012. From: Wang & Stewart (2015).

One common use of NLP has been tracking natural disasters.[2] As NLP continues to be adopted in a variety of disciplines, relatively recent advancements in GIS now allow text in unstructured formats to be georeferenced and analyzed spatially. Web-based and open source tools such as QGIS are increasingly used along with NLP methods.
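To make the link between extracted text and a GIS concrete, the following sketch turns a few place mentions into a GeoJSON FeatureCollection, a format QGIS can load as a vector layer. The function name, the input tuples, and the coordinates are assumptions chosen for illustration; they are not taken from any particular study or tool.

```python
import json

# Hypothetical extraction results (assumption for illustration only):
# (place name, longitude, latitude, number of mentions).
EXAMPLE_MENTIONS = [
    ("Hoboken", -74.032, 40.744, 1),
    ("New York", -74.006, 40.713, 2),
]


def mentions_to_geojson(mentions):
    """Build a GeoJSON FeatureCollection of points from place mentions."""
    features = [
        {
            "type": "Feature",
            # GeoJSON stores coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"name": name, "mentions": count},
        }
        for name, lon, lat, count in mentions
    ]
    return {"type": "FeatureCollection", "features": features}


if __name__ == "__main__":
    # Print the layer; redirect to a .geojson file to open it in a GIS.
    print(json.dumps(mentions_to_geojson(EXAMPLE_MENTIONS), indent=2))
```

Saving the printed output to a file with a `.geojson` extension and adding it to a QGIS project should display the mentions as points, which can then be styled by the `mentions` attribute.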



[1] For more information on NLP, see: Lehnert, W. G. (Ed.). (1982). Strategies for natural language processing (1st ed.). Hillsdale, New Jersey: Lawrence Erlbaum.

[2] For an example of a recent use of NLP related to natural disasters, see: Wang, W., & Stewart, K. (2015). Spatiotemporal and semantic information extraction from Web news reports about natural hazards. Computers, Environment and Urban Systems, 50, 30–40.

About the author
Mark Altaweel
Mark Altaweel is a Reader in Near Eastern Archaeology at the Institute of Archaeology, University College London, having held previous appointments and joint appointments at the University of Chicago, University of Alaska, and Argonne National Laboratory. Mark has an undergraduate degree in Anthropology and Master’s and PhD degrees from the University of Chicago’s Department of Near Eastern Languages and Civilizations.