Living in an age where vast libraries of spatial data, powerful tools, and ways of communicating with multimedia web maps are at our fingertips, should data quality still concern us? I believe that despite these amazing advancements, data quality still matters. I would go a step farther and contend that because of these capabilities, data quality matters more now than ever. Data quality has an enormous impact on mapping and spatial analysis, the perception of the audience consuming the results, and the decisions that result.
I believe there are 10 key reasons why geospatial data quality matters.
1. Maps are easily trusted
Maps tend to be believed, and history offers three reasons why. First, from their early days on clay tablets, silver plates, silk, and wood blocks, maps were very detailed, labor-intensive affairs. Second, they were often commissioned by rulers and others in power. Third, they were compiled with knowledge that their creators gained from their own extensive fieldwork or gathered by examining the field experiences of others. These three characteristics gave maps an air of authenticity.
Even in our modern digital era, where maps are embedded, incorporated, and refashioned into fantastic forms, many maps still look and feel like authoritative sources, even though they may not always be.
In their essay in The Map Room, the authors point out that bad maps abound in our modern world, such as “the favorite food in each state” or “what people hate about each state”. These maps generate web traffic, which is the chief reason many are created, but the authors rightly ask questions such as “Really? Who says these are the favorite foods?”
On a similar theme, I submit this arsenal of “bad maps” of all kinds in this presentation, which I regularly give in the hopes that people will think carefully about maps as powerful tools that can lead–or mislead. Perhaps even more telling is this set of everyday examples of why we need to be critical of the data, including an odd description of what constitutes George Mason University, the wrong location of the National Academy of Sciences, an IoT feed of a temperature reading that is over 3,000 degrees, and even an error-filled “Beatles” playlist.
2. Maps can be easy to make
Many types of maps are easy to make. I spent much of my own career in federal agencies compiling geographic data and making maps. While national mapping and science agencies are still very active in creating spatial data and maps, the percentage of the total number of maps and data layers authored by these organizations has been decreasing in the flood of hundreds of thousands of web maps made daily.
Nowadays, anyone with access to the web with tools such as ArcGIS Online can create a map, share spatial data layers, and create multimedia story maps and dashboards. Anyone can access hundreds of crowdsourcing tools such as Survey123 and iNaturalist to create mappable data.
Having served on the round-the-clock team at the US Census Bureau working during the 1980s to create TIGER, and creating DEMs and NHD data at USGS during the 1990s, I don’t pine for those challenging early days of GIS. But with increasingly easy ways of creating and serving maps and spatial data, shortcuts can be taken, metadata omitted, and data may not be verified.
Make sure you understand the amazing capabilities of modern GIS, but understand the limitations of your data. Ask critical questions of your data, your methods, and your tools.
3. Maps can make “fun” posts more believable
Maps are often attached to “fun” posts. I’m not trying to be Mr Grumpy Pants here, because some of these posts are funny and interesting. But the danger with some of these posts is that they can reinforce the habit of believing stories simply because they include maps and graphics.
I submit the voyage of the SS Warrimoo in 1899 as an example: It is an interesting story, but are the multiple hemispheres, days, and centuries that the ship’s crew supposedly experienced simultaneously true as described?
4. Maps model reality
Maps are not reality, but only representations of reality. Useful ones, to be sure, but still representations. Consider my deciduous-coniferous tree line example. Is it really a line? Or is it really a zone? And how should we depict uncertainty and scale considerations on mapped data? Furthermore, be critical of imagery. It may look like a “true” representation of the surface of the Earth in a specific part of the electromagnetic spectrum, but imagery can be edited for various reasons and needs to be viewed as critically as any other spatial data, as I describe here and also in this essay.
5. We are all potential map makers
Today, we are no longer simply map consumers, but we are all potential map makers. With many ways of creating mapped data, be critical of data–even when it is your own. We all quickly become rather fond of and attached to our own data, but as I describe in this example of a GPS track that I collected and mapped, even your own data needs to be examined rigorously.
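One way to examine a GPS track rigorously is to look for fixes that imply an impossible speed. The sketch below is only an illustration, not the check used in the example mentioned above: the function names, the sample coordinates, and the 3 m/s walking threshold are all assumptions to adjust for your own data.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; an approximation

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def flag_suspect_fixes(track, max_speed_mps=3.0):
    """Return indices of fixes reached faster than max_speed_mps from the previous fix.

    track is a list of (timestamp, lat, lon) tuples in time order.
    The 3 m/s default is an assumed upper bound for walking.
    """
    suspects = []
    for i in range(1, len(track)):
        t0, lat0, lon0 = track[i - 1]
        t1, lat1, lon1 = track[i]
        dt = (t1 - t0).total_seconds()
        # Non-increasing timestamps are suspect in their own right.
        if dt <= 0 or haversine_m(lat0, lon0, lat1, lon1) / dt > max_speed_mps:
            suspects.append(i)
    return suspects

# Hypothetical track: a slow walk with one fix that jumps about 1 km north.
start = datetime(2024, 1, 1, 12, 0, 0)
track = [
    (start, 40.0000, -105.0),
    (start + timedelta(seconds=10), 40.0001, -105.0),
    (start + timedelta(seconds=20), 40.0101, -105.0),  # the jump
    (start + timedelta(seconds=30), 40.0003, -105.0),
]
suspects = flag_suspect_fixes(track)
print(suspects)
```

A single bad fix flags two segments, the jump out and the jump back, so treat the flagged indices as places to look rather than as points to delete automatically.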
6. Thoroughly check your data sources
Go beyond simply reading the metadata to thoroughly checking your data sources. Admittedly this often requires extra homework, some of it hearkening back to “old school” methods such as actually calling the data creator on the phone. Read my example of mapping Lyme disease in Rhode Island: If I had not called the data creators, I would have come to an erroneous conclusion about Lyme disease trends there.
7. Map scale matters
Scale still matters. When you zoom in on your data using your GIS tools, the accuracy of the data does not increase as you do so. This sounds obvious, but when I observe people using GIS, I often get the impression that they think that because they can view the data at 1:1000 scale, the data are spatially accurate to that scale. Data are collected at specific scales, even though they are increasingly rendered and filtered to show more detail at larger scales. See my example of “walking on water” on a pier on Lake Michigan to drive home this point.
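A quick calculation shows why zooming in does not add accuracy. The sketch below converts a distance on the map into the ground distance it covers at several scales. The 0.5 mm value is an assumed line width chosen for illustration, not an accuracy standard, and the function name is hypothetical.

```python
def ground_distance_m(scale_denominator, map_distance_mm=0.5):
    """Ground distance in meters covered by map_distance_mm at scale 1:scale_denominator."""
    return map_distance_mm * scale_denominator / 1000.0  # mm on the ground -> m

for denom in (1_000, 24_000, 100_000):
    print(f"1:{denom:,} -> {ground_distance_m(denom):g} m")
# 1:1,000 -> 0.5 m
# 1:24,000 -> 12 m
# 1:100,000 -> 50 m
```

Under these assumptions, a feature digitized from a 1:24,000 source carries roughly 12 m of positional uncertainty from the line width alone, and viewing it at 1:1000 does nothing to shrink that.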
8. Truth in labeling
The phrases truth in labeling and fitness for use are still appropriate in this discussion. Truth in labeling refers to the data producer’s responsibility to provide enough information so that the end user will be able to determine whether the data are fit for their use. Paying attention to truth in labeling is increasingly important in an age where each of us is no longer just a map consumer: we are all potentially map producers.
Each time that you run an analysis tool in ArcGIS Online, for example, you create a new layer, which is a new set of mapped data. And as a data user, instead of asking “Is this map or data set good?”, ask whether it is fit for your use. Are the attributes, collection scale, completeness, curation date, topological relationships, and any copyright restrictions documented well enough for you to decide whether the data are fit for use in your project?
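The questions above can be turned into a simple checklist that you run before using a layer. This is a minimal sketch: the field names are hypothetical and will not match any particular metadata standard, so map them to whatever your metadata actually records.

```python
# Hypothetical field names covering the fitness-for-use questions above.
REQUIRED_FIELDS = (
    "attributes",
    "collection_scale",
    "completeness",
    "curation_date",
    "topology",
    "copyright",
)

def missing_metadata(metadata):
    """Return the required fields that are absent or empty in a metadata record."""
    return [f for f in REQUIRED_FIELDS if metadata.get(f) in (None, "", [], {})]

# Hypothetical layer record with several gaps.
layer = {
    "attributes": ["name", "population"],
    "collection_scale": "1:24,000",
    "curation_date": "",
    "copyright": "public domain",
}
gaps = missing_metadata(layer)
print(gaps)
```

An empty result only means the fields are filled in. Whether their contents make the data fit for your project is still your call.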
9. Verify your data sources
When describing your results, get into the habit of using the phrase “according to this data”. This will remind you to keep verifying your sources, and will remind your audience that your results and conclusions are highly dependent on the data you are using.
How can data quality be measured? For one set of measures on data quality, see my video describing the CRAAP test–Currency, Relevance, Authority, Accuracy, and Purpose. For others, see the 4 C’s of data quality analysis, FGDC standards, and sources of error, in my presentation here.
10. Pay attention to data quality
Because you are responsible for ethical, wise decision making, data quality affects your decisions and consequently the decisions of those around you. Geospatial technologies are often and rightly described as “powerful.” As I describe in this essay on ethics, “with power comes the ability to cause harm – intentionally or unintentionally – as well as to make a positive impact”. Paying attention to data quality fits squarely into building a smarter, healthier, more sustainable, and more resilient world.
For more details and examples, see my presentation, Why Data Quality Still Matters, Now More Than Ever. There, I provide information about data quality standards, past and present. See also my article in Directions Magazine, here, and check in regularly with the Spatial Reserves data blog that Jill Clark and I have been writing for nearly a decade, stemming from our book, The GIS Guide to Public Domain Data. My goal with these writings and videos is to help the community remember that data quality does indeed matter.