Statistical Surfaces in GIS

By: Caitlin Dempsey

A statistical surface is any geographic entity that can be thought of as containing a Z value for each X,Y location. Digital elevation models are the best-known example; others include gradient, temperature, population, and economic potential. A statistical surface can be any numerically measurable attribute that varies continuously over space, such as temperature or population density (interval/ratio data). These surfaces are "statistical" because the Z values are a statistical measure (e.g. mean or sigma) of the features under consideration.

There are two basic types of statistical surfaces:

  1. Continuous; Z values occur everywhere within the area of study
  2. Discrete; Z values occur only at specific locations, but can be summarized (such as number per neighborhood). These discrete surfaces are calculated from "punctiform" (point) data, which are composed of individuals whose distribution can be modeled as a field (such as population density).
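
As a rough illustration of how point data can be summarized into a discrete surface, the sketch below counts points per square grid cell and divides by the cell area. The function name `density_grid`, the square "neighborhoods", and the sample coordinates are all assumptions made for this example, not part of any particular GIS package.

```python
from collections import Counter

def density_grid(points, cell_size):
    """Summarize (x, y) points as a density value per square cell.

    Cells are identified by integer (column, row) indices; square cells
    stand in for 'neighborhoods' here (an assumption for illustration).
    """
    counts = Counter(
        (int(x // cell_size), int(y // cell_size)) for x, y in points
    )
    cell_area = cell_size * cell_size
    return {cell: n / cell_area for cell, n in counts.items()}

# Hypothetical point locations (e.g. individual households).
points = [(0.5, 0.5), (1.2, 0.3), (1.8, 1.9), (3.1, 0.2)]
density = density_grid(points, cell_size=2.0)
print(density)  # cell (0, 0) holds 3 points, cell (1, 0) holds 1
```

Each cell value is a Z value derived from a count, which is what makes the resulting surface "statistical".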

Areas of unknown value (i.e. the areas between the measured points) can be estimated in two ways. Interpolation "fills in the gaps" by estimating the values of locations for which there is no data using the known data values of nearby locations. Extrapolation "guesses what's beyond the edges" by estimating the values of locations outside the range of available data using the values of known data.

Types of Interpolation

Linear interpolation is the simplest method, but it is often not very accurate because it assumes that values change at a constant rate between known points. Linear interpolation works best when data points are uniformly spaced.
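
A minimal sketch of linear interpolation along a single profile line is shown here. The function name `linear_interpolate` and the elevation values are hypothetical; a real surface would apply the same idea between neighboring samples in two dimensions.

```python
def linear_interpolate(x0, z0, x1, z1, x):
    """Estimate z at position x on the straight line through
    the known points (x0, z0) and (x1, z1)."""
    if x1 == x0:
        raise ValueError("the two known points must be at different positions")
    t = (x - x0) / (x1 - x0)  # fraction of the way from x0 to x1
    return z0 + t * (z1 - z0)

# Hypothetical elevations: 100 m at distance 0 and 140 m at distance 20.
print(linear_interpolate(0.0, 100.0, 20.0, 140.0, 5.0))   # 110.0 (interpolation)
print(linear_interpolate(0.0, 100.0, 20.0, 140.0, 25.0))  # 150.0 (extrapolation)
```

The second call uses a position outside the two known points, so the same formula is extrapolating rather than interpolating.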

Non-linear methods are designed to eliminate the assumption of linearity. There are three types of non-linear interpolation methods: weighting, trend surfaces and Kriging.





Global interpolation uses the average of all values in the dataset to estimate the value at an unknown point. Local interpolation uses the average of the values found within a specified radius of the unknown point.
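
The difference between the two approaches can be sketched with plain averages. The helper names `global_mean` and `local_mean` and the sample values are assumptions for illustration only.

```python
import math

def global_mean(samples):
    """Average of every known z value, regardless of location."""
    return sum(z for _, _, z in samples) / len(samples)

def local_mean(samples, x, y, radius):
    """Average of the z values within `radius` of (x, y).

    Returns None when no sample falls inside the search radius.
    """
    nearby = [z for sx, sy, z in samples
              if math.hypot(sx - x, sy - y) <= radius]
    if not nearby:
        return None
    return sum(nearby) / len(nearby)

# Hypothetical (x, y, z) samples: three clustered points and one far away.
samples = [(0, 0, 10.0), (1, 0, 12.0), (0, 1, 14.0), (10, 10, 40.0)]
print(global_mean(samples))                 # 19.0
print(local_mean(samples, 0.5, 0.5, 2.0))   # 12.0
```

The distant sample pulls the global estimate upward, while the local estimate ignores it because it lies outside the search radius.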

Distance Weighted (Inverse Distance Weighted, IDW) interpolation gives each neighboring data value a weight (influence) that is inversely proportional to the square of its distance from the location of the estimated value. Distance weighted interpolation assumes that the closer values are to each other, the more likely they are to be affected by one another.

An example of distance weighting to interpolate an unknown value (red dot). The values closest to the unknown value are weighted more heavily than values that are farther away.
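
The weighting described above can be sketched as follows. The function name `idw` and the sample values are hypothetical; the default exponent of 2 corresponds to the inverse-square weighting, and other exponents are sometimes used in practice.

```python
import math

def idw(samples, x, y, power=2.0):
    """Inverse distance weighted estimate at (x, y).

    samples: iterable of (x, y, z) known points.
    Each weight is 1 / distance**power.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, z in samples:
        d = math.hypot(sx - x, sy - y)
        if d == 0.0:
            return z  # the location coincides with a known sample
        w = 1.0 / d ** power
        weighted_sum += w * z
        weight_total += w
    return weighted_sum / weight_total

# Hypothetical samples: z = 10 at distance 1 and z = 40 at distance 2
# from the unknown location (1, 0).
samples = [(0.0, 0.0, 10.0), (3.0, 0.0, 40.0)]
print(idw(samples, 1.0, 0.0))  # 16.0
```

The nearer sample gets a weight of 1 and the farther one a weight of 0.25, so the estimate sits much closer to 10 than to 40.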

Trend Surface interpolation uses a global method and multiple regression (predicting z elevation from x and y location). Conceptually, a trend surface is a plane of best fit passing through a cloud of sample data points; it does not necessarily pass through each original sample data point. This interpolation is used when the user wants to understand general trends of a surface.

First degree trend surface.

For trend surfaces, the more complex the surface to be modeled, the higher the degree (polynomial order) of trend surface needed.
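
A first degree trend surface is the plane z = a + b*x + c*y that best fits the samples in a least-squares sense. The sketch below fits it by solving the normal equations with a small Gaussian elimination routine. The helper names `solve` and `fit_plane` and the sample data are assumptions for illustration; a production workflow would more likely rely on a numerical library.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination
    with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system (are the samples collinear?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x

def fit_plane(samples):
    """Least-squares fit of z = a + b*x + c*y to (x, y, z) samples.

    Returns the coefficients [a, b, c].
    """
    rows = [[1.0, x, y] for x, y, _ in samples]
    zs = [z for _, _, z in samples]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atz = [sum(r[i] * z for r, z in zip(rows, zs)) for i in range(3)]
    return solve(ata, atz)

# Hypothetical samples lying exactly on z = 5 + 2x + 3y.
samples = [(0, 0, 5.0), (1, 0, 7.0), (0, 1, 8.0), (1, 1, 10.0), (2, 1, 12.0)]
a, b, c = fit_plane(samples)
print(a, b, c)  # approximately 5, 2, 3
```

With noisy real-world samples the fitted plane would generally miss individual points, which is the behavior described for trend surfaces. Higher degree surfaces add terms such as x*x, x*y, and y*y to each row.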


Splines use local interpolation. Spline interpolation fits a mathematical function to a neighborhood of sample data points. The resulting surface is curved and passes through all original sample data points.

Spline interpolation surface
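
To keep the idea small, the sketch below fits a natural cubic spline to a one-dimensional profile rather than a full surface. That simplification, the function name `natural_cubic_spline`, and the sample values are all assumptions made for this example; GIS packages use two-dimensional spline variants. It does show the key property: the curve passes through every sample point.

```python
import bisect

def natural_cubic_spline(xs, ys):
    """Return a function evaluating the natural cubic spline through
    the points (xs[i], ys[i]). xs must be strictly increasing."""
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = (3.0 * (ys[i + 1] - ys[i]) / h[i]
                    - 3.0 * (ys[i] - ys[i - 1]) / h[i - 1])
    # Forward sweep of the tridiagonal system for the curvature terms.
    l = [1.0] * (n + 1)
    mu = [0.0] * (n + 1)
    z = [0.0] * (n + 1)
    for i in range(1, n):
        l[i] = 2.0 * (xs[i + 1] - xs[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / l[i]
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / l[i]
    # Back substitution gives the coefficients of each cubic piece.
    b = [0.0] * n
    c = [0.0] * (n + 1)
    d = [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (ys[j + 1] - ys[j]) / h[j] - h[j] * (c[j + 1] + 2.0 * c[j]) / 3.0
        d[j] = (c[j + 1] - c[j]) / (3.0 * h[j])

    def evaluate(x):
        # Pick the piece whose interval contains x (clamped at the ends).
        j = min(max(bisect.bisect_right(xs, x) - 1, 0), n - 1)
        dx = x - xs[j]
        return ys[j] + b[j] * dx + c[j] * dx ** 2 + d[j] * dx ** 3

    return evaluate

# Hypothetical elevations sampled along a profile line.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [10.0, 12.0, 11.0, 15.0]
profile = natural_cubic_spline(xs, ys)
print([profile(x) for x in xs])   # reproduces the sampled values
print(profile(1.5))               # a smooth estimate between samples
```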

Kriging is an interpolation method commonly used for geologic applications. Kriging addresses both global variation (i.e. the drift or trend present in the entire sample data set) and local variation (over what distance sample data points "influence" one another). Kriging looks at three components:

  1. Drift – general trend of the surface
  2. Fluctuations – small, spatially correlated changes in the surface
  3. Noise – random changes not related to the underlying surface
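
Kriging implementations typically quantify how far sample points "influence" one another with a semivariogram. The sketch below computes only an empirical semivariogram, not a full Kriging estimate: for each distance bin it reports half the mean squared difference between pairs of samples that far apart. The function name `empirical_semivariogram`, the binning scheme, and the sample values are assumptions for illustration.

```python
import math
from collections import defaultdict

def empirical_semivariogram(samples, lag_width):
    """Return {bin_index: semivariance} for (x, y, z) samples.

    Pairs are grouped by floor(distance / lag_width); the semivariance
    of a bin is sum((zi - zj)**2) / (2 * number_of_pairs).
    """
    squared_diffs = defaultdict(float)
    pair_counts = defaultdict(int)
    for i in range(len(samples)):
        xi, yi, zi = samples[i]
        for j in range(i + 1, len(samples)):
            xj, yj, zj = samples[j]
            distance = math.hypot(xi - xj, yi - yj)
            k = int(distance // lag_width)
            squared_diffs[k] += (zi - zj) ** 2
            pair_counts[k] += 1
    return {k: squared_diffs[k] / (2 * pair_counts[k])
            for k in sorted(pair_counts)}

# Hypothetical samples spaced one unit apart along a line.
samples = [(0, 0, 1.0), (1, 0, 2.0), (2, 0, 4.0), (3, 0, 7.0)]
gamma = empirical_semivariogram(samples, lag_width=1.0)
print(gamma)  # semivariance grows with distance: bins 1, 2, 3
```

Semivariance that rises with distance indicates spatially correlated values (the "fluctuations" above), while a nonzero value at very short distances is commonly attributed to noise.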

Problems in Interpolation

Since interpolation is used for predictive modeling, it involves “guessing” at unknown values by using available information. If there are too few control points, there may not be enough measured values to form a statistically significant sample. The distribution of control points is also important. More complex surfaces need more sample points than flat or simple surfaces. Areas at the edges of the map contain the highest error. Therefore, control points should be sampled beyond the area being interpolated, and the interpolated surface should then be cropped back to remove edge error.

Caitlin Dempsey
Caitlin Dempsey is the editor of Geography Realm and holds a master's degree in Geography from UCLA as well as a Master of Library and Information Science (MLIS) from SJSU.