Spatial DBs

Objectives/TOC

What is a spatial database?

"A spatial database is a database that is optimized to store and query data related to objects in space, including points, lines and polygons."

In other words, it includes objects that have a SPATIAL location (and extent). A chief category of spatial data is geospatial data - derived from the geography of our earth.

Characteristics of geographic data:

Geographic data is NOT 'business as usual'!


Entity view vs field view

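In spatial data analysis, we distinguish between two conceptions of space: the 'entity' view and the 'field' view, in which a quantity such as elevation or temperature varies continuously over space.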

For our purposes, we will adopt the 'entity' view, where space is populated by discrete objects (roads, buildings, rivers..).

Components

So a spatial DB is a collection of the following, specifically built to handle spatial data: spatial data types, spatial operators and spatial indices.

Soon, we will explore what types, operators and indices mean.

Examples of spatial data

CAD data:

Agricultural data:

3D data:

What can be plotted on to a map?


Who creates/uses spatial data?

Various government agencies (federal, state and local) routinely coordinate spatial data collection and use, operating, in effect, a national spatial data infrastructure (NSDI). At the federal level, participating agencies include:

As you can see, spatial data is a SERIOUS resource, vital to US national interests.

Where does spatial data come from?

Spatial data is created in a variety of ways:

What to store?

All spatial data can be described via the following entities/types: points, lines and polygons.
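As a rough illustration of these three entity types, here is a minimal Python sketch. The class names (Point, LineString, Polygon) and the sample 'road' and 'parcel' objects are assumptions made for this example, not the types of any particular spatial DB; a real SDBMS would also store a coordinate reference system and support polygons with holes.

```python
from dataclasses import dataclass
from math import hypot
from typing import Tuple

Coord = Tuple[float, float]  # (x, y) in some planar coordinate system


@dataclass(frozen=True)
class Point:
    x: float
    y: float


@dataclass(frozen=True)
class LineString:
    coords: Tuple[Coord, ...]  # ordered vertices of the line

    def length(self) -> float:
        # Sum of the straight-line lengths of consecutive segments.
        return sum(hypot(x2 - x1, y2 - y1)
                   for (x1, y1), (x2, y2) in zip(self.coords, self.coords[1:]))


@dataclass(frozen=True)
class Polygon:
    ring: Tuple[Coord, ...]  # exterior ring; the closing vertex is NOT repeated

    def area(self) -> float:
        # Shoelace formula over the exterior ring (holes are not modeled here).
        n = len(self.ring)
        twice_area = sum(
            self.ring[i][0] * self.ring[(i + 1) % n][1]
            - self.ring[(i + 1) % n][0] * self.ring[i][1]
            for i in range(n)
        )
        return abs(twice_area) / 2.0


# Hypothetical sample objects: a well, a road and a rectangular land parcel.
well = Point(2.0, 1.0)
road = LineString(((0, 0), (3, 4), (3, 10)))
parcel = Polygon(((0, 0), (4, 0), (4, 3), (0, 3)))
```

Non-spatial attributes (name, owner, population..) would simply be extra fields stored alongside the geometry.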

Points, lines, polys => models and non-spatial attrs

Once we have spatial data (points, lines, polygons), we can:

Look at this map, overlaid with scary data..

SDBMS architecture

GIS vs SDBMS

GIS is a specific application architecture built on top of a [more general purpose] SDBMS.

GIS tends to be used for:

Spatial relationships

In 1D (and higher), spatial relationships can be expressed using 'intersects', 'crosses', 'within', 'touches' (these are T/F predicates).
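To make the T/F nature of these predicates concrete, here is a small 1D sketch in Python, where each object is a closed interval (start, end) with start < end. The function names mirror the predicate names above but are otherwise assumptions for this example; 'crosses' is left out because it needs objects of differing dimension.

```python
from typing import Tuple

Interval = Tuple[float, float]  # (start, end), assumed start < end


def intersects(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def within(a: Interval, b: Interval) -> bool:
    """True if a lies entirely inside b."""
    return b[0] <= a[0] and a[1] <= b[1]


def touches(a: Interval, b: Interval) -> bool:
    """True if the intervals meet only at an endpoint (interiors are disjoint)."""
    return a[1] == b[0] or b[1] == a[0]


# Sample intervals along a 1D axis (e.g. mile markers on a road).
print(intersects((0, 5), (3, 8)))  # overlapping stretches
print(within((4, 5), (3, 8)))      # first stretch is inside the second
print(touches((0, 3), (3, 8)))     # stretches meet at mile 3 only
```

The 2D versions follow the same pattern, but operate on point/line/polygon geometries rather than on intervals.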

Here is a sampling of spatial relationships in 2D:

Another diagram showing the [binary] operations:

Minimum Bounding Rectangles (MBRs) are typically used as cheap approximations of the geometries when computing the results of the operations shown above:
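A minimal sketch of the MBR idea, in Python: compute the axis-aligned bounding rectangle of a geometry's vertices, then test two MBRs for overlap. The function names and the sample triangles are assumptions for this example. Note that overlapping MBRs only mean the geometries MIGHT intersect; the exact geometries still need to be checked.

```python
from typing import Iterable, Tuple

Coord = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def mbr(points: Iterable[Coord]) -> Rect:
    """Axis-aligned minimum bounding rectangle of a set of vertices."""
    pts = list(points)
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    return (min(xs), min(ys), max(xs), max(ys))


def mbrs_intersect(a: Rect, b: Rect) -> bool:
    """True if the two rectangles overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


# Hypothetical triangles: tri_a and tri_b do not actually intersect,
# but their MBRs do, so the MBR test alone yields a false positive.
tri_a = [(0, 0), (4, 0), (0, 4)]
tri_b = [(3, 3), (5, 3), (5, 5)]
tri_c = [(10, 10), (11, 10), (10, 11)]

print(mbr(tri_a), mbr(tri_b))
print(mbrs_intersect(mbr(tri_a), mbr(tri_b)))  # candidates: need an exact test
print(mbrs_intersect(mbr(tri_a), mbr(tri_c)))  # safely rejected
```

The MBR test is conservative: it can report false positives (as with tri_a and tri_b), but never false negatives.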

Spatial relations - categories

Spatial relationships can be:

Topological relationships could be further grouped like so:

How can we put these relations to use?

We can perform the following, on spatial data:

Spatial operators, functions

This doc [from 'FME Knowledge Center'; thanks to Minaxi Singla for the link] provides more info on the spatial operators.

Oracle Spatial

Oracle offers a 'Spatial' library for spatial queries - this includes UDTs and custom functions to process them.

Postgres PostGIS

Here is an example - table creation, and polygon insertion:

To do the above, here are the steps on a PC (similar steps on a Mac):

You can learn a lot about spatial queries from this page.

Creating spatial indexes

As with non-spatial data (in fact, even more so), the creation and use of spatial indexes VASTLY speeds up processing!

Can B Trees index spatial data?

In short, YES, if we pair them with a 'z curve' indexing scheme (using a space-filling curve):

The idea is to quantize every (x,y) location into a recursively-divided 'quadtree' cell, and use the cell's binary (x,y) location to create a (binary) 'z' key, which is ordered along the unit (0..1) interval - in other words, 2D (x,y) points get mapped (indexed) to ordered 1D 'z' locations.

But, this is of academic interest mostly, not commonly practiced in industry - Apple's FoundationDB is an exception.
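The quantize-and-interleave idea behind the 'z' key can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: coordinates are taken to lie in [0, 1), the function names are made up for this example, and putting the x bit below the y bit in each pair is just one possible convention.

```python
def quantize(v: float, bits: int) -> int:
    """Map v in [0, 1) to an integer cell index in 0 .. 2**bits - 1."""
    cells = 1 << bits
    return min(int(v * cells), cells - 1)


def z_key(x: float, y: float, bits: int = 16) -> int:
    """Interleave the bits of the quantized x and y into a single integer key."""
    ix, iy = quantize(x, bits), quantize(y, bits)
    key = 0
    for i in range(bits):
        key |= ((ix >> i) & 1) << (2 * i)      # x bit goes to the even position
        key |= ((iy >> i) & 1) << (2 * i + 1)  # y bit goes to the odd position
    return key


def z_unit(x: float, y: float, bits: int = 16) -> float:
    """The same key, scaled to the unit interval [0, 1)."""
    return z_key(x, y, bits) / float(1 << (2 * bits))


# With 1 bit per axis, the four quadrants are visited in a 'Z' shaped order.
points = [(0.75, 0.75), (0.25, 0.75), (0.75, 0.25), (0.25, 0.25)]
ordered = sorted(points, key=lambda p: z_key(p[0], p[1], bits=1))
print(ordered)
```

Since the keys are plain integers, they can be stored in an ordinary B-tree; points that are close in (x,y) tend (but are not guaranteed) to get nearby keys.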

R trees

R trees use MBRs to create a hierarchy of bounds.

Variations, FYI: R+ tree, R* tree, Buddy trees, Packed R trees..
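To illustrate the 'hierarchy of bounds' idea, here is a simplified, bulk-loaded sketch in Python. It is NOT a full R tree (there is no dynamic insertion or node splitting): it just sorts entries by xmin, packs them into nodes of a fixed fanout, and repeats upward. All names (Node, build, search) and the sample data are assumptions for this example.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def union(rects: List[Rect]) -> Rect:
    """Smallest rectangle that encloses all the given rectangles."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


def overlaps(a: Rect, b: Rect) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


class Node:
    def __init__(self, children, leaf: bool):
        self.children = children  # leaf: list of (rect, obj_id); else list of Node
        self.leaf = leaf
        self.mbr = union([c[0] for c in children] if leaf
                         else [c.mbr for c in children])


def build(entries: List[Tuple[Rect, str]], fanout: int = 4) -> Node:
    """Pack a non-empty list of entries, level by level, into a tree."""
    entries = sorted(entries, key=lambda e: e[0][0])
    level = [Node(entries[i:i + fanout], True)
             for i in range(0, len(entries), fanout)]
    while len(level) > 1:
        level = [Node(level[i:i + fanout], False)
                 for i in range(0, len(level), fanout)]
    return level[0]


def search(node: Node, query: Rect) -> List[str]:
    """Return ids of entries whose MBR overlaps the query rectangle."""
    if not overlaps(node.mbr, query):
        return []  # prune this whole subtree
    if node.leaf:
        return [oid for rect, oid in node.children if overlaps(rect, query)]
    hits: List[str] = []
    for child in node.children:
        hits.extend(search(child, query))
    return hits


# Ten unit squares along the diagonal, as hypothetical object MBRs.
entries = [((i, i, i + 1, i + 1), f"obj{i}") for i in range(10)]
tree = build(entries)
result = search(tree, (2.5, 2.5, 4.5, 4.5))
print(result)
```

The key point is the pruning in search(): if a node's MBR misses the query, none of its descendants need to be examined.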

k-d trees, K-D-B trees

k-d tree

Alternate: K-D-B tree:

Quadtrees (and octrees)

Here we recursively and adaptively subdivide space [subdivisions happen only where necessary].

Each node is either a leaf node, with indexed points or null, or an internal (non-leaf) node that has exactly 4 children. The hierarchy of such nodes forms the quadtree.
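The node structure just described can be sketched in Python as a point quadtree. The sketch rests on a few assumptions of its own: each leaf holds at most 'capacity' points before splitting, cells are half-open squares, and a 'max_depth' cap (not part of the description above) is added so that duplicate points cannot cause endless splitting.

```python
class QuadTree:
    """Point quadtree over the half-open cell [x0, x1) x [y0, y1)."""

    def __init__(self, x0, y0, x1, y1, capacity=1, depth=0, max_depth=16):
        self.bounds = (x0, y0, x1, y1)
        self.capacity = capacity
        self.depth = depth
        self.max_depth = max_depth
        self.points = []      # used only while this node is a leaf
        self.children = None  # None for a leaf, else exactly 4 QuadTree nodes

    def insert(self, p):
        x0, y0, x1, y1 = self.bounds
        if not (x0 <= p[0] < x1 and y0 <= p[1] < y1):
            return False  # point is outside this cell
        if self.children is None:
            if len(self.points) < self.capacity or self.depth >= self.max_depth:
                self.points.append(p)
                return True
            self._split()  # subdivide only where necessary
        return any(child.insert(p) for child in self.children)

    def _split(self):
        x0, y0, x1, y1 = self.bounds
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        cells = [(x0, y0, mx, my), (mx, y0, x1, my),
                 (x0, my, mx, y1), (mx, my, x1, y1)]
        self.children = [
            QuadTree(*cell, self.capacity, self.depth + 1, self.max_depth)
            for cell in cells
        ]
        old_points, self.points = self.points, []
        for old in old_points:
            self.insert(old)  # push existing points down into the children

    def query(self, qx0, qy0, qx1, qy1):
        """Return all stored points inside the closed query rectangle."""
        x0, y0, x1, y1 = self.bounds
        if qx1 < x0 or qx0 >= x1 or qy1 < y0 or qy0 >= y1:
            return []  # the query misses this cell entirely
        if self.children is None:
            return [p for p in self.points
                    if qx0 <= p[0] <= qx1 and qy0 <= p[1] <= qy1]
        return [p for child in self.children
                for p in child.query(qx0, qy0, qx1, qy1)]


# Two nearby points force several subdivisions in one corner only.
tree = QuadTree(0, 0, 8, 8)
for pt in [(1, 1), (1.5, 1.5), (6, 6), (7, 2)]:
    tree.insert(pt)
print(sorted(tree.query(0, 0, 2, 2)))
```

An octree is the 3D analog, with each internal node having 8 children instead of 4.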

Indexing evolution

Indexing schemes continue to evolve.

Query processing: filter, refine

Visualizing spatial data

A variety of non-spatial attrs can be mapped on to spatial data, providing an intuitive grasp of patterns, trends and abnormalities. Following are some examples.

Dot map:

Here's another one.

Proportional symbol map:

Diagram map:

Another diagram map:

It is also possible to plot multivariate data this way.

Choropleth maps (plotting of a variable of interest, to cover an entire region of a map):

So who (else) has spatial extensions?

Everyone!

Thanks to SQL's facility for custom datatype ('UDT') and function creation ('functional extension'), "spatial" has been implemented for every major DB out there:

Google KML

Google's KML format is used to encode spatial data for Google Earth, etc. Here is a page on importing other geospatial dataset formats into Google Earth.
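To give a feel for the format, here is a small Python sketch that builds a KML document with a single point Placemark, using only the standard library. The helper name and the sample place name and coordinates are made up for this example; note that KML lists coordinates in longitude,latitude order.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def placemark_kml(name: str, lon: float, lat: float) -> str:
    """Return a KML document (as a string) with one point Placemark."""
    ET.register_namespace("", KML_NS)  # emit KML as the default namespace
    kml = ET.Element(f"{{{KML_NS}}}kml")
    placemark = ET.SubElement(kml, f"{{{KML_NS}}}Placemark")
    ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
    point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
    # KML coordinate order is longitude,latitude (altitude is optional).
    ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")


# Hypothetical sample location (illustrative coordinates only).
doc = placemark_kml("Sample campus", -118.2851, 34.0224)
print(doc)
```

Saving the resulting string to a .kml file gives a file that Google Earth should be able to open.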

OpenLayers

OpenLayers is an open source JavaScript library for displaying interactive maps in web pages.

ESRI: Arc*

ESRI is the home of the powerful, flexible family of ArcGIS products - and they are local!

QGIS etc.

There is a variety of inexpensive/open source mapping platforms, competing with more pricey commercial offerings (from ESRI etc). Here are several:
QGIS
MapBox
Carto
GIS Cloud