Much of the groundwork in spatial statistics is concerned with the description and exploration of spatial datasets. The generic term for such methods is exploratory data analysis (EDA) or, in the context of spatial and spatio-temporal analysis, ESDA and ESTDA respectively. Such methods are by no means exclusively statistical in nature, and for ESDA special forms of data mapping (i.e. visualisation) are of considerable importance; commercially available EDA or data mining tools do not generally provide spatial visualisation. This section provides a brief introduction to some of the methods that are specifically spatial in nature (ESDA) and that are supported in readily available software products.
The simplest form of EDA involves the computation of basic statistics and, in the context of spatial data, statistical summaries of attribute tables and grid values. A useful online reference on EDA is the NIST e-Handbook, referred to in the Suggested reading section of this Guide. Graphical analysis of such data tends to rely on histograms, pie charts, box plots and/or scatter plots. None of these provides an explicitly spatial perspective on the data. However, where such facilities are dynamically linked to mapped and tabular views of the data they can provide a powerful toolset for ESDA purposes. The selection of objects through such linking may be programmatically defined (e.g. all values lying more than 2 standard deviations from the mean) or user defined, often by graphical selection. The latter is known as brushing, and generally involves selecting a number of objects (e.g. points) from a graphical or mapped representation using a dragged region, generally of rectangular shape. Facilities of this type are implemented in a number of GIS packages, notably in ArcGIS V9 (with a range of tools for different data types, but limited to 300 points for selected ESDA tools such as semivariance analysis) and in the stand-alone package, GeoDa. GeoDa has been built using ArcGIS objects and reads and writes ArcGIS shape files. It limits its attention to lattice data, by which is meant discrete spatial units (zones/areas) rather than point sets or point samples from a continuous surface.
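The two selection modes described above can be illustrated with a minimal sketch. It is not taken from ArcGIS or GeoDa. The point records, the function names and the use of the population standard deviation are all assumptions made for illustration, and the "linked view" is reduced to filtering table rows by the selected ids.

```python
from statistics import mean, pstdev

# Hypothetical point records: (id, x, y, attribute value).
POINTS = [
    ("a", 1.0, 1.0, 10.0),
    ("b", 2.0, 3.0, 12.0),
    ("c", 4.0, 2.0, 11.0),
    ("d", 5.0, 5.0, 9.0),
    ("e", 6.0, 1.5, 13.0),
    ("f", 3.0, 4.0, 40.0),
]


def select_outliers(records, k=2.0):
    """Programmatic selection: ids whose value lies more than
    k standard deviations from the mean of all values."""
    values = [value for _, _, _, value in records]
    centre = mean(values)
    spread = pstdev(values)  # population standard deviation (an assumption)
    return [pid for pid, _, _, value in records
            if abs(value - centre) > k * spread]


def brush(records, xmin, ymin, xmax, ymax):
    """User-defined selection: ids of points inside a dragged rectangle."""
    return [pid for pid, x, y, _ in records
            if xmin <= x <= xmax and ymin <= y <= ymax]


def linked_table(records, selected_ids):
    """Linked tabular view: the rows belonging to the current selection."""
    chosen = set(selected_ids)
    return [rec for rec in records if rec[0] in chosen]


outlier_ids = select_outliers(POINTS)             # rule-based selection
brushed_ids = brush(POINTS, 1.5, 1.5, 4.5, 4.5)   # rectangle "dragged" on the map
brushed_rows = linked_table(POINTS, brushed_ids)  # same selection in the table view
```

In an interactive package the same set of selected ids would be passed to every open view (map, histogram, scatter plot, table), which is what makes the views linked.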
Extensions to a number of these techniques to the spatio-temporal domain (ESTDA) have recently been made available in a number of software packages. These include the STARS open source project, BioMedware’s (commercial) Space-time Intelligence System (STIS) and the National Cancer Institute’s SaTScan software, available free of charge from http://www.satscan.org/. A recent publication with a specific focus on spatial and spatio-temporal EDA is Andrienko and Andrienko’s “Exploratory Analysis of Spatial and Temporal Data”. The authors are associated with the CommonGIS Java-based software project that enables ESTDA to be delivered over the Internet; interactive Java-based examples from Andrienko et al. (2003) are included on the CommonGIS web site. They view the purpose of ESTDA as providing a data focus in which:
· peculiarities of the data can be revealed and an appreciation obtained as to how the data should be further processed (e.g. filtered, transformed, split, combined, …),
· hypotheses can be generated for further testing (e.g. using statistical methods), and
· proper methods can be selected for in-depth analysis of the data.
They make the case that space and time must be seen as complementary views of the same data. This leads to the need for systematic analysis of: (i) the evolution of spatial patterns in time; and (ii) the distribution of temporal behaviours in space. Because spatio-temporal datasets are complex and often incomplete and inconsistent, they recommend that such data are divided up and explored by slices and subsets (species, age groups, countries, years etc.), and that care is taken to examine outliers and unexpected patterns. This approach is especially important for data obtained from non-governmental sources, e.g. much of the data obtained via the Internet.
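As a rough illustration of these two complementary views, the sketch below slices a small table once by year and once by region, and lists incomplete cells for inspection. It is not drawn from CommonGIS or from Andrienko and Andrienko. The observations, region names and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical observations: (region, year, value); None marks a missing value.
OBSERVATIONS = [
    ("north", 2001, 5.0), ("north", 2002, 6.0), ("north", 2003, 7.0),
    ("south", 2001, 4.0), ("south", 2002, 4.5), ("south", 2003, None),
    ("east", 2001, 3.0), ("east", 2002, 30.0), ("east", 2003, 3.5),
]


def spatial_pattern_by_year(records):
    """View (i): one spatial pattern (region -> value) per time slice."""
    slices = defaultdict(dict)
    for region, year, value in records:
        slices[year][region] = value
    return dict(slices)


def temporal_behaviour_by_region(records):
    """View (ii): one time series [(year, value), ...] per region."""
    series = defaultdict(list)
    for region, year, value in records:
        series[region].append((year, value))
    return {region: sorted(points) for region, points in series.items()}


def missing_cells(records):
    """Incomplete data: (region, year) pairs with no recorded value."""
    return [(region, year) for region, year, value in records if value is None]


by_year = spatial_pattern_by_year(OBSERVATIONS)
by_region = temporal_behaviour_by_region(OBSERVATIONS)
gaps = missing_cells(OBSERVATIONS)
```

Each year slice could then be mapped, and each regional series plotted. An unusual value such as the hypothetical 30.0 for "east" in 2002 appears in both views and would merit closer examination.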