
The self-organising feature map (SOFM), or self-organising map (SOM), is a neural network tool developed by Kohonen (1984, 1995). SOMs are principally used to facilitate the visualisation and interpretation of high-dimensional datasets, although they may also be applied to a number of other problems in spatial analysis. Software to support the SOM is available in several GIS and remote sensing packages (e.g. Idrisi, ENVI, TNTMips), in all these cases primarily as a form of unsupervised map classification. SOM facilities are also provided in some commercial neural network software packages (e.g. the Neural Network Toolbox for MATLAB) and in the SOM Toolbox for MATLAB. The latter is available free of charge under the GNU General Public Licence from the Helsinki University of Technology, where it was developed. In this subsection we draw on the documentation associated with the SOM Toolbox for our description of the procedures involved.

The SOM itself (the output space) is formed as a regular grid of cells or units (neurons), for example a 10x8 grid (n=80 units) or a 16x32 grid (n=512 units); the grid can be much larger if desired. The choice of grid size, form and topology may be user-definable or pre-specified, depending on the software tools being utilised. As with other forms of neural network modelling, each grid cell is connected to adjacent grid cells (neurons). However, in a SOM these connections are structured according to a spatial neighbourhood relation, and this provides the key distinction between SOM and other forms of multi-dimensional analysis of this type. Grid size refers to the number of units or cells to be used, typically arranged as an approximately equal number of rows and columns. Grid form refers to the use of square or hexagonal cells (see Figure 8‑19); hexagonal cells may provide a preferable cell neighbourhood arrangement. Grid topology refers to the way in which the grid boundary is handled. The standard topologies supported are sheet (a planar grid, hence 4 edges), cylinder (a grid with one pair of opposite edges joined when computing neighbourhoods, hence 2 edges), and toroidal (a grid with both pairs of opposite edges joined when computing neighbourhoods, hence no edges).
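The effect of grid topology on neighbourhoods can be illustrated with a short sketch. The function below is not taken from any of the packages mentioned; the name `grid_distance` and its parameters are hypothetical, it assumes a square-cell grid (hexagonal grids need a different distance rule), and for the cylinder it assumes that the left and right edges are the pair that is joined.

```python
import math

def grid_distance(a, b, rows, cols, topology="sheet"):
    """Distance between units a=(row, col) and b=(row, col) on a square-cell grid.

    Assumptions (illustrative only): Euclidean distance in grid coordinates;
    'cylinder' joins the left and right edges; 'toroidal' joins both pairs.
    """
    if topology not in ("sheet", "cylinder", "toroidal"):
        raise ValueError("unknown topology: " + topology)
    d_row = abs(a[0] - b[0])
    d_col = abs(a[1] - b[1])
    if topology in ("cylinder", "toroidal"):
        # Columns wrap around, so take the shorter way round.
        d_col = min(d_col, cols - d_col)
    if topology == "toroidal":
        # Rows wrap around as well, leaving the grid with no edges.
        d_row = min(d_row, rows - d_row)
    return math.hypot(d_row, d_col)

# On a 10x8 grid, the first and last cells of a row are far apart on a
# sheet but adjacent once the left and right edges are joined.
print(grid_distance((0, 0), (0, 7), 10, 8, "sheet"))     # 7.0
print(grid_distance((0, 0), (0, 7), 10, 8, "cylinder"))  # 1.0
```

With the toroidal option, opposite corners of the grid also become near neighbours, which is why that topology is described as having no edges.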

Each grid cell unit has an associated vector, m, that provides a model of the k-dimensional input dataset. The n units thus seek to provide a model representation of the observations (e.g. multi-band source image pixels, attributes of statistical regions in a country), each of which consists of data across k dimensions (e.g. feature attributes, spectral bands). These model vectors are also known as prototype vectors or codebook vectors. They can be regarded as a set of points with coordinates reflecting those of the input space or data space. In 1-, 2- or 3-dimensional space the input space and model vectors can be visualised graphically, but with higher dimensions such visualisation is only possible by selecting subsets of the dimensions and plotting these individually.
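Because the model vectors are simply points in the same k-dimensional space as the observations, each observation can be matched to its closest model vector. The following minimal sketch assumes Euclidean distance as the measure of closeness (the text does not prescribe one); the function names `nearest_unit` and `mean_quantisation_error` and the tiny dataset are hypothetical.

```python
import math

def nearest_unit(x, codebook):
    """Index of the codebook (model) vector closest to observation x.

    Assumes Euclidean distance; other measures could be substituted.
    """
    return min(range(len(codebook)), key=lambda i: math.dist(x, codebook[i]))

def mean_quantisation_error(data, codebook):
    """Average distance from each observation to its closest model vector."""
    total = sum(math.dist(x, codebook[nearest_unit(x, codebook)]) for x in data)
    return total / len(data)

# Two model vectors (n=2) in a 3-dimensional input space (k=3),
# and three made-up observations.
codebook = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
data = [(0.5, 0.0, 0.0), (1.0, 1.0, 1.0), (1.0, 1.0, 0.0)]

print([nearest_unit(x, codebook) for x in data])  # [0, 1, 1]
print(mean_quantisation_error(data, codebook))    # 0.5
```

A lower mean error indicates that the n model vectors represent the observations more closely.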

Figure 8‑19 SOM grids

The SOM performs two functions: (i) the modelling of the source data with a smaller number of vectors that attempt to represent the source data as closely as possible (e.g. minimising some measure of deviation from the observations); and (ii) the production of a topology in the SOM grid whereby similar models are close together and dissimilar models are far apart. The resulting SOM grid is not a geographic map, but a two-dimensional representation or projection of the similarity of the models applied. Typically the SOM is displayed as a grey scale or, in some cases, a simple coloured grid (rectangular or hexagonal lattice), with light grey or similar colours indicating clusters of similar data, whilst dark or dissimilar colours indicate divisions in the underlying models (in some implementations this representation is inverted, so that darker shades indicate similar classes). The SOM is thus a form of data mining tool, which may be used for this purpose alone, or as a precursor to further analysis and data classification (with the cells' prototype vectors providing the classes). If the classification is then applied to a source spatial dataset, for example a remotely sensed image, a map of the countries of the world, or a set of point observations (e.g. medical cases and controls), the procedure can generate a coded geographic map in addition to its functional map. Other forms of non-geographic visualisation are also available, including analysis of the individual SOM unit vectors. Currently SOM procedures are provided as tools for multi-band image classification within a number of software packages, including Idrisi and TNTMips. In the following subsection (Section 8.3.3.2) we provide an example and discussion based on the TNTMips implementation, and we take a brief look at the Idrisi SOM facility.
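One way to obtain the kind of light and dark shading described above is to compute, for every unit, the average distance between its model vector and those of its immediate grid neighbours. This resembles what is often called a unified distance matrix (U-matrix), but the sketch below is only an illustration under stated assumptions: a square-cell sheet grid, 4-neighbour adjacency and Euclidean distance. The function name `neighbour_distance_grid` and the example values are hypothetical.

```python
import math

def neighbour_distance_grid(codebook, rows, cols):
    """Average distance from each unit's model vector to its grid neighbours.

    codebook[r][c] is the model vector of the unit at row r, column c.
    Assumes a sheet topology with up/down/left/right neighbours only.
    Small values suggest a cluster of similar models; large values
    suggest a division between dissimilar models.
    """
    result = []
    for r in range(rows):
        row_values = []
        for c in range(cols):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    dists.append(math.dist(codebook[r][c], codebook[rr][cc]))
            # A 1x1 grid has no neighbours, so report zero in that case.
            row_values.append(sum(dists) / len(dists) if dists else 0.0)
        result.append(row_values)
    return result

# A 1x3 grid: the first two units hold identical models, the third differs.
codebook = [[(0.0, 0.0), (0.0, 0.0), (5.0, 0.0)]]
print(neighbour_distance_grid(codebook, 1, 3))  # [[0.0, 2.5, 5.0]]
```

Mapping these values to a grey scale (in either direction, depending on the convention adopted) gives a display in which boundaries between groups of similar models stand out.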
