
CrimeStat provides a general-purpose form of clustering based on nearest neighbour (NN) distances. This form of clustering can be single-level or multi-level hierarchical (NNh), and is particularly applicable if nearest neighbour distance is believed to be relevant to the problem being considered. Events are considered to be members of a level 1 cluster if they lie within a threshold distance of one another. This threshold is the expected mean nearest neighbour distance under complete spatial randomness (CSR), plus or minus a confidence interval value obtained from the standard error, plus a user-definable tolerance. These parameters effectively define a search radius within which point pairs are combined into clusters. A further constraint can be applied, specifying the minimum number of events required to constitute a cluster. The mean centres and standard deviational ellipses for these clusters are calculated and may be saved in various GIS file formats. The mean centres are then regarded as a new point set and are subjected to the same type of clustering in order to identify second-order and, ultimately, higher-order clusters. Clearly the number of points in the initial sample and the degree of clustering have a major bearing on the way in which such clusters are identified. Figure 5‑18 illustrates the results of applying this process to the lung cancer data shown earlier.
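
The procedure described above can be illustrated with a minimal Python sketch. It is not CrimeStat's implementation: the function names, the signed multiplier `z` (standing in for the confidence interval and tolerance settings) and the use of simple connected-component grouping of point pairs are all assumptions made for illustration. The expressions for the expected mean NN distance under CSR, 0.5·sqrt(A/N), and its standard error, 0.26136/sqrt(N²/A), are the standard ones for N points in a region of area A.

```python
import math
from itertools import combinations

def csr_threshold(n, area, z=-1.645):
    """Threshold distance: expected mean NN distance under CSR plus z standard errors.
    A negative z gives a stricter (smaller) search radius. (Hypothetical helper.)"""
    mean_d = 0.5 * math.sqrt(area / n)        # expected mean NN distance under CSR
    se = 0.26136 / math.sqrt(n * n / area)    # standard error of the mean NN distance
    return mean_d + z * se

def cluster_level(points, threshold, min_size):
    """Group points linked by pairs no further apart than threshold (union-find),
    keeping only groups with at least min_size members."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]     # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) <= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return [g for g in groups.values() if len(g) >= min_size]

def mean_centre(points, members):
    """Mean centre (average x, average y) of the listed member points."""
    xs = [points[i][0] for i in members]
    ys = [points[i][1] for i in members]
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def nnh(points, area, min_size=3, z=-1.645, max_levels=5):
    """Return one list of clusters per hierarchical level; each cluster is a list of
    indices into the point set used at that level (events at level 1, then the
    mean centres of the previous level's clusters)."""
    levels = []
    current = list(points)
    for _ in range(max_levels):
        if len(current) < min_size:
            break
        clusters = cluster_level(current, csr_threshold(len(current), area, z), min_size)
        if not clusters:
            break
        levels.append(clusters)
        current = [mean_centre(current, c) for c in clusters]
    return levels
```

Calling `nnh(points, area)` on a list of `(x, y)` tuples returns the clusters found at each level; events that fall in no group of at least `min_size` members are simply left unassigned, consistent with the behaviour described in the text.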

CrimeStat also provides a variation on this clustering procedure to account for background or “baseline” variation. It describes the procedure as a risk-adjusted NNh method, or RNNh. The background data are represented as a fine grid using kernel density estimation, and this grid is used to adjust the threshold distance for clustering the original point set on a cell-by-cell basis (see also Section 5.4.3.3). Note that with both NNh and RNNh not all events are assigned to clusters: at any given hierarchical level, each point is assigned either to exactly one cluster or to none at all.
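
As a rough illustration of the cell-by-cell adjustment, the sketch below converts a background density grid into a threshold distance for each cell. It assumes that the grid (here a list of rows of density values, already produced by some kernel density estimate) is rescaled so that it sums to the number of events, and that the CSR expressions are then applied with the local intensity in each cell. The function name, its parameters and this particular scaling are illustrative assumptions, not a description of CrimeStat's internal calculation.

```python
import math

def cell_thresholds(density, cell_area, n_events, z=-1.645):
    """Per-cell threshold distances from a background density grid.
    density: rows of non-negative background density values (hypothetical input)
    cell_area: area of one grid cell
    n_events: number of events in the point set being clustered
    z: signed multiplier on the standard error (assumed confidence setting)"""
    total = sum(sum(row) for row in density) * cell_area
    thresholds = []
    for row in density:
        out_row = []
        for d in row:
            lam = n_events * d / total        # expected events per unit area in this cell
            if lam <= 0:
                out_row.append(math.inf)      # no background: no finite expected spacing
                continue
            mean_d = 0.5 / math.sqrt(lam)                 # expected mean NN distance
            se = 0.26136 / math.sqrt(n_events * lam)      # its standard error
            out_row.append(mean_d + z * se)
        thresholds.append(out_row)
    return thresholds
```

Under these assumptions, cells with a higher background density receive a smaller threshold, so events there must lie closer together before they are treated as a cluster; with a uniform background the thresholds reduce to the single global value used in the unadjusted method.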

Figure 5‑18 Lung cancer NNh clusters
