
Kernel density estimation (KDE) may also provide an informative exploratory tool for identifying and analysing hot spots and cool spots. Although KDE is not strictly a form of clustering, assigning points to cells whose density exceeds a pre-determined value does provide a form of clustering. Figure 5‑19A illustrates the use of KDE for the same point dataset as before (lung cancer cases) using a quartic (finite extent) KDE model (see also Figure 4‑45, where a Normal or Gaussian kernel has been applied to this dataset).
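The thresholding idea can be sketched in a few lines of Python. This is a minimal illustration only, not the software used to produce Figure 5‑19: the coordinates, bandwidth, grid spacing and the choice of half the peak density as the threshold are all hypothetical, as are the function names. The quartic kernel is written in its usual two-dimensional form, (3/(πh²))(1 − d²/h²)² for d < h and zero otherwise, so each event contributes a total mass of 1.

```python
import math

def quartic_intensity(x, y, points, bandwidth):
    """Quartic-kernel intensity estimate (events per unit area) at (x, y)."""
    total = 0.0
    for px, py in points:
        u2 = ((x - px) ** 2 + (y - py) ** 2) / bandwidth ** 2
        if u2 < 1.0:  # finite extent: events beyond one bandwidth contribute nothing
            total += (1.0 - u2) ** 2
    return 3.0 * total / (math.pi * bandwidth ** 2)

def density_grid(points, xs, ys, bandwidth):
    """Evaluate the intensity at every cell centre; rows follow ys, columns follow xs."""
    return [[quartic_intensity(x, y, points, bandwidth) for x in xs] for y in ys]

def hot_spot_cells(grid, threshold):
    """Return the (row, col) indices of cells whose density exceeds the threshold."""
    return {(r, c) for r, row in enumerate(grid)
            for c, value in enumerate(row) if value > threshold}

def points_in_hot_spots(points, xs, ys, hot_cells):
    """Assign each point to its nearest cell centre and keep those in hot cells."""
    selected = []
    for px, py in points:
        c = min(range(len(xs)), key=lambda i: abs(xs[i] - px))
        r = min(range(len(ys)), key=lambda j: abs(ys[j] - py))
        if (r, c) in hot_cells:
            selected.append((px, py))
    return selected

# Hypothetical event locations: three close together and one isolated.
points = [(5.0, 5.0), (5.2, 5.1), (4.9, 5.3), (1.0, 8.5)]
xs = [i * 0.25 for i in range(41)]  # cell centres from 0 to 10
ys = [j * 0.25 for j in range(41)]

grid = density_grid(points, xs, ys, bandwidth=1.5)
peak = max(max(row) for row in grid)
hot = hot_spot_cells(grid, threshold=0.5 * peak)  # threshold is an arbitrary choice
clustered = points_in_hot_spots(points, xs, ys, hot)
```

Both the bandwidth and the threshold strongly affect which points are selected, so in practice they would be varied as part of the exploration rather than fixed in advance.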

The pattern shown is largely a reflection of the distribution of population and associated infrastructure (roads etc.) in the region. The density grid illustrated has been overlain with the point set and with an ellipse showing the possible relationship between the high-density area in the south of the study area and an old incinerator plant (dot in lower left of ellipse) with its hypothetical smoke plume. The principal interest in this particular dataset was in the relationship between the incinerator location and another, far rarer, form of cancer, affecting the larynx (Figure 5‑19B).

Figure 5‑19 KDE cancer incidence mapping

A. Lung cancer incidence (controls)

B. Larynx cancer incidence (cases)

A small apparent cluster (four events located very close to the incinerator site in Figure 5‑19B) had been observed, and the research sought to establish whether this was a real or merely apparent relationship. The incidence of lung cancer was used here as a form of control dataset, on the hypothesis that these data represented an estimate of the distribution of the underlying population at risk, and assuming that there was no relationship between lung cancer incidence and incinerator location (a working assumption only); see Diggle (1990) and related papers for more details. As can be seen from Figure 5‑19, the overall pattern of the larynx cancer cases seems to follow the pattern exhibited by the lung cancer cases, i.e. to be largely a reflection of underlying variation in the at-risk population. Whether there is a real and unexpected cluster in the neighbourhood of the old incinerator is difficult to determine.
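One exploratory way to compare a case pattern with a control pattern is to take the ratio of their kernel density estimates, so that areas where cases are over-represented relative to the population at risk show ratios above 1. The sketch below illustrates that idea only. It is a different and much simpler technique than the model in Diggle (1990), and the coordinates, bandwidth and function names are hypothetical.

```python
import math

def quartic_density(x, y, points, bandwidth):
    """Quartic-kernel probability density at (x, y), normalised by the number of points."""
    total = 0.0
    for px, py in points:
        u2 = ((x - px) ** 2 + (y - py) ** 2) / bandwidth ** 2
        if u2 < 1.0:
            total += (1.0 - u2) ** 2
    return 3.0 * total / (math.pi * bandwidth ** 2 * len(points))

def density_ratio(x, y, cases, controls, bandwidth):
    """Case density divided by control density; None where no control lies within range."""
    control_density = quartic_density(x, y, controls, bandwidth)
    if control_density == 0.0:
        return None  # no information about the population at risk here
    return quartic_density(x, y, cases, bandwidth) / control_density

# Hypothetical data: controls stand in for the population at risk.
controls = [(0, 0), (1, 0), (0, 1), (1, 1), (4, 4), (5, 4), (4, 5), (5, 5)]
cases = [(0.5, 0.5), (0.4, 0.6), (4.5, 4.5)]

ratio_a = density_ratio(0.5, 0.5, cases, controls, bandwidth=1.5)
ratio_b = density_ratio(4.5, 4.5, cases, controls, bandwidth=1.5)
ratio_far = density_ratio(10.0, 10.0, cases, controls, bandwidth=1.5)
```

With so few cases the ratio is very unstable, and it is undefined wherever the control density is zero, which is one reason a visual comparison such as Figure 5‑19 cannot settle the question on its own.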

Diggle’s model and tests suggest that the cluster does appear to be significant. But he also notes the sensitivity of the model to the low number of cases: if just one of these cases is deleted, there is a reasonable probability that the result could have arisen by chance. He also notes the problem of formulating hypotheses based on examining specific apparent clusters. A wide range of comparable regions should be studied, without preconceptions, since clusters may well be observed which may or may not be associated with particular facilities (see also the earlier discussion of exploratory cluster hunting in Section 5.2.6).
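The sensitivity to a single case can be illustrated with a deliberately simplified stand-in for Diggle’s model, which is not reproduced here. Suppose case and control labels are assigned at random among all recorded events; the number of cases falling within a fixed distance of the source then follows a hypergeometric distribution, and the upper-tail probability can be computed exactly. All counts below are hypothetical and are not taken from the incinerator study.

```python
from math import comb

def tail_probability(total_points, points_near, total_cases, cases_near):
    """P(X >= cases_near) when total_cases case labels are assigned at random
    among total_points events, of which points_near lie close to the source."""
    upper = min(points_near, total_cases)
    favourable = sum(
        comb(points_near, k) * comb(total_points - points_near, total_cases - k)
        for k in range(cases_near, upper + 1)
    )
    return favourable / comb(total_points, total_cases)

# Hypothetical counts: 1000 events in total, 10 of them near the source,
# 60 cases overall, 4 of which are near the source.
p_all = tail_probability(1000, 10, 60, 4)

# The same calculation after deleting one of the cases near the source.
p_minus_one = tail_probability(999, 9, 59, 3)

print(f"all cases: {p_all:.4f}; one case deleted: {p_minus_one:.4f}")
```

In this toy setting, removing one nearby case raises the tail probability several-fold, which shows how strongly a conclusion based on a handful of events can depend on each one of them.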
