Crosstabulated grid data, the Kappa Index and Cramer’s V statistic

The creation and analysis of crosstabulated data is a familiar process in many fields and often the subject of simple statistical analysis, principally through the use of Chi-square tests. These are covered in all basic statistics texts, but for completeness we will include a brief description here, before showing how such methods may be applied to specifically spatial problems (see further, O’Brien, CATMOG 51, which describes both simple 2-way tables and more complex analysis applied to 2-way and multi-way, incomplete and asymmetric tables).

A common crosstab arrangement is a table of rows, representing distinct treatments (e.g. A=no soil improvement, B=organic manure treatment, and C=non-organic treatment) tabulated against responses (R) or outcomes (e.g. classified levels of crop production, or presence/absence of a given pest or disease; Table 5‑5). Tabulated cell values are counts of events, e.g. the number of treated plots falling into each response category. This kind of 2-way presentation of data is often analyzed to identify whether the observed frequencies in each cell are significantly different from those that might be expected under the assumption that the outcomes are independent of the treatments — i.e. the row and column classifications are independent. 

Let X be an N-row by M-column crosstabulation of frequency counts, xij, with overall total Σxij = x.. = T. The subscript dots indicate that we have summed across all values of that subscript (see Table 5‑5); summation subscripts are sometimes written with a + symbol rather than a dot, as in xi+. Now let p = pi. = xi./T be the proportion of the overall total T found in row i, and q = p.j = x.j/T be the proportion of T found in column j. The expected frequency in cell (i,j) under the assumption of independence, i.e. that the treatments do not affect the responses, is then simply Tpq.

Table 5‑5 Simple 2-way contingency table

|        | R1  | R2  | R3  | Totals |
|--------|-----|-----|-----|--------|
| A      | x11 | x12 | x13 | x1.    |
| B      | x21 | x22 | x23 | x2.    |
| C      | x31 | x32 | x33 | x3.    |
| Totals | x.1 | x.2 | x.3 | x..=T  |

The difference between the observed (O) and expected (E) frequencies measures how close the observations are to a pattern that might arise if the rows and columns were independent. To remove the sign of these differences the values are squared, then standardized by dividing by the expected frequencies, and finally summed. This provides an overall measure of difference that, under a broad range of conditions, follows the Chi-square (χ²) distribution with (N-1)(M-1) degrees of freedom. The computation of the statistic is often shown in the form:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

or

$$\chi^2 = \sum_{i=1}^{N} \sum_{j=1}^{M} \frac{\left(x_{ij} - T p_{i.} p_{.j}\right)^2}{T p_{i.} p_{.j}}, \qquad p_{i.} = \frac{x_{i.}}{T}, \quad p_{.j} = \frac{x_{.j}}{T}$$

If the computed value is large then it is less likely that the rows and columns are independent than if it is small. The probability that a value at least as large as the one computed might have arisen by chance, given the size of the table (as represented by the degrees of freedom), can be obtained from tables of the Chi-square distribution or computed using built-in functions in many software packages, including Excel. Results of this analysis are adversely affected by small counts: as a rule of thumb, no more than 20% of cells should have an expected frequency of less than 5. Where this condition is not met it may be desirable to aggregate columns or rows (if such aggregation remains meaningful), or to adopt an alternative test procedure (e.g. simulation/permutation tests).
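The calculation described above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation: the function names and the example counts are hypothetical, the table is assumed to have no empty rows or columns, and the small-count check is applied to expected frequencies, which is an assumption about how the rule of thumb is intended. Obtaining the probability itself requires the Chi-square distribution function from a statistics package and is not shown.

```python
def chi_square(table):
    """Return (statistic, degrees of freedom, expected counts) for a 2-way table.

    `table` is a list of rows of observed counts. Assumes every row and
    column total is non-zero, otherwise an expected count would be zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Expected count under independence: row total * column total / T
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    stat = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof, expected


def share_of_small_expected(expected, threshold=5):
    """Fraction of cells whose expected count is below `threshold`."""
    cells = [e for row in expected for e in row]
    return sum(e < threshold for e in cells) / len(cells)


# Hypothetical 2x2 table of counts (treatments by responses)
counts = [[10, 20], [20, 10]]
stat, dof, expected = chi_square(counts)
small_share = share_of_small_expected(expected)
print(stat, dof, small_share)
```

For these illustrative counts every expected value is 15, so the small-count share is zero and the statistic has one degree of freedom.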

Within spatial analysis this approach to studying crosstabulated datasets has been utilized to provide an insight into classified spatial datasets — either matching pairs of images that are timeslices, or remote sensing imagery and ground truth datasets. The proportion, p, of cells in image 1 that match those in image 2, is the principal measure of interest. If p is close to 1 then the two images are likely to be very similar. The detailed pattern of differences can be interpreted from the crosstabulation of the classes and/or by generating a new image in which either every classification combination is presented, or binary coding is used to indicate the location of matching/unchanged cells (0) and non-matching/changed cells (1).
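As a small illustration of the matching proportion and the binary change image, the sketch below compares two classified grids held as nested lists. The function name and the tiny example grids are hypothetical; real images would normally be read from raster files and held as arrays, and both grids are assumed to have the same dimensions.

```python
def compare_grids(grid1, grid2):
    """Return (p, change) for two equally sized classified grids.

    p is the proportion of co-located cells with the same class;
    change is a binary grid: 0 = matching/unchanged, 1 = non-matching/changed.
    """
    change = [
        [0 if a == b else 1 for a, b in zip(row1, row2)]
        for row1, row2 in zip(grid1, grid2)
    ]
    n_cells = sum(len(row) for row in change)
    n_changed = sum(sum(row) for row in change)
    return (n_cells - n_changed) / n_cells, change


# Hypothetical 2x3 classified images with classes 1, 2 and 3
image1 = [[1, 1, 2], [2, 3, 3]]
image2 = [[1, 2, 2], [2, 3, 1]]
p, change = compare_grids(image1, image2)
print(p)
print(change)
```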

The crosstabulation (or cross-classification) procedure described in Table 5‑4 results in a potentially large table, with rows representing the classification of the cells in image 1 and columns showing the corresponding classifications in image 2. Cell entries are then counts of observed combinations for co-located cells. In the context of automated classification using training data this crosstab is sometimes referred to as a confusion matrix; the diagonal elements, expressed as a proportion of the row totals, provide an indicator of the amount of confusion (100% would imply no confusion).
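The construction of such a table from two co-registered grids might be sketched as follows. The function names, class list and example grids are hypothetical, and both grids are assumed to use the same set of class labels.

```python
def crosstab(grid1, grid2, classes):
    """Count class combinations for co-located cells.

    Rows index the class in grid1, columns the class in grid2,
    both in the order given by `classes`.
    """
    index = {c: k for k, c in enumerate(classes)}
    table = [[0] * len(classes) for _ in classes]
    for row1, row2 in zip(grid1, grid2):
        for a, b in zip(row1, row2):
            table[index[a]][index[b]] += 1
    return table


def diagonal_row_percent(table):
    """Diagonal entry as a percentage of its row total (None for empty rows)."""
    return [
        100.0 * row[i] / sum(row) if sum(row) else None
        for i, row in enumerate(table)
    ]


# Hypothetical 2x3 classified images with classes 1, 2 and 3
image1 = [[1, 1, 2], [2, 3, 3]]
image2 = [[1, 2, 2], [2, 3, 1]]
table = crosstab(image1, image2, classes=[1, 2, 3])
percent = diagonal_row_percent(table)
print(table)
print(percent)
```

In this toy case class 2 is matched in every cell (100%), while classes 1 and 3 are each matched in half of their cells.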

The cross-classification table may also be used to generate an index that describes the overall (global) similarity between the two images (a form of correlation measure). The two index values that are commonly computed (e.g. in Idrisi, ENVI and similar software) are the Kappa Index of Agreement and Cramer’s V index. Both typically take values from 0 to 1, with 1 indicating perfect agreement and 0 indicating a pattern arising by chance. Negative values of Kappa are possible and occur where the proportion of matching cells is lower than would be expected by chance; Cramer’s V, being a square root, cannot be negative. Both index measures are calculated using procedures developed from the Chi-square analysis of standard contingency tables.

The Kappa Index of Agreement (also known as KIA) is of the form:

$$\kappa = \frac{O - E}{1 - E}$$

where O is the observed accuracy, or proportion of matching values (the matrix diagonal), and E is the expected proportion of matches in this diagonal assuming a model of classification independence derived from the observed row and column totals. Hence O is simply the sum of the diagonal elements divided by the overall total, T, and E is computed in a similar manner to the expected values for a Chi-square calculation, but summed over the diagonal cells only. Thus E is the sum, for each diagonal cell, of the row total times the column total divided by the overall total squared. If xij represents the observed entry in row i, column j, then row totals are given by xi., column totals by x.j, and the overall total by T = x.. (the sum of all entries). The expected proportions are obtained from the row and column proportions, pi. = xi./T and p.j = x.j/T, giving eij = pi.p.j, so that E is the sum of the diagonal values eii.
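A minimal sketch of this calculation, assuming the usual form (O - E)/(1 - E) and a square table in which rows and columns use the same classes in the same order. The function name and example table are hypothetical, and the case E = 1 (all cells in a single class) is not handled.

```python
def kappa(table):
    """Kappa Index of Agreement for a square table of counts."""
    n = len(table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # O: proportion of cells on the diagonal (matching classifications)
    observed = sum(table[i][i] for i in range(n)) / total
    # E: sum over diagonal cells of row total * column total / T^2
    expected = sum(row_totals[i] * col_totals[i] for i in range(n)) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical 3-class cross-classification table
example = [[1, 1, 0], [0, 2, 0], [1, 0, 1]]
k = kappa(example)
k_perfect = kappa([[3, 0], [0, 3]])    # all cells on the diagonal
k_opposite = kappa([[0, 3], [3, 0]])   # no cells on the diagonal
print(k, k_perfect, k_opposite)
```

For the example table O = 4/6 and E = 12/36, giving a Kappa of about 0.5; the second and third tables illustrate the values 1 (perfect agreement) and -1 (no matching cells).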

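The Kappa index may be disaggregated into class or “per category” components by examining the row-wise expectations. In its usual row-wise form the component for class i is:

$$\kappa_i = \frac{p_{ii} - p_{i.}\, p_{.i}}{p_{i.} - p_{i.}\, p_{.i}}, \qquad p_{ii} = \frac{x_{ii}}{T}$$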
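A per-category calculation might be sketched as below. It assumes the commonly used row-wise form, in which the observed diagonal proportion for class i is compared with the chance value pi.p.i and scaled by pi. minus that chance value; other software may define the per-category index differently. The function name and example table are hypothetical.

```python
def per_category_kappa(table):
    """Row-wise (per-category) Kappa values for a square table of counts.

    Returns None for a class whose denominator is zero (e.g. an empty row).
    """
    total = sum(sum(row) for row in table)
    row_p = [sum(row) / total for row in table]
    col_p = [sum(col) / total for col in zip(*table)]
    result = []
    for i, (p_row, p_col) in enumerate(zip(row_p, col_p)):
        p_ii = table[i][i] / total
        chance = p_row * p_col          # expected diagonal proportion
        denom = p_row - chance
        result.append((p_ii - chance) / denom if denom else None)
    return result


# Hypothetical 3-class cross-classification table
example = [[1, 1, 0], [0, 2, 0], [1, 0, 1]]
kappas = per_category_kappa(example)
print(kappas)
```

For this table the three classes give values of roughly 0.25, 1.0 and 0.4, showing that the overall index can conceal considerable variation between classes.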

Cramer’s V statistic is similar to the Kappa index and is derived directly from the Chi-square statistic computed for a given crosstabulation (or contingency table). As with the Kappa statistic its value is reported in the range [0,1], with the same interpretation, but in this instance the source table can be an N×M array with N≠M, so the two classification schemes do not have to be identical. The statistic is computed as:

$$V = \sqrt{\frac{\chi^2}{T \, \min(N-1,\; M-1)}}$$
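A minimal sketch of Cramer’s V, assuming the standard definition as the square root of Chi-square divided by T times the smaller of (rows - 1) and (columns - 1). The function name and example tables are hypothetical, and the table is assumed to have at least two rows and two columns with no empty rows or columns.

```python
import math


def cramers_v(table):
    """Cramer's V for a rectangular table of counts (rows x columns)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Chi-square statistic from observed and expected counts
    chi2 = 0.0
    for row, r in zip(table, row_totals):
        for obs, c in zip(row, col_totals):
            exp = r * c / total
            chi2 += (obs - exp) ** 2 / exp
    k = min(len(row_totals) - 1, len(col_totals) - 1)
    return math.sqrt(chi2 / (total * k))


# Hypothetical tables: a square one and a 2x3 one with different schemes
v_square = cramers_v([[10, 20], [20, 10]])
v_rect = cramers_v([[10, 0, 0], [0, 10, 10]])
print(v_square, v_rect)
```

In the second table each column class falls entirely within one row class, so V is 1 even though the two classification schemes have different numbers of classes.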