
# Distance, direction and spatial weights matrices

Knowledge of location also allows the analyst to determine the distance and direction between objects. Distance between points is easily calculated using formulas for straight-line distance on the plane or on the curved surface of the Earth, and with a little more effort it is possible to determine the actual distance that would be traveled through the road or street network, and even to predict the time that it would take to make the journey. Distance and direction between lines or areas are often calculated by using representative points, but this can be misleading (as noted in Section 2.1.3, Objects). They can also be calculated using the nearest pair of points in the two objects, or as the average distance and direction between all pairs of points (note the difficulties of averaging direction, however, as mentioned in Section 2.1.2, Attributes).
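
The point-to-point calculations described above can be sketched in a few lines. The example below is a minimal illustration rather than a definitive implementation: the function names are our own, the great-circle distance uses the haversine formula, and the Earth is assumed to be a sphere with a mean radius of 6371 km (an approximation; an ellipsoidal model would give slightly different results). Network distance and travel time require a road dataset and are not covered here.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: spherical Earth with a mean radius


def planar_distance(p, q):
    """Straight-line (Euclidean) distance between two (x, y) points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def planar_bearing(p, q):
    """Direction from p to q, in degrees clockwise from grid north (+y axis)."""
    return math.degrees(math.atan2(q[0] - p[0], q[1] - p[1])) % 360.0


def great_circle_distance(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_KM):
    """Haversine distance between two (latitude, longitude) pairs in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))


# Usage: a 3-4-5 triangle on the plane, and a quarter of the equator.
d_plane = planar_distance((0, 0), (3, 4))      # 5.0
b_plane = planar_bearing((0, 0), (1, 1))       # north-east, about 45 degrees
d_sphere = great_circle_distance(0, 0, 0, 90)  # about a quarter circumference
```

For lines and areas, the same functions could be applied to representative points, to the nearest pair of points, or averaged over all pairs, subject to the cautions noted above.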

Many types of spatial analysis require the calculation of a table or matrix expressing the relative proximity of pairs of places, often denoted by W (a spatial weights matrix). Proximity can be a powerful explanatory factor in accounting for variation in a host of phenomena, including flows of migrants, intensity of social interaction, or the speed of diffusion of an epidemic. Software packages will commonly provide several alternative ways of determining the elements of W, including:

- 1 if the places share a common boundary, else 0

- the length of any common boundary between the places, else 0

- a decreasing function of the distance between the places, or between their representative points, or their kth nearest neighbors
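
As a rough illustration of the distance-based options, the sketch below builds two versions of W from a list of representative points: one using an inverse-power function of distance, and one using a binary k-nearest-neighbor rule. The function names, the `power` and `threshold` parameters and the choice of inverse-power decay are our own assumptions for the purpose of the example; software packages differ in the functions and defaults they offer. The boundary-based options are omitted because they require polygon geometry.

```python
import math


def distance_matrix(points):
    """Pairwise Euclidean distances between (x, y) representative points."""
    return [[math.dist(p, q) for q in points] for p in points]


def inverse_distance_weights(points, power=1.0, threshold=None):
    """W[i][j] = d_ij ** -power, a decreasing function of distance.

    Pairs farther apart than `threshold` (if given) receive weight 0, as do
    the diagonal and any coincident points (distance 0).
    """
    d = distance_matrix(points)
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j or d[i][j] == 0:
                continue
            if threshold is None or d[i][j] <= threshold:
                w[i][j] = d[i][j] ** -power
    return w


def knn_weights(points, k):
    """W[i][j] = 1 if j is among the k nearest neighbors of i, else 0."""
    d = distance_matrix(points)
    n = len(points)
    w = [[0] * n for _ in range(n)]
    for i in range(n):
        others = sorted((j for j in range(n) if j != i), key=lambda j: d[i][j])
        for j in others[:k]:
            w[i][j] = 1
    return w


# Usage: three hypothetical places along a line.
places = [(0, 0), (1, 0), (3, 0)]
w_inverse = inverse_distance_weights(places)
w_nearest = knn_weights(places, k=1)
```

Note that the nearest-neighbor form of W need not be symmetric: in the usage example the third place has the second as its nearest neighbor, but the second place's nearest neighbor is the first.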

In Figure 2‑8, below, we show a section of the map of counties in North Carolina, USA. Two of the 100 counties have been highlighted, counties 1 and 3. An extract from a simple binary spatial weights matrix generated by the software package based on ‘Queen’s move adjacency’ (by analogy with the game of chess) is shown below for each of these counties (represented in this instance as a text list for compactness, rather than a sparse binary matrix):

1 3

19 18 2

3 5

25 23 18 10 2

Here, the first line indicates that county 1 (the more westerly of the two highlighted) has 3 neighbors, and the second line lists their IDs: 19, 18 and 2. County 3 is shown as having 5 neighbors. Four of these clearly share a boundary with it (Rook’s move); the fifth (25) appears to share only a corner (Bishop’s move; Rook’s move plus Bishop’s move equals Queen’s move) but in fact shares a very short boundary. Instead of pure adjacency, a set of spatial weights could have been computed based on the Euclidean distance separating the centroids of the various counties. In this case the matching entries for the weights matrix would have the form:

1 19 0.3290

1 2 0.3138

1 18 0.2535

3 18 0.4985

3 2 0.4077

3 23 0.1738

3 25 0.4475

3 40 0.5202

Here the pairwise connections are listed with the distances between the centroids, and entries may include counties that are not adjacent but lie within some threshold distance. Note that the distances computed in this example are not in meaningful units such as miles; arc distance must be specified to produce distances in miles.
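
To make the relationship between these text lists and the matrix W concrete, the following sketch reads the two extracts shown above and expands one of them into rows of a binary matrix. It assumes only the layout visible in the extracts (a header line giving a county ID and a neighbor count, followed by a line of neighbor IDs; and one "origin, destination, distance" triple per line). The function names are hypothetical, and the files written by any particular package may carry additional header information that is not handled here.

```python
def parse_neighbor_list(text):
    """Read 'id count' / 'neighbor ids' line pairs into {id: [neighbor ids]}."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    neighbors = {}
    for header, ids in zip(rows[0::2], rows[1::2]):
        region, count = int(header[0]), int(header[1])
        members = [int(token) for token in ids]
        if len(members) != count:
            raise ValueError(f"county {region}: expected {count} neighbors")
        neighbors[region] = members
    return neighbors


def binary_row(neighbors, region, n):
    """Row of the binary matrix W for `region`, with counties numbered 1..n."""
    members = set(neighbors[region])
    return [1 if j in members else 0 for j in range(1, n + 1)]


def parse_distance_list(text):
    """Read 'i j distance' lines into {(i, j): distance}."""
    distances = {}
    for line in text.splitlines():
        if line.strip():
            i, j, d = line.split()
            distances[(int(i), int(j))] = float(d)
    return distances


# The extracts from the text, for counties 1 and 3 of the 100 counties.
ADJACENCY = "1 3\n19 18 2\n3 5\n25 23 18 10 2\n"
DISTANCES = ("1 19 0.3290\n1 2 0.3138\n1 18 0.2535\n3 18 0.4985\n"
             "3 2 0.4077\n3 23 0.1738\n3 25 0.4475\n3 40 0.5202\n")

queen = parse_neighbor_list(ADJACENCY)
row_1 = binary_row(queen, 1, 100)   # 100 entries, three of them equal to 1
centroid = parse_distance_list(DISTANCES)
```

The list form is far more compact than the full matrix: each row of W for this dataset has 100 entries, of which only 3 (county 1) or 5 (county 3) are non-zero under Queen’s move adjacency.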

Figure 2‑8 Spatial weights computation

In such analyses it will be W that captures the spatial aspects of the problem, and the actual coordinates become irrelevant once the matrix is calculated. Note that under Euclidean measure W will be invariant with respect to displacement (translation), rotation, and mirror imaging (reflection). Also note that edge effects may be significant: North Carolina is not an isolated state but has the counties of the State of Virginia to its north. Thus there are counties adjacent to those we have highlighted that have been excluded from our analysis purely because of the way we have selected our study area. Likewise, the use of an adjacency model based on immediate neighbors (e.g. rook’s move) has a substantial effect on the resulting computations, for example on computed correlation coefficients (see Dubin, 2008, for a fuller discussion of these effects).
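
The invariance property noted above is easy to check numerically. The sketch below, which uses hypothetical coordinates and helper names of our own, computes the matrix of Euclidean distances for a small set of points and confirms that it is unchanged (to within floating-point tolerance) after translation, rotation and reflection. Any W derived solely from these distances inherits the same invariance.

```python
import math


def distance_matrix(points):
    """Pairwise Euclidean distances between (x, y) points."""
    return [[math.dist(p, q) for q in points] for p in points]


def translate(points, dx, dy):
    """Displace every point by (dx, dy)."""
    return [(x + dx, y + dy) for x, y in points]


def rotate(points, angle_deg):
    """Rotate every point counterclockwise about the origin."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    return [(x * c - y * s, x * s + y * c) for x, y in points]


def reflect(points):
    """Mirror every point in the x axis."""
    return [(x, -y) for x, y in points]


def matrices_close(a, b, tol=1e-9):
    """True if two matrices agree element by element within `tol`."""
    return all(abs(x - y) <= tol
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))


# Hypothetical centroid coordinates, for illustration only.
centroids = [(0.0, 0.0), (3.0, 4.0), (-2.0, 1.0), (5.0, -7.0)]
base = distance_matrix(centroids)

same_after_translation = matrices_close(
    base, distance_matrix(translate(centroids, 10, -3)))
same_after_rotation = matrices_close(
    base, distance_matrix(rotate(centroids, 37)))
same_after_reflection = matrices_close(
    base, distance_matrix(reflect(centroids)))
```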