﻿ Network and Location Analysis > Introduction to Network and Location Analysis > Source data

# Source data

A set of line segments or polylines, such as a road or street map, does not typically constitute a network. In order to conduct many forms of network analysis it is necessary to ensure that a set of distinct nodes or vertices (V) exist and between each vertex one or more links to other vertices are defined. These links are referred to as edges (E). At the simplest level edges may be directed (e.g. one way) or undirected (two-way) and may have one or more attributes associated with them (e.g. name, length, travel time/cost, mode of transport).

Alternatively, one can simply start with a point set in the plane representing network vertices and no edges. In this case a set of links between vertices is to be constructed, typically using straight line (Euclidean) segments that satisfy certain criteria. This can be considered as a form of network construction problem or simply a routing problem that ignores the details of any underlying fine network structure.

Many polyline spatial datasets that appear to satisfy the requirements for network analysis in fact do not. This is often because the required vertices do not exist as objects and/or the set of lines are not precisely linked — the endpoint of one polyline does not precisely coincide with the start or endpoint of other polylines in the set. Furthermore, where segments of polylines cross the nature of the intersection needs to be defined — for example: is the intersection to be regarded as a vertex?; or should we regard one line as passing over or under the other? The simplest arrangement is one that assumes all such intersections constitute valid network vertices, and that all lines whose endpoints are closer to each other than some specified tolerance value (typically a small Euclidean distance measure) are, in fact, coincident and constitute a network vertex.

With these assumptions made (or variants of these) many GIS packages will convert a set of line data into a network, i.e. effectively imposing a network structure or topology on the dataset. Edge and/or vertex labeling may form part of this process, enabling subsequent routing plans to be referenced by route or street name and flow direction. Having done so, and/or as part of this process, attributes may assigned to edges and vertices individually, across selected subsets or globally. Examples include: (i) definition of turn attributes at intersections (permitted, not-permitted by link and any penalty, such as additional time or cost associated with such a turn); (ii) definition of U-turn permissions; (iii) definition of weights (or impedances) to be associated with edges (typically transition times, often defaulted to use edge length as a surrogate) — in some cases different definitions may be applied for each direction traversed along the edge, in others the link may be modeled as two separate edges, one for each direction of travel; (iv) definition of one-way edges and their direction; (v) specification of any permanent or temporary barriers, typically at vertices — links that are barred are often simply disabled or assigned a very high impedance value; and (vi) defining demand and capacity constraint levels (edge and/or vertex based).

Once such a network has been defined, attempts at optimal routing can be made — for example finding the shortest (or fastest) path between a given pair of vertices such as an accident location and an ambulance station or stand-by point. However, as noted earlier in this Guide, street network and similar physical infrastructures do not necessarily satisfy the requirements of a full metric space, and it may be the case that some points are simply not reachable from others, or that optimal routing can only be approximated and/or remain difficult or impossible to prove to be optimal. Again, the criterion one adopts in such cases is fitness for purpose — best known approximations, which in some instances are provably within a certain closeness to optimal solutions, are often the options that have to be taken. These can then be evaluated against operational criteria (e.g. maximum acceptable time for an ambulance to arrive at an incident for 99.8% of call-outs) and the problem or criteria re-evaluated if such criteria cannot be met. One of the great advantages of many of the toolsets available is the ability to run ‘what if’ scenarios, experimenting with alternative constraints, flows, changes in demand or supply, etc., and evaluating the varying costs or times resultant on these variations. This information provides a form of sensitivity analysis, which may (for example) enable depot location decisions to be made taking into account variations in availability of suitable sites, access routes and labor, forecast temporal variations etc., within areas of broadly similar generalized costs.

With pure point-based data no pre-existing network is assumed, and problems relate to determining the form of network (set of edges) or path that connects the point set in some particular manner. This assumes that either no underlying physical network exists and travel between vertices is possible and uniform across the plane or that an underlying transport infrastructure exists and is sufficiently fine and direct to allow its specific form to be ignored for the problem at hand. This latter assumption may or may not be entirely realistic, but has the advantage of enabling difficult problems to be addressed without the need for the additional burden of large numbers of network-based computations. Finally, depending on the problem being addressed, neither point (location) nor line (network) data are required — a pre-computed inter-point distance or cost matrix may be sufficient. However, this is not generally sufficient to re-construct a unique map of the locations, nor will real-world routes chosen be represented — only the sequence of links and associated distances/costs will be available.