Research into problems of optimal facility location has a long history, with substantial progress being made from the 1960s onwards. As the range of problems tackled became broader and solution procedures required increasing use of computational power, libraries of software were developed and made available by academics, which then (directly or indirectly) were incorporated into commercial software packages. One of the first of these was the set of FORTRAN programs published (as an edited monograph and in electronic form) by Rushton, Goodchild and Ostresh in 1973. The problems addressed in this software collection included single- and multi-facility location in the plane (Euclidean metric) and on a network, together with algorithms for allocation and spatial choice modeling. At this time the single facility optimization procedures were exact, whilst multi-facility procedures utilized a variety of heuristics to obtain “good” solutions.

In the plane (i.e. with no network defined) the p-median problem is a form of point-set partitioning, similar to cluster analysis. Typically p supply or service points are sought to service n customer locations (vertices), such that each customer is allocated to a single supply point (often based on simple Euclidean proximity) and such that the total distance or cost of transport between the customers and the supply centers is minimized. If p=1 we have the simple computation of the Minimum Aggregate Travel (MAT) location, which may be determined using the iterative procedure described in Section 4.2.5, Centroids and centers. If p>1, however, the problem is more complicated and requires an extended solution procedure. Exact solution of the p-median problem in the plane is possible using branch and bound (tree-searching) algorithms. Ostresh, for example, provides such an algorithm in the FORTRAN suite referred to above (Rushton, Goodchild and Ostresh, eds., 1973). Exact solutions for larger problems and extended variants of the problem may be computationally demanding and several heuristic procedures have been devised that tests suggest will often provide optimal or near optimal solutions (with less than 3% deviation from the optimal objective function value) in many cases. Note that in planar or “free” space problems the form of the objective function can be viewed as a cost surface, whose shape may be quite complex with many small local optima. Furthermore, since sections of this surface may be relatively flat, a specific heuristic solution that is within 2% of the optimum objective function value could well be defined by a set of locations that are a considerable distance from the optimal sites. Evaluation of variation of the objective function in the neighborhood of solution points one at a time (e.g. across a grid defined within the Voronoi region of this point), can provide an insight into the form of this surface.

A simple heuristic solution for the planar p-median problem, most commonly attributed to Cooper (1963, 1964), is as follows:

Step 1: randomly select p points in the Minimum Bounding Rectangle (MBR), or preferably within the convex hull, of the customer point set, V, as the initial locations for the median points

Step 2: allocate every point in V to its (Euclidean) closest median point. This partitions V into p subsets, Vp

Step 3: For each of the p subsets of V, compute the MAT point using the iterative equation referred to above (described in Section 4.2.5, Centroids and centers)

Step 4: iterate steps 2 and 3 until the change in the objective function falls below a preset tolerance level

Clearly the quality of the solution obtained by this heuristic is not provided. One approach to improving the solution found would be to repeat the process k times with a different set of random starting points and compare the final objective value with previous runs. A better approach is to make good initial choice of the median locations, for example by some form of simple cluster analysis (e.g. the density based procedure adopted by Crimestat or a simple neighbor-based or quadrat-derived clustering). An extended version of this algorithm can be used to solve the same problem, but with capacity constraints on the supply centers. In this case allocation is more complicated and may result in some customers being served by more than one center. Experience suggests that constrained problems of this kind often yield solutions that are closer on average to the true optimum than unconstrained problems.

Examples of the development of collections of location modeling utilities include the public-domain C++ library known as LOLA (Library of Location Algorithms), Daskin’s SITATION software, and the S-Distance project developed at the University of Thessaly in Greece (a Visual Basic V6 implementation of many discrete and network location algorithms, including a wide range of heuristics, although not yet fully completed — see further, subsection 7.4.2, Larger p-median and p-center problems). Figure 7‑14 illustrates using LOLA to identify the best two cities in Italy from which to service demand from customer sites in a total of 39 cities. Each city included (network vertex) is taken as a customer site, with entries being of the form (x,y,z,n) where x and y are coordinates, z is the customer demand estimate and n is the city name — for example:

28.00 74.00 50000 [Milano]

Coupled with this information are details on the network structure and cost. In this case data are entered as adjacency lists, in the form <from,to,weight>.

Hence as Milano is location 1 and has direct connections to Piacenza, Novara and Verona (locations 34, 17 and 13) the adjacency list is of the form:

1 34 4; 1 17 3; 1 13 22

The weights in this case might be travel or transport costs or times. The geometry of the routes in this case is unimportant. The optimum solution consists of two city locations, shown here as Opt1 and Opt2 (Piacenza and Roma). This is a very simple example but serves to illustrate the ideas involved and the basic data required. In many instances the problem LOLA addresses is where to locate a new facility given that a number of such facilities already exist. New facilities typically involve an overall construction cost (cost of site plus construction etc.).

Figure 7‑14 Optimum facility location on a network — LOLA solution

The majority of GIS packages do not provide such capabilities directly, although many offer basic operations, such as locating a weighted mean center in the plane. TransCAD provides tools to add a number, m, of new facilities to an existing set (which could be empty) in a network environment, and thus has similar functionality to LOLA in this example, but here it is integrated within a GIS framework rather than linked to a separate product set. It provides tools to determine the best location to place one or more new facilities, or as many as might be required to meet a given service objective, subject to various optimization objectives (minimization of average costs, minimization of maximum costs, maximization of minimum costs). Custom-built software utilizing general purpose optimization libraries such as LEDA, CPLEX and Xpress-MP enables a wide range of problems of this type to be designed and solved in an optimal or near optimal manner. In many cases larger problems require the use of heuristic or approximation algorithms in order to obtain solutions within a reasonable time.