The Douglas-Peucker Algorithm for Line Simplification:
re evaluation through visualisation

by  © M Visvalingam and J D Whyatt, 1990
CISRG Discussion Paper Series No 8, University of Hull, 19 pp

For the sake of posterity, please cite from the published version:
Visvalingam, M and Whyatt, J D (1991) "The Douglas-Peucker algorithm for line simplification : re-evaluation through visualization" Computer Graphics Forum 9 (3), 213 - 228.

CONTENTS

Abstract
1.  Introduction   
2. 
Background
3.  Observations and Discussion
    ▪ Weeding, Cleaning and Simplification
    ▪ Differing objectives of generalisation
    ▪ Strategies for coping with unbalanced simplifications
    ▪ The concept of a fixed rank order of critical points
    ▪ Robustness of the method
    ▪ Distortion of the shape of features
    ▪ The concept of critical points
4. Conclusions  
Acknowledgments
References

 

ABSTRACT

The primary aim of this paper is to illustrate the value of visualization in cartography and to indicate that tools for the generation and manipulation of realistic images are of limited value within this application. This paper demonstrates the value of visualization within one problem in cartography, namely the generalisation of lines. It reports on the evaluation of the Douglas-Peucker algorithm for line simplification. Visualization of the simplification process and of the results suggest that the mathematical measures of performance proposed by some other researchers are inappropriate, misleading and questionable.

 

 

1.  Introduction 

  • The primary aim of this paper is to illustrate the value of visualization in cartography and to suggest that concept refinement will be assisted more by the development of graphical information systems than by systems for achieving realism. A detailed description of the evaluation of the Douglas-Peucker algorithm for line simplification demonstrates how spatial reasoning is expedited by the cross referencing of elementary graphics in different displays.

    The Report of the Panel on "Graphics, Image Processing and Workstations", sponsored by the Division of Advanced Scientific Computing (DASF) of the U S National Science Foundation (NSF), advocated the view that "the purpose of [scientific] computing is insight, not numbers" (McCormick et al, 1987, p3). It believed that "The most exciting potential of wide spread availability of visualization tools is not the entrancing movies produced, but the insight gained and the mistakes understood by spotting visual anomalies while computing" (McCormick et al, p 6).

    In theory, advances in visualization can be of immense value to cartography and its applications, especially to the rapidly developing and commercially important area of Geographical Information Systems (GIS). Cartography is the art, science and technology concerned with the exploration, interpretation and communication of information about spatially distributed phenomena and their relationships through the use of maps. Digital cartography is the technology underpinning the construction and use of computer based systems for the practice of cartography and its applications (Visvalingam, 1990). The aims and diverse scope of digital cartography were reviewed elsewhere (Visvalingam, 1990). Interactive visualization of spatial data requires that the complimentary sub fields of digital and visual mapping become integrated within Graphical Information Systems. Digital mapping is concerned with the design, population, management and performance of large integrated spatial databases while visual mapping is increasingly concerned with the visual exploration of patterns and relationships in spatial data and the graphic articulation and communication of relevant data, distilled information and ideas.

    Whilst many of the developments in computer graphics hardware and software will stimulate new directions of growth in cartography, tools for achieving realism will benefit relatively few cartographic applications. Indeed, the process of adding calculated realism to abstract descriptions of objects and scenes is contrary to the aims of cartography. The latter employs a variety of transformational processes to derive abstractions about reality and describe these through graphic symbolism. Like mathematics, the formal language of cartography employs symbolic notation.

    The generalisation of information involves one such set of transformational processes (Robinson et al, 1984). Generalisation is concerned with the creation of simplified representations of data. The map is a graphic précis of reality. Often, it is a grossly reduced representation of reality and it is insufficient merely to scale down the graphic representation since this would result in clutter and detract from the map's primary role as a communication medium. Cartographic generalisation includes processes, such as selection, simplification, symbolism, classification, displacement and exaggeration. Cartography is more concerned with discovering and communicating the quintessence of reality rather than with achieving a photographic record of the real thing. It is often used to communicate entities which exist by virtue of dictum, such as boundaries, and to explore and understand concepts and phenomena which are not directly observable; i.e. to see the unseen. It is pertinent to point out that air Photographs are not classed as maps, but that rectified photographs, to which names, symbols, grid lines and/or mathematical information have been added, are accepted and defined as photomaps (ICA, 1973, p 315) since these additions alter and guide our perception of the image.

    Consequently, Sasada's (1987) work on the "realism of drawing" has a great deal more value and relevance to cartography than the advances in realistic rendering of objects and scenes. Although cartography is cited as one application in McCormick et al (1987, p A-8), the accompanying SIGGRAPH videotape (Herr and Zaritsky, 1988) is largely concerned with the technology behind the entrancing movies which have been produced. There is no doubt that visualization can stimulate insight and understanding; but, relatively basic computer graphics is quite adequate for this purpose. It is much more important that the technology is appropriate for the task in hand. Spatial data tends to be multidimensional and the step up from 2D to 3D visualization is not sufficient to overcome the Problems of visualizing multidimensional data.

    Visvalingam and Kirby (1984) and Visvalingam (1985) had previously suggested that developments in workstation technology, concurrency, and window management systems based on UNIX, offered some prospects for overcoming the problems of hyper dimensionality and provided a detailed account of how cross referencing elements in a set of displays led to insights about socio spatial data. What is needed for data exploration is something akin to, but a great deal more sophisticated than, hypertext for cross-referencing graphic depictions of data in a flexible manner. This requires an integration of the technologies of computer graphics, human computer interaction and database methods.

    In this paper, we describe another application which requires relatively unsophisticated graphics but which would benefit from a more convenient environment for visual exploration of data. Manual generalisation of maps, like many other cartographic processes, is a skilled activity which relies upon subjective and intuitive decisions. Hitherto, research into the automation of cartographic processes has been directed largely at the development of heuristics for sub-tasks in map generalisation, map design, name placement, map compilation and so on. Given appropriate visualization tools (3D visualization appears to be inappropriate for this purpose) many more case studies, such as the one described here, could be undertaken to further knowledge and understanding and theory formulation.

    The case study reported here concerns the evaluation of the Douglas-Peucker algorithm (1973), which is widely used for line simplification in cartography. The first section of this paper introduces the problem and the algorithm and reviews some previous evaluations by other researchers. A systematic evaluation of the algorithm is presented in the second section, which also questions interpretations of the results of mathematical evaluations. The concluding section provides a summary of our observations.
  • 2.  Background

    Automated generalisation remains a research goal. Research effort to date has focused upon specific problems. This paper is concerned with one of these problems, which has attracted a great deal of research attention, namely the simplification of isolated lines stored in vector format. Line simplification may be defined as the process of removing unwanted detail from lines, and is often necessary when detailed data, captured at large scales, are to be viewed at reduced scales or on small format screens. It is useful for reducing display times, which is highly desirable in interactive applications, and also for removing perceived clutter from
    lines, which is essential for effective visual communication. In thematic mapping simplification may be used to diminish background information, such as coastlines and administrative boundaries, in order to emphasise foreground themes such as the distribution of some population or environmental characteristic.

    For the purposes of this paper, line simplification algorithms may be classed into one of two types. Filtering algorithms retain a subset of the original points whereas other algorithms, including smoothing algorithms, seek to "fit" a smaller set of new points to the original set. The Douglas Peucker algorithm (Douglas and Peucker, 1973) remains one of the most widely used filtering algorithms today. It is the only line simplification algorithm supported by certain mapping packages including MAPICS (MAPICS Ltd., undated) and GIMMS (Waugh and McCalden, 1983), and by on line tutorials such as the Geographical Information Systems Tutor 'GIST!' (Raper and Green, 1989). Often, it is the only line simplification algorithm to be described in textbooks concerned with the subject of computer cartography.

    The algorithm functions in the following manner. The start and end points of the line are connected by a straight line segment. Perpendicular offsets for all intervening points are then calculated from this segment, and the point with the highest offset is identified. If the offset of this point is less than some pre-defined tolerance, then the straight line segment is considered adequate for representing the line in simplified form. Otherwise, the point is selected, and the line is subdivided at this point of maximum offset. The selection procedure is then recursively repeated for the two parts of the line until the tolerance criteria is satisfied. Selected points are finally chained to produce a simplified line. Ballard's (1981) strip tree representation of lines is based on this recursive subdivision of lines. For a more detailed description of the algorithm, see Whyatt and Wade (1988).

    Different approaches to the evaluation of line simplification algorithms have been adopted over the years. Workers such as Marino (1979) and White (1983) have studied the relative merits of different algorithms in perceptual terms, whilst workers including McMaster (1983, 1986, 1987a, 1987b) and Muller (1987b) have evaluated algorithms using mathematical
    metrics. Their observations with respect to the Douglas-Peucker algorithm are reviewed below.

    Marino (1979) observed that cartographers and non cartographers alike displayed a high level of agreement on the selection of perceptually important points on lines. Her observations corroborated Attneave's (1954) proposition that information about a line was concentrated around points of greatest angular deviation, which were called critical points. Later, White (1983) compared lines produced by simplification algorithms with lines that had been simplified manually. Her results indicated that the Douglas-Peucker algorithm produced lines which appeared perceptually most similar to those that had been simplified manually. White also discovered that the Douglas-Peucker algorithm was most effective at selecting critical points; the number of critical points retained being a measure of the perceptual quality of the line. 45% of the original set of points were selected by both the study participants and the Douglas-Peucker algorithm. Her conclusion, that the Douglas-Peucker algorithm was most effective at selecting critical points, suggests that the algorithm was most effective at selecting points along the source line that marked significant changes in direction. Williams (1987) also believed that the algorithm produced good caricatures of lines, because it tended to include maxima and minima.

    McMaster (1986) developed thirty mathematical measures in order to evaluate different line simplification algorithms. In order to test for redundancy within the thirty measures, McMaster (p 106) used the Douglas-Peucker algorithm because he considered it to be "the most cartographically sound". In a more recent paper (1987a), he evaluated nine line simplification algorithms using six of the original thirty measures. The Douglas-Peucker algorithm performed consistently well at all levels of simplification. This prompted him finally to claim that the algorithm was both "mathematically and perceptually superior" (McMaster, 1987b, p108). Mathematically, it produced the least areal and vector displacement from the original line, and best preserved the angularity of the original line, whilst perceptually it tended to select critical points closest to those selected by humans.

    Many of the above perceptual and mathematical evaluations were based on relatively simple test data; but, their conclusions have prompted other workers (Buttenfield, 1986, Jones and Abraham, 1987) to apply the Douglas-Peucker method to purposes other than simplification. Jones and Abraham, for example, attempted to use the Douglas-Peucker algorithm to distribute points into scale related levels, in order to create a scale independent database. Williams (1987) produced algorithms for ensuring that the relative areas of polygons, whose boundaries had been subjected to Douglas-Peucker simplification, corresponded to the original proportions.

    Our re-evaluation of the algorithm was prompted by the visually unacceptable results obtained from the method when applied to various sections of the British coastline. Other workers have already expressed dissatisfaction with the algorithm (Morrison, 1975, Dettori and Falcidieno, 1982, Van Horn, 1985, Monmonier, 1986, Muller, 1987a, Thapa, 1988). For example, the problem of a line crossing itself on simplification has either been ignored, since the knots are not detectable at small scales, or manually corrected. It has also been noted that the method tends to preserve spiky details.

    Muller (1987b) compared seven algorithms using the fractal dimensionality of lines as a statistical measure of line complexity. He concluded that the fractal dimensionality of a line was best preserved by the Douglas-Peucker algorithm after simplification. However, the simplified line appeared very spiky and caricatural. Muller, therefore, questioned McMaster's conclusions. The angularity of the original lines was best preserved by this algorithm, but since the angularity measure was strongly influenced by the presence of spikes, the preservation of angularity could not be regarded as a good indicator of the quality of simplification. Muller, like McMaster, also noted that the algorithm minimised the total areal displacement between the simplified line and the original line, but questioned the value of McMaster's metric, since it could not be used to determine whether an algorithm was capable of preserving the geometric shape of the original line. Two entirely different geometric shapes could result in the same amount of overall displacement (Muller, 1987b, p 31). Muller's use of fractal dimensionality as an evaluative measure is equally questionable, given his own admission that traditional cartographers preserve neither statistical self similarity nor fractal dimensionality when generalising lines.

    Monmonier (1986) and Thapa (1988) proposed that the Douglas-Peucker algorithm could only be successfully applied when the reduction in scale was modest. Dettori and Falcidieno (1982), Deveau (1986) and Thapa (1988) have recently produced their own line simplification algorithms. Thapa moved away from the concept of retaining perceptually critical points in order to preserve the character of a line. He argued that the retention of all critical points resulted in spikes in the simplified line, and that whilst critical point detection is important, not all such points should be retained if the simplified line is to appear "smooth, uncluttered and aesthetically pleasing" (p 516). Thapa's algorithm therefore seeks to preserve the general shape of the line, rather than the critical points.

    Although proponents of alternative algorithms and other workers have expressed specific reservations, as discussed above, we are unaware of a systematic and detailed analysis of the merits and the deficiencies of the Douglas-Peucker method. This is necessary since no single line simplification algorithm can satisfy all requirements. The Douglas-Peucker algorithm has been deemed cartographically sound by some prominent cartographers, at least when used with their test data. It cannot therefore be dismissed without a systematic evaluation and demonstration of its properties using more complex test data from widely used spatial databases.

     

    3.  Observations and Discussion

    Weeding, Cleaning and Simplification

    The Douglas-Peucker algorithm was initially used as a 'weeding' algorithm to remove superfluous points from line data captured by stream digitising. Even at the input scale, a tolerance factor corresponding to half the width of the digitised line may be used to define a single tolerance band. As shown in Figure 1a, all points falling within this tolerance band may be discarded without loss of information. This is a reasonable proposition, and is based on earlier work by Lang (1969). This method of weeding also removes switchbacks on lines located within the tolerance band (Figure lb), but not spikes extending outside the band (Figure lc). The method is useful for eliminating much unwanted detail, including psychomotor errors (Jenks, 1981), but is insufficient for cleaning digitised data. Whereas spikes extending outside the tolerance band are retained, other types of broader variations which occur within the band are eliminated when they could be recording bends in the original line (Figure ld). Here, the line is being simplified at its original scale. When displaying these weeded lines at progressively reduced scales, tolerance is derived as a function of map scale and device resolution, but the behaviour of the algorithm remains consistent.

     

     

     

    The algorithm is sensitive to the orientation of geometric features. It will retain spiky detail but will tend to over-generalise variations within the tolerance band as demonstrated by Van Horn (1985). Figure 2a depicts an unfiltered section of the coastline around Carmarthen Bay (Dyfed, Wales). Figure 2b depicts the same coastline as simplified by the Douglas-Peucker algorithm. It can be seen that the creeks on the northern coast of the peninsula (extreme right) are retained at this level of simplification. The problem of unbalanced simplifications becomes progressively worse with increasing values of tolerance. Figure 2c depicts the same coastline in grossly simplified form. Much of the coastline has been over generalised, but the creek area has still been preserved in some detail.

    The detailed character of the line can be maintained through weeding when scale reduction is modest. With progressive reduction, such detail becomes unsightly and impedes communication and must therefore be eliminated by abstraction. Figure 2d shows a manually generalised version of the the Same coastline. Data for Figures 2a and 2d were independently digitised by different agencies from different source maps. As Jenks (1981) observed, digitising is error prone, and is in itself a generalising process. These Figures are therefore included to illustrate the type, rather than the accuracy of simplification. Thus, McMaster's criterion of preservation of angularity and minimal vector and areal displacement are good measures for retaining the detailed character of the line, i.e. for weeding, but are poor measures of simplification for caricatural generalisation.

    Monmonier (1986), Thapa (1988) and other workers have suggested that the algorithm may be successfully applied when scale changes are modest, but that it is inappropriate when scale changes are great. Since this degradation of acceptability is not related to any variation in the inherent behaviour of the algorithm it must be related to changing objectives and expectations.

     

    Differing objectives of generalisation

    Jenks (1918) classified simplified representations into four categories based on three perceptual thresholds whose existence he verified by psychophysical testing. Applications, such as contour mapping, require simplified lines to appear very similar to the original, and are therefore subjected to relatively low levels of generalisation. Thematic maps can stand higher levels of simplification, but the maps must be recognisable, even if they are perceived as different from the original. Cartograms and newsmaps can be subjected to even further simplification as long as the maps remain recognisable. Cartograms (see Cuff and Mattson, 1982) often preserve only the topological relationships. Maps generalised beyond recognition should not be used. Figure 3 was produced from less than 1% of the original source data. The fact that this distinctive coastline is still recognisable may be construed by some to indicate that the Douglas-Peucker algorithm produces satisfactory results even at very high levels of simplification. However, recognition is partly based on a priori knowledge, context and expectations. The fact that the results are recognisable does not imply that they are cartographically sound.

    Jenks' classification of simplified representations suggests that the objectives, and not just the degree of simplification, are variable. McMaster's mathematical measures of simplification and Muller's measures of self similarity and fractal dimensionality may be applicable to the evaluation of weeding algorithms and Jenks' class of minimally simplified maps. Alternative measures have to be formulated for other classes of more abstract simplifications. Current applications of these statistical measures do not distinguish between inadvertent and deliberate departures from the original line.

    Strategies for coping with unbalanced simplifications

    Van Horn (1985) and Buttenfield (1986) proposed alternative strategies for correcting the unbalanced simplifications resulting from use of a single tolerance value with the Douglas-Peucker algorithm. Van Horn succeeded in retaining some detail along what had previously been over-generalised sections of coastline by pre-processing the points to reduce their precision. His strategy involved moving points to the nearest corner of a grid cell, the resolution of which was tied to that of the display device and the scale of display. However, this strategy would theoretically worsen, rather than improve, deeply fretted and dense sections of coastline. Buttenfield (1986) attempted to identify different types of geomorphic features using statistical measures derived from Ballard's (1981) strip tree representation of lines. Ballard's work in turn was influenced by Peuker's (1975) method. She concluded that it was possible to recognise geometrically different types of lines, but that it was not possible to identify geomorphic features with any degree of certainty. She suggested that the segmentation of a line into its constituent geometric types could facilitate the use of a set of suitable tolerance values to achieve more balanced simplifications at given scales. In our opinion, it would also be necessary to identify complex lines which could not be simplified satisfactorily by use of the Douglas-Peucker algorithm. Figure 2c demonstrates that the mere increase of tolerance values is insufficient to produce a satisfactory generalisation of such complex lines.

     

    The concept of a fixed rank order of critical points

    Use of the Douglas-Peucker algorithm assumes that, both geometrically and perceptually, the most important points are those which have the highest offset values, and that the least important points are those which have the lowest offset values. Previous implementations of the Douglas-Peucker algorithm have not facilitated the evaluation of this proposition. Most have tended either to select or reject points using tolerance values to terminate the recursive subdivision of a line. The *GENERAL command in the GIMMS mapping package (Waugh and McCalden, 1983) is an improvement in that it allows the user to assign points to nominal, scale related classes. However, such classes are inadequate for the purposes of evaluation. Scope for detailed evaluation was provided by Wade's implementation (Whyatt and Wade, 1988), which associates with each vertex the perpendicular offset value which led to its selection. This overcomes one of the major criticisms of the algorithm, namely that it is computationally demanding at run time. With database machines, lines could be filtered or simplified at source, reducing retrieval time. Also, the recording of offset values with vertices encouraged the use of visualisation techniques in the evaluation of the algorithm as described below.

    Figure 4a graphically compares the offset values recorded with each vertex along the stretch pf coastline on Figure 4b. If graphical information systems of the type described by Visvalingam (1985) were available, then it would have been relatively easy to cross reference data in these two displays in order to clarify and refine concepts. Instead, cross referencing was done manually. The largest offsets values do not necessarily refer to the points which are perceptually most significant. Points which Attneave (1954) would have identified as critical in the Sunk Island region (points A and B on Figure 4b), have very low rankings compared to others, such as point 24 which occurs along a relatively smooth section of the Lincolnshire coast. We will return to this example later. Thus offset values are not reliable indicators for producing an unequivocal ranking of points with respect to perceptual significance.

    Although the relationship between scale and the number of points per feature is still not clear, lines tend to be represented by a smaller number of points at reduced scales (Topfer and Pillewizer, 1966). The Douglas-Peucker method assumes that the rank order of points does not change with scale. Points used for small scale representations are always a subset of those used in larger scale representations. This assumption is invalid in certain situations, for example, along gently curving lines. On such lines, manual generalisations may result in different points being selected at different scales. Figures 5a and 5b show 25 (30%) and 33 (40%) points selected manually from a set of 85 points. Compare these figures with algorithmic selection of the same number of points in Figures 5c and 5d respectively. Different points represent the stretches between A to A', B to B' and C to C'. Note also the consequences of retaining points P and 0 shown on Figure 5d. The retention of P results in the omission of the perceptually critical point, X, and the selection of Y instead. Although the later inclusion of neighbouring points reduces the information value of point 0, the algorithm is unable to drop this point. Figures 5e and 5f show the differences between Figures 5b and 5d. These figures indicate how the retention of points selected at higher levels of the hierarchy limit the scope for adjusting the shape of the simplified line. Thus, the concept of a fixed rank order of critical points is not universally applicable, and the Douglas-Peucker rank ordering of points does not necessarily coincide with human perception of significance.

     

     

    Robustness of the method

    Since the algorithm is sensitive to the orientation of features with respect to the anchor and floater lines, we decided to test its robustness. The coastline of Figure 4b was progressively subdivided into two, four and eight segments and the offset values of vertices in each subdivision were recorded. The offset values of vertices on subdivisions of the line are plotted against offsets on the original line in Figures 6a-c. Here again, a generalised facility for cross referencing and exploring the distribution of selected points and features on Figures 4 and 6 would have provided greater incentive for formulating and testing hypotheses about the generalising process.

    On Figure 6a-c, many points have identical values, and are unaffected by the segmentation of the line. The method is fairly robust in the sense that relatively few points change rank within the hierarchy. With the coastline of Humberside, only two, twelve and sixteen percent of the total 2226 points changed offset values when the line was subdivided into two, four and eight segments respectively. The stable points tend to be those with the lowest offsets. It appears that the algorithm rapidly locates the same extreme points and produces identical solutions from then on.
     

     

    Segmentation of the line has the greatest impact upon the initial points selected, i.e. those with higher offsets. Hence, at higher levels of simplification, the same section of coastline can appear somewhat different if the positions of the initial anchor and floater points are altered. These terminal points tend to coincide on islands and holes represented by closed loops. Smaller islands are usually eliminated using other rules (e.g. area), but the position of the anchor/floater point can affect the generalised shape of relatively narrow but long islands, as depicted in Figures 7a-d.

     

    The rank order of critical points is questionable. Also, the argument in favour of this algorithm based on the proposition that this "global holistic routine considers the line in its entirety while processing" (McMaster, 1987b, p95) appears to be misleading. The strip tree concept (Ballard, 1981) and Figure 6 indicate that lines may be manually split at critical points without consequence. However, arbitrary segmentation of lines can cause problems. For example, the Market Analysis division of CACI (personal communication) found in 1983 that the boundary files for administrative areas in Great Britain (digitised by the Department of Environment (DoE) and the Scottish Development Department (SDD) and distributed then by SIA Inc.) produced sliver polygons between the Scottish Regions and between Scotland and England. The Douglas-Peucker algorithm had been used with GIMMS to generalise the same bounding lines twice, independently. As depicted in Figure 8, the hypothetical national boundary line would have been generalised between nodes A and B in Scotland, and between nodes C and D in England. This resulted in two slightly different sets of points being selected to represent the same boundary, which created sliver polygons. Had the boundary been split at all significant nodes (namely A, B, C and D) this problem would not have arisen. Thus, differences in the selection of even a few initial points can cause problems when integrating weeded cartographic files.

     

    Distortion of the shape of features

    Although McMaster's conclusions commend the Douglas-Peucker algorithm as mathematically superior because it exhibits least areal displacement, visual observation shows that the algorithm distorts the shape of the coastline on his own, relatively simple data set (see Figure 9a). The original line consisted of 40 points. A subset of 16 points was selected by McMaster (1987b) using the Douglas-Peucker algorithm (see Figure 9b). Figure 9c depicts the line as simplified by the authors using a different procedure. This simplification, which also consists of 16 points, preserves more satisfactorily the general shape of the coastline.

     

    Williams' (1987) proposition that the Douglas-Peucker method produces good caricatures of lines because it retains maxima and minima is true only for simple lines. Because high offsets can be located on major or minor features, retention of points with high offsets can lead to some distortion of shapes as depicted in Figure 10a. The Sunk Island region has become truncated. The worst case of shape distortion results from the line crossing itself. Figure l0b depicts an example of this at the head of the Humber Estuary. This violates the most important rule in generalisation, namely that the logical and topological relationships must be preserved. It has been suggested that the clumsy appearance of Douglas-Peucker simplified lines could be enhanced by smoothing operators. However, such operators can correct neither unbalanced simplifications nor distortions in shape.

     

    The concept of critical points

    Advocates of the Douglas-Peucker method have not only conferred the term 'critical points' to the vertices selected by the method, but they have also endowed these points with the meaning which Attneave (1954) previously ascribed to perceptually significant points. A 45% overlap of points in common between manually and digitally simplified lines does not justify Whites conclusion that there was a high degree of correspondence. Instead, it indicates that there was a lack of consensus in over half the points selected by the algorithm. The designation of the term, "critical points", to two partially overlapping sets makes it ambiguous and confusing. The resulting confusion is reflected in Thapa's statement (1988, p516) that "some of the critical points which are likely to cause spikes in the generalised lines must be eliminated if the generalised lines are to be smooth, uncluttered and aesthetically pleasing", which implies that some critical points are not so critical after all.

    Points selected by the Douglas-Peucker algorithm are not always critical. Manual generalisations take into account the relative importance of features. This is partly dependent upon the purpose of the map. Even if we ignore such variable factors, manual generalisations tend to preserve the shape of geometrically larger features at the expense of smaller ones. Critical points on the coastline of Figure 11a are those which define the shape of the larger (more important) feature. The minor bay would be removed by a traditional cartographer in this example (see Figure 11b). Generalising the Same coastline using the Douglas-Peucker method would result in a simplification similar to that depicted in Figure 11c. The shape of the coastline has become distorted. The reason for this is that extreme points from the anchor/floater lines need not be located on points which define larger features, and in this instance, a point on the minor feature has been selected early on in the simplification process. Once selected, this point cannot be subsequently dropped. The resulting simplification is sub-optimal.

    4. Conclusions

    The mathematically elegant algorithm for line simplification described by Douglas and Peucker (1973) has been widely accepted and used within digital cartography. Advocates of alternative algorithms have started to express some anecdotal criticisms of the method in recent years. The mathematical evaluation of a number of competing algorithms has led McMaster to conclude that the Douglas‑Peucker method is mathematically and perceptually superior (McMaster, 1987b). This paper has described some of the observations made during a detailed visual investigation of the behaviour of the algorithm when applied to relatively complex data, and has also examined some of the implications of its use. Our observations may be summarised as follows:‑

    1. The importance of a point is dependent upon its position relative to the currently active anchor‑floater line. No account is taken of the relative importance of the geometric, let alone substantive, feature on which the point is located, nor of its location vis‑a‑vis its neighbours. 
       

    2. It is assumed that points may be ranked in an unequivocal manner. The possible impact of scale on the ranking of points is ignored. The ranking is based on the magnitude of offset values. The concept of such a scale‑independent ranking of points is questionable. 
       

    3. Although the method is sensitive to the orientation of the initial anchor‑floater line, i.e the spatial sample, it is relatively robust. Relatively few points identified at lower levels of the hierarchy are subject to a change in status. Once common extreme points are located, the ranking of points becomes consistent. 
       

    4. Scale‑related line simplifications are produced by altering the magnitude of a tolerance factor, which is used as a filter. Only vertices with offset values in excess of the tolerance factor are retained. 

      Several workers have observed that the use of a single tolerance factor results in an unbalanced generalisation. Van Horn (1985) and Buttenfield (1986) have suggested corrections to the method. These corrections can at best be regarded as fixes since the method, which is point rather than feature orientated, is sensitive to the orientation of features and can distort the shape of complex lines, especially those with spiky detail.
       
       

    5. Even critics of the method have suggested that the method performs well at modest levels of simplification or scale reduction. This incorrectly implies that the behaviour of the algorithm is different at modest compared with gross levels of simplification. It appears that the quality of acceptability is partly influenced by the limits of human perception. 

      As described by Jenks (1981), our expectations of the results of line simplification also depend upon anticipated uses, which may or may not be dependent upon scale. The Douglas‑Peucker algorithm is only capable of varying the degree of one type of simplification. With increasing simplification, the inherent weaknesses of the method become more visible. These weaknesses suggest that the
      method is not even optimal for cleaning data during weeding. Since the behaviour of the algorithm is invariant, the method cannot produce the type of caricatural generalisation which relies upon the elimination of features, which typify highly generalised lines. 

      Generalised, as opposed to weeded lines, tend to omit completely certain types of features, especially minor ones. This inevitably leads to some displacement of the simplified line relative to the original. Consequently, the high performance of the Douglas‑Peucker algorithm
      on mathematical evaluations (as described by McMaster) may be interpreted as being indicative of its relative merits as a weeding algorithm, but not necessarily as evidence of its superiority as a generalising algorithm.   

     
    1. It could be argued that despite its weaknesses, the algorithm produces recognisable shapes. However, given a line with distinctive character, even computationally cheap algorithms can yield recognisable shapes. Figures 12a and 12b depict the coastline of Humberside as simplified by the Douglas‑Peucker algorithm and the n'th point algorithm. The selection of every n'th point is known to be extremely simplistic. Yet, at this level of simplification (10% points retained) it produces an easily recognisable coast. It is only at grosser levels of simplification that significant differences hit the eye. We conclude from this that the ability to recognise a profile is an inadequate measure of the success of a line simplification algorithm. In our evaluations, we did not rely on passive visual assessment of the algorithm's product. Instead, alternative visualizations of the same data were contrived and cross­referenced to pursue hypotheses and draw conclusions about the process, its underlying assumptions and their implications. 
       

    2. Distinctive profiles do not test the skill of the caricaturist. A great deal of research output in the field of line simplification is questionable owing to the simplistic nature of the test data used. A large volume of line data is now available in digital form. In Britain, the DoE/SDD captured boundaries of the hierarchy of administration areas is now in the public domain (Whyatt and Wade, 1988). Also a much greater variety of topographic data, captured by the Ordnance Survey of Great Britain, is available for research (Visvalingam and Kirby, 1987). One of the main aims of our research is to identify and publicise a more exacting set of test data for use in evaluative studies.
       

    3. Several workers, for example Edwards (1975) and Irvine (1979) have pointed out that objective statistical measures are not necessarily superior, and may indeed be very misleading. Visvalingam and Kirby (1984) and Visvalingam (1985) illustrated the much greater scope for concept refinement through visualisation. This paper has further illustrated the role of visualisation, here in the evaluations of an algorithm.

    The Douglas‑Peucker algorithm is just one of several algorithms being evaluated by the Cartographic Information Systems Research Group. It is pertinent to note that similar ideas based on local maxima (peaks) and minima (pits) were used by Peucker and Douglas (1975) to derive a generalised primal sketch of terrain. Their work is still cited and used in the field of terrain modelling (see for example the paper by Falcidieno and Pienovi, 1990). Thus, the observations made in this paper are equally relevant to scale‑related visualization of 3D data.


    Acknowledgments

    Completion of this work was assisted by the award of an SERC Quota Studentship to Duncan Whyatt. We are grateful for this. We also wish to record our thanks to Drs. Graham Kirby and Mike Turner of the Cartographic Information Systems Research Group (CISRG) of the University of Hull for their comments on the paper, to Phil Wade also of the CISRG for his implementation of the Douglas-Peucker algorithm, to Ian Jenkinson for his preliminary work on the evaluation of the algorithm as part of his Final Year project in the Department of Computer Science, to Steve Wise of the South West Universities Regional Computer centre (SWURCC) for Information providing access to the DoE/SDD boundary files at SWURCC, to the Ordnance Survey for access to 1:625 000 digital data and to Robert McMaster for permission to use an adapted version of one of his Figures in our paper.

    References

    Attneave, F., 1954, 'Some informational aspects of visual perception', Psychological Review 61 (3), 183-193.

    Ballard, D.H., 1981, 'Strip Trees: A Hierarchical Representation for Curves', Communications of the Association for Computer Nachinery 24 (5), 310-321.

    Buttenfield, B.P., 1986, 'Digital Definitions of Scale Dependent Line Structure', in M. Blakemore (ed) Proc. Auto Carto London (ICA, 1986), 497-506.

    Cuff, D.J. and Mattson, M.T., 1982, Thematic Maps: Their Design and Production, (Menthen, London), 50-52.

    Dettori, G. and Falcidieno, B., 1982, 'An Algorithm for Selecting Main Points on a Line', Computers and Geosciences 8, 3-10.

    Deveau, T. J., 1985, 'Reducing the number of points in a plance curve representation', Proc. Auto Carto 7, Washington, D.C., 152-160.

    Douglas, D.H. and Peucker, T.K., 1973, 'Algorithms for the reduction of the number of points required to represent a digitised line or its caricature', The Canadian Cartographer 10 (2), 112-122.

    Edwards, J., 1975, 'Social Indicators, Urban Deprivation and Positive Discrimination', J. Social Policy 4, 275-287.

    Falcidieno, B. and Pienovi, C. (1990) 'Natural surface approximation by constrained stochastic interpolation', Computer Aided Design, March 1990, in press.

    Herr, L. and Zaritsky, R., 1988, Visualisation : State of the Art, ACM/SIGGRAPH Video Review (30).

    International Cartographic Association., 1973, Multilingual Dictionary of Technical Terms in Cartography (Steiner, Wiesbaden).

    Irvine, J, Miles, I, and Evans, J., 1979, De-mystifying Social Indicators (Pluto, London).

    Jenks, G.F., 1981, 'Lines, Computers, and Human Frailties', Annals of the Association of American Geographers 71 (1), 1-10.

    Jones, C.B. and Abraham, I.M., 1987, 'Line Generalisation in a Global Cartographic Database', Cartographica 24 (3), 32-45.

    Lang, T., 1969, 'Rules for the robot draughtsmen', Geographical Magazine 42 (1), 50-51.

    MAPICS Limited, undated, MAPICS Limited, 1 Balfour Place, London W1Y 5RH

    Marino, J.S., 1979, 'Identification of Characteristic Points Along Naturally Occurring Lines: An Empirical Study', Canadian Cartographer 16 (1), 70-80.

    McCormick, B.H., DeFanti, T.A. and Brown, M.D. (eds., 1987) 'Visualization in Scientific Computing', Computer Graphics 21 (6).

    MeMaster, R.B., 1983, 'A Mathematical Evaluation of Simplification Algorithms', in B.S.Wellar (ed) Proc. Auto Carto Six, 267-276.

    McMaster, R.B., 1986, 'A Statistical Analysis of Mathematical Measures for Linear Simplification', American Cartographer 13 (2) 103-117.

    McMaster, R.B., 1987a, 'The Geometric Properties of Numerical Generalization', Geographical Analysis 19 (4), 330-346.

    McMaster, R.B., 1987b, 'Automated Line Generalization', Cartographica 24 (2), 74-111.

    Monmonier, M.S., 1986, 'Towards a Practicable Model of Cartographic Generalization', in M.Blakemore (ed) Proc. Auto Carto London (ICA, 1986) 257-266.

    Morrison, J.L., 1975, 'Map Generalization: theory, practice and economics', in Proc. Auto Carto 2 99-112.

    Muller, J.C., 1987a, 'Optimum Point Density and Compaction Rates for the Representation of Geographic Lines', in N.R Chrisman (ed) Proc. Auto Carto 8, 221-230.

    Muller, J.C., 1987b, 'Fractal and Automated Line Generalization', Cartographic Journal 24, 27-34.

    Peucker, T. K., 1975, 'A theory of the cartographic line', Proc. Auto carto II, Reston, Virginia, 508-518.

    Peucker, T. K. and Douglas, D. H. (1975) 'Detection of surface specific points by local parallel processing of discrete terrain elevation data', Computer Graphics and Image Processing 4, 375-387.

    Raper, J. and Green, N., 1989, 'GIST! The Geographical Information Systems Tutor', Burisa 89 (July) 16.

    Robinson, A.H , Sale, R.D and Morrison, J.L., 1984, Elements of Cartography, (Wiley, NY), Chapter 8.

    Thapa, K., 1988, 'Automatic Line Generalization using Zero Crossings', Photogrammetric Engineering and Remote Sensing 54, (4) 511-517.

    Sasada, T.T., 1987, 'Drawing natural scenery by computer graphics', Computer Aided Design 19 (4), 212-218.

    Topfer, F and Pillewizer, W., 1966, 'The Principles of Selection', Cartographic Journal 3, 10-16.

    Van Horn, E.K., 1985, 'Generalizing Cartographic Data Bases', in Proc. Auto Carto 7, 532-540.

    Visvalingam, M., 1985, 'Concept refinement in social planning through the graphic representation of large data sets', in Shackel B (ed.) Interact '84, Proc. of the First IFIP Conference on Human Computer Interaction, (North Holland, Amsterdam), 281-286.

    Visvalingam, M., 1990, 'Trends and Concerns in Digital Cartography', Computer Aided Design, March 1990, in press.

    Visvalingam, M. and Kirby, G.H., 1984, 'Th impact of advances in IT on the cartographic interface in social planning', Department of Geography Miscellaneous Series No 27, University of Hull.

    Visvalingam, M. and Kirby, G.H., 1987, 'Directory of Research and Development based upon Ordnance Survey Small Scales Digital Data', Cartographic Information Systems Research Group, University of Hull.

    Waugh, T.C. and McCalden, J., 1983, GIMMS Reference Manual (4.5), Gimms Ltd, 30 Keir Street, Edinburgh EH3 9E4.

    White, E.R., 1983, 'Perceptual Evaluation of Line Generalisation Algorithms', Unpublished Masters Thesis, University of Oklahoma.

    Whyatt, J.D. and Wade, P.R., 1988, 'The Douglas-Peucker Line Simplification Algorithm', Society of University Cartographers Bulletin 22 (1), 17-25.

    Williams, R., 1987, 'Preserving the Area of Regions', Computer Graphics Forum 6, 43-48.

     


    © Dr Mahes Visvalingam, University of Hull, Uploaded on June 2003

    Cartographic Information Systems Research Group, University of Hull