From Line Geometry To Area Topology
by  © P. Wade, M. Visvalingam and G.H. Kirby
CISRG Discussion Paper Series No 1, University of Hull, 48 pp

 

For the sake of posterity, please cite from the published version:
Kirby, G H, Visvalingam, M and Wade, P (1989) "The recognition and Representation of Polygons with Holes" Computer J, 32 (6), 554 - 562.

CONTENTS

  1. Introduction

  2. Background

  3. Disassociative Area Model (DAM)
    3.1 Geometry
    3.2 Geography
    3.3 Comparison of DAM with predecessors

  4. Deriving the Topology
    4.1 Extracting the Boundaries
    4.2 Forming the Containment Hierarchy

  5. An application of DAM
    5.1 Background to the OS 1:625,000 Data
    5.2 Extraction of the Geometric Topology
    5.3 Identification of Area Objects
    5 .4 Summary of the Project

  6. Conclusion

REFERENCES
FIGURES
It is best to open the figures in another browser window, so that you can read the text while studying the figures

4. Deriving the Topology

This section describes algorithms for deriving the topology of a set of primitive regions from the geometry of a set of node matched links. This operation has two distinct stages: first, a set of boundaries are extracted from the links; then the containment hierarchy between the boundaries is computed. The data structure used during these operations is also given.

4.1 Extracting the Boundaries

This process connects together a set of links to form a set of closed boundaries. The process assumes that the links have been topologically structured into a link and node structure; if necessary, the data must be preprocessed to convert it to this form.

The links are encoded using the link, node, and coordinate records shown in Figure S. These records are similar to the structure used by the POLYVRT system. One link record is used for holding the details of each link. The coordinates of the two end points of the link are stored in node records, whose record numbers are kept in the link record. The internal points of the link (if any) are stored separately in a contiguous block of records in the coordinates file. The record numbers of the first and last coordinates are again kept in the links record. There need be no restriction on the number of internal points in a link, although it may be desirable to set an upper limit for implementation purposes, in which case long links can be split into two or more. The envelope (that is enclosing rectangle) and area underneath a link are calculated and held in the link record, as these values are each required several times later. The area is obtained by summing the area of the trapezium underneath each consecutive pair of points in the link using a standard formula (see Figure 5) this yields a negative area for some links, which is used to advantage later. For example, this formula would yield a negative area for the example link in Figure 5; however, if the link was stored in the opposite direction (that is x5Y5 is x1y1 and so on) then the area obtained would have the same magnitude but be positive.

There is one node record for each location where one or more links start and end. With each node is kept a list of the record numbers of the links which meet there. The record number is signed to indicate whether the link starts (positive) or ends (negative) at the node. The bearing at which each link leaves or enters the node is also held, and the lists are sorted on this field before the boundaries are formed so that the links are referred to in a clockwise order around the node.

Once all the links have been added to a database, the process of boundary formation can take place. One boundary record is used to hold the details of each boundary. An ordered list of the links which describe each boundary is formed, and the envelope and area enclosed are calculated. The algorithm used to form each boundary works by initially selecting one link, and then taking a succession of leftmost links radiating from the previously selected link until the beginning of the first link is reached again. This is now the preferred method used by several systems, as it only uses the geometric information about the links. Many earlier systems (such as POLYVRT and HDS) used information about the area objects on the left and right of the links when chaining the boundaries. Such a method is not suitable for systems which have moved away from the input of links together with left/right area codes (as in GIRAS and ARC/INFO).

The method for forming the boundaries is given by the following pseudo code. The names of variables are given in upper case, and references to the database have the form 'RECORD_NAME [RECORD_NUMBER].FIELD_NAME', for example 'LINKS [I].START_NODE' refers to the value held in the 'start node' field of the ith 'links' record. Comments to explain the logic are enclosed by curly brackets.
 


Algorithm to form the set of boundaries of a set of links

let BOUND = 0 {a counter for the number of boundaries formed}
for each LINK
      if LINKS [LINK].LEFT_BOUNDARY is still unknown
      then {form the boundary on the left of this link}
                  let BOUND = BOUND + 1
                  call FORM_BOUNDARY (BOUND, LINK)
      end if
     if LINKS [LINK].RIGHT_BOUNDARY is still unknown
   
 then {form the boundary on the right of this link}
                   let BOUND = BOUND + 1
                  call FORM_BOUNDARY (BOUND, ‑LINK)
     end if
end for
{BOUND now holds the number of boundaries formed}
 


Procedure to form the boundary on one side of a specified link

procedure FORM_BOUNDARY (BOUND, START_LINK)
 let LINK = START_LINK
 let BOUNDARIES [BOUND].ENVELOPE = LINKS [abs (LINK)].ENVELOPE
 let BOUNDARIES [BOUND].AREA_ENCLOSED = 0.0
 repeat {chain round the links which describe this boundary}
    if LINK > 0
   then {the boundary on the left of the link is being formed, so traverse
           the link in its stored direction, that is from its start node to  its end node}
         let LINKS [LINK].LEFT_BOUNDARY = BOUND
         let BOUNDARIES [BOUND].AREA ENCLOSED =
             BOUNDARIES [BOUND].AREA_ENCLOSED + LINKS [LINK].AREA UNDERNEATH
         let DEST_NODE = LINKS [LINK].END_NODE
   else {the boundary on the right of the link is being formed,
          so traverse the link in reverse direction,
          that is from its end node to its start node}
        let LINKS [‑LINK].RIGHT_BOUNDARY = BOUND
        let BOUNDARIES [BOUND].AREA -ENCLOSED =
           BOUNDARIES [BOUND].AREA_ENCLOSED ‑ LINKS [‑LINK].AREA_UNDERNEATH
        let DEST_NODE = LINKS [‑LINK].START_NODE

   end if

             let BOUNDARIES [BOUND].ENVELOPE = the maximum extents of the union of
                                                              BOUNDARIES [BOUND].ENVELOPE and
LINKS [abs (LINK)].ENVELOPE

   append LINK to the list of links of BOUND
            {find the entry in the list of links of DEST_NODE for LINK entering the
node
             (or leaving it if LINK is being followed in reverse direction)}

   let POINTER = NODES [DEST_NODE].LIST OF LINKS
            while CONNECTED_LINKS [POINTER].LINK <> ‑LINK

                 let POINTER = CONNECTED_LINKS [POINTER].NEXT

            end while

            {assign the next link in the list of links of DEST_NODE to LINK. This is the next link to leave or enter DEST_NODE when moving round the node in a clockwise direction, and will be the next link followed to continue the formation of the boundary}

            if CONNECTED_LINKS [POINTER].NEXT = nil
then {the entry for ‑LINK is the last in the list of links of
DEST_NODE, so loop round to the first entry in the list}
let LINK = CONNECTED_LINKS [NODES [DEST_NODE].LIST_OF LINKS].LINK

           else
let LINK = CONNECTED_LINKS [CONNECTED_LINKS [POINTER].NEXT].LINK

           end if

        until LINK = START_LINK

    end procedure


At the end of this process each link is part of two boundaries, where the boundary on its left has the same direction as the link, and the boundary on its right has the reverse direction. As a result of the method used for chaining the boundaries, it is automatic that enclosing boundaries have an anti clockwise direction and holes have a clockwise direction. A further consequence is that enclosing boundaries have a positive area and holes have a negative area. This is the area enclosed by the boundary, and not that of the corresponding primitive region. The boundary does not have a separate identity in the ARC/INFO and GIRAS data structures, and so they only record the area of the primitive region. Note that the area of a primitive region can be derived by summing the areas of all its boundaries in DAM, whereas the converse (obtaining the area enclosed by a boundary) is not possible in ARC/INFO and GIRAS.

Each boundary has a separate existence within our data structure. The adjacencies between boundaries are given by the link records. However, where a primitive region contains holes, there is as yet no connection between the
enclosing boundary of the primitive region and the boundaries which describe its holes. For example, there would be no connection between enclosing boundary 6 and holes 7 and 9 in Figure lb. The set of holes belonging to the outermost primitive region have also not been identified. These relationships need to be resolved, and an algorithm for computing this will be described next.

4.2 Forming the Containment Hierarchy

As mentioned in Section 3.1, forming the containment hierarchy of a set of boundaries fully resolves the relationships between the boundaries, and thus the topology of the primitive regions. This hierarchy can be represented by a rooted tree, and the derivation of this tree for a set of boundaries is a one­off process. Edwards et al (1977, pp 20 ‑ 22) describe the algorithm used in their HDS system; it utilises information about the area objects, and lacks the efficiency of the method given below. Information about area objects (the position of the polygon label) is also used in the GIRAS system in order to determine the outer and inner extents of a primitive region (Mitchell et al, 1977, p 11). However, the description of the algorithm used is not sufficiently detailed to allow comments on its efficiency. In ARC/INFO, no geographic information is used in resolving the primitive regions, as the supply of polygon labels by the user is optional. The ARC/INFO Users Manual (ESRI, 1985), however, does not explain the methods used.

A novel algorithm for forming the containment hierarchy between a set of boundaries follows. It does not require any geographical information, and it attempts to minimise the computation required. During the process, the first two fields of the boundary records are completed (Figure 5); that is the level of each boundary in the tree and also which boundary immediately precedes each boundary in the tree are ascertained. This information gives a complete description of the rooted tree.

The first part of this process is to find the immediate descendants of each hole in the tree:


Algorithm to discover the enclosing boundaries which immediately follow each hole

for each HOLE {that is, those boundaries with a negative area}
push HOLE onto an empty stack
repeat
              pop an item off the stack into BOUND
              {loop for each link in the list of links of BOUND}
              let POINTER = BOUNDARIES [BOUND].LIST_OF_LINKS
              while POINTER <> nil
                         let LINK = BOUNDING_LINKS [POINTER].LINK
                         if LINK > 0
      
                  then {BOUND is on the left side of LINK}
                              let ADJACENT_BOUND = LINKS [LINK].RIGHT_BOUNDARY
                         else {BOUND is on the right side of LINK}
                              let ADJACENT_BOUND = LINKS [‑LINK].LEFT_BOUNDARY
                         end if

                        if ADJACENT BOUND <> HOLE and BOUNDARIES
               [ADJACENT_BOUND].SURROUNDING_BOUNDARY is still unknown
               then
                    push ADJACENT_BOUND onto the stack
                    let BOUNDARIES [ADJACENT_BOUND].SURROUNDING_BOUNDARY = HOLE
               end if
               let POINTER = BOUNDING_LINKS [POINTER].NEXT
    end while

until the stack is empty

end for


Applying this to Figure lb, when following the links of hole 1, enclosing boundaries 2 and 3 will be identified as being surrounded by hole 1, and they will be pushed onto the stack. When following the links of the next boundary taken from the stack (either 2 or 3), enclosing boundary 4 will also be identified as being surrounded by hole 1, and it too will be pushed onto the stack. The stack will then be emptied by the next iterations as there are no further immediate descendants of hole 1. Note that the algorithm identifies all immediate descendants of a hole, even if they do not share any common links with it (for example enclosing boundary 4 was correctly identified as following hole 1).

The same process would take place for holes 5 and 7, thus identifying the surrounding hole for enclosing boundaries 6 and 8 respectively. Hole 9 would also be found to surround enclosing boundaries 10 and 11.

The second part in the process is to find the immediate descendants of each enclosing boundary in the tree:


Algorithm to discover the holes which immediately follow each enclosing boundary

for each HOLE
     initialise an empty list for the possible surrounding boundaries of HOLE
     for each ENCLOSING_BOUNDary
     {perform an envelope and area test}
          if BOUNDARIES [HOLE].ENVELOPE is properly contained in BOUNDARIES [ENCLOSING_BOUND].ENVELOPE and
                 ‑(BOUNDARIES [HOLE].AREA ENCLOSED) < BOUNDARIES [ENCLOSING_BOUND].AREA ENCLOSED then
                 add ENCLOSING‑BOUND to the list of possible surrounding boundaries of HOLE
          end if
     end for

     let FOUND = false
     obtain the coordinates of an arbitrary point on the polygon of HOLE
     while the list is not empty and not FOUND
         remove the ENCLOSING_BOUNDary with the smallest area from the list
         if the point from the polygon of HOLE is enclosed by the polygon of ENCLOSING_BOUND
           {perform a point‑in‑polygon test}
         then {HOLE is an immediate descendant of ENCLOSING_BOUND in the tree}
   
        let FOUND = true
   
        let BOUNDARIES [HOLE].SURROUNDING_BOUNDARY = ENCLOSING_BOUND
         end if
     end while

     if not FOUND
     then {there is no boundary which surrounds HOLE, and thus the hole is at level 1 of the tree}
         let BOUNDARIES [HOLE].SURROUNDING_BOUNDARY = 0 {a rogue value}
         let BOUNDARIES [HOLE].BOUNDARY_LEVEL = 1
     end if

end for


Returning to Figure 1b, no enclosing boundaries will be found which surround holes 1 and 5, so a surrounding boundary of 0 is recorded. In addition, their level will be recorded as 1. The envelope and area test will suggest that hole 7 immediately follows enclosing boundary 6 in the tree, and this will be confirmed by the point‑in‑polygon test. Similarly hole 9 will be connected to enclosing boundary 6.

The final part of the process is to determine the level of each boundary. This is straightforward ‑ the holes at level 1 are known, and so the enclosing boundaries which are surrounded by these holes are found and recorded as being at level 2. The holes surrounded by these enclosing boundaries can then be found and recorded as being at level 3, and so on, and this continues down the hierarchy. This completes the formation of the tree.

<An application>


© Dr Mahes Visvalingam, University of Hull, March 2003

Cartographic Information Systems Research Group, University of Hull