|
5.
AN APPLICATION OF DAM
This section briefly describes how DAM was used in a project we undertook
to derive hierarchic area objects from an Ordnance Survey (OS) database of
feature coded vectors.
5.1 Background to the OS 1:625,000 Data
The 1:625,000 database was established by the Ordnance Survey for purposes
of experimentation by themselves and others. The aim was to provide
positive evidence towards the design of the 1:50,000 database (Haywood,
1984). The structure of the database was recognised as a crucial factor
influencing not only the usefulness of a small scale database but also its
feasibility. The structure has implications for the cost of initial data
capture and subsequent maintenance of the database.
It should be emphasised that the data we received was transferred from a
database that the OS did not regard as final in terms of design nor
complete as far as the data content was concerned. A systematic search for
errors and omissions had not been carried out; these were known to exist,
but their extent was completely unknown.
The OS database design considered the cost effective capture of data from
maps to be one of the most important influences. We undertook to evaluate
whether the OS design for representing the hierarchy of administrative
areas (districts, counties, and countries) was adequate in concept,
structure, and content for other purposes. The OS database does not
provide an explicit description of the boundaries of the areas, nor does
it give the hierarchical and spatial relationships between them (for
example which districts belong to a county, or which counties are adjacent
to one another). Instead, it just holds the essential topographic details
of area objects using 'links' and 'area seeds'.
Each link is a line (described by an ordered list of Cartesian
coordinates) with an associated feature code which indicates the type of
administrative boundary it represents. Only one link exists for each part
of a boundary, and the feature code corresponds to the highest level of
administrative area to which the boundary belongs. Since the
administrative units form a hierarchy, it could be inferred that the
boundaries of high level objects also form a boundary of objects below
them in the hierarchy and that the coastline could form all other
boundaries. Each
detached part of an administrative unit is indicated by an area seed, a
representative point within the polygon enclosing that part of the object.
This polygon is similar to a GEOGRAF polygon. The area seeds carry the
name of the area, and also have an associated feature code to identify the
type of area to which they relate. For the administrative areas there are
codes for national, county, metropolitan county, and district, and also a
further set of codes for the seaward extensions of the administrative
areas.
This pragmatic data model is extremely convenient and cost effective for
data capture since it records once, and only once, each explicitly
recorded map detail. However, a consequence is that the extraction of area
objects is quite complex.
5.2 Extraction of the Geometric Topology
Our first task was to extract the links and area seeds relating to the
administrative areas from the remainder of the OS data. The links were
stored in a POLYVRT type structure this conversion was straightforward, as
the links for the administrative boundaries do not cross one another and
only meet at their end points. This stage was only concerned with the
geometry, or course, of the links; however, the link feature codes and the
area seeds were also stored for later use.
The OS data is divided into units of 100 x 100 km. squares, based upon the
National Grid. These units are referred to as 100 km. libraries. We were
supplied with four such libraries which together made a 200 km. square
whose south west corner had the grid reference SN 000000. The four
libraries were amalgamated into a single data base to ease subsequent
processing. However, areas along the edge of the 200 km. square remained
cut and their links remained as dangling lines. Extra links which
corresponded to the edge of the 200 km. square were therefore inserted
into the database so that the boundaries of the incomplete areas became
closed.
The last part of the geometric processing was to form the boundaries of
the primitive regions and calculate their containment hierarchy, as
described in Sections 4.1 and 4.2 respectively.
5.3 Identification of Area Objects
In the OS scheme, the construction of an area object begins with the
retrieval of all its area seeds. From each area seed a spatial search
takes place according to a set of rules to find a node, that is, a link
end point, which lies on the outer boundary of that area. Links are then
chained from this starting node, according to further rules, to form a
complete polygon. This process identifies the outer boundary of each
disjoint part of the.area object. Further checks and processing are
required to establish whether there are any holes within the area, and if
so to determine their boundaries.
In our scheme, the aim is to classify each unit of space (that is,
primitive region), rather than identify the linework of the area
boundaries. This involves finding the name and type of district, county,
and country of each primitive region. By focusing on the primitive region
it is possible to provide a more coherent representation of area objects.
For example, the representation of the containment relationships between
primitive regions makes it unnecessary for repeated spatial searches to
establish whether area objects contain holes. Also, the knowledge of the
spatial relationships between the primitive regions enables checks to be
carried out to ensure that the relationships between area objects are
consistent.
Figure 6 illustrates the techniques that were used to name the primitive
regions. The various line styles indicate the feature codes that were
supplied with the links. Firstly, each area seed was allocated to a
primitive region by using a point in polygon test to find the primitive
region containing it, and its associated attributes (the type of the area
seed and the name of the administrative unit) were assigned to the
primitive region. Figure 6a has the area seeds marked upon it.
The next process was to search outwards and inwards from these named
primitive regions to find neighbouring primitive regions which belong to
the same object, that is, a named administrative unit.
The knowledge of the spatial relationships between the primitive regions
made this a thorough and efficient process. Each search started from a
seeded primitive region. If the feature code of a link which bounds the
primitive region was below the level of the seed, the name was also given
to the adjoining primitive region if it was still unnamed. The extent of
this search was limited by a link at an equal or higher level than the
seed. Figure 6b illustrates this process.
Following this there were
still some primitive regions which were not named at all levels. This
arises because:
1) The area‑seeds for
some area objects, which were cut by the map edge, occurred outside the
map edge and were thus missing from the subset we received. This type of
omission could not be overcome and so these areas could not be named. The
existence of these unnamed objects was established however.
2) Some omissions in
the OS data were intentional. Islands do not contain national area‑seeds,
and seaward extensions do not contain national and sometimes county
area‑seeds. It was possible to deduce the names in such cases from other
information. For an island, its nationality was taken from a land
counterpart which had the same district or county name. If missing, the
nationality and county of a seaward extension was similarly taken from a
land counterpart with the same district name.
There was one further
set of unnamed primitive regions which arose from the system of
identifying each unit of space: those which represent the sea. It was
possible to name these primitive regions as such by starting from any
one primitive region whose land/sea status was
known and expanding outwards and inwards. The land/sea state changes of
course every time a link with the coastline feature code is crossed.
Figure 6c shows the
naming of the primitive regions after all the above processing. There is
one primitive region whose county is unknown and three whose district is
unknown due to these areas being truncated by the map edge. There is
also one on the left edge of the map whose nationality has not been
supplied and cannot be deduced. It is known to be land, however, due to
the wider context made explicit by the processing.
It was then possible to
build the hierarchy of administrative areas within the restrictions
caused by the lack of data. This involved:
1) the allocation of districts to counties, and counties to countries;
and
2) the extraction of the boundaries of the areas by aggregating their
primitive regions and eliminating internal boundaries.
The area objects do not
yet exist as such within the system; instead, they are held as
attributes of the primitive regions. However, there is the knowledge
within the system for the extraction of the entire hierarchy of area
objects in a variety of forms, including the formats required for
transfer to other systems (such as GIMMS and DLG‑3).
The use of an
intermediate data structure, based on DAM, which focussed on the
geometric topology of the primitive regions rather than on the topology
of the links, made it possible to validate partially the OS data, for
example:
1) that the disjoint
parts of area objects at each level of the administrative hierarchy each
contain just one area‑seed for that level;
2) that each area
object is completely bounded by links with the appropriate feature
codes;
3) that the
administrative areas form a proper hierarchy;
4) that the type of
each area is appropriate to its status as land or sea; and
5) that all areas are
named, except those on the edge of the map.
We found just one
coding error; the links which form the northern seaward extension of the
England/Wales boundary had been feature coded as a county seaward
extension. This violated the rule that the feature code of a boundary
should correspond to its highest level in the administrative hierarchy.
5.4 Summary of the
Project
The project confirmed
that it is possible to derive transformations of the OS 1:625,000
experimental data for manipulating hierarchically related objects and
their polygonal descriptions. The object hierarchy had to be inferred
from fragments of information, using the scattered clues and the nesting
rules for the hierarchic link and area‑seed feature codes. The concept
of the primitive region, and the knowledge of the geometric topology,
provided a convenient framework for solving the puzzle. Further details
of this project can be found in Visvalingam et al (1985).
A Geographic
Information System should be capable of deriving a complete description
of hierarchically related and/or overlapping area objects from
feature‑coded map details, that is, fragments of information. In theory,
ARC/INFO appears to be capable of this. However, the verification of the
resulting information base would involve the re‑formulation of
relatively simple rules in terms of possibly complex relational
operators. Moreover, in some cases such
verification may be impossible if the user does not have access to all
information, especially information supplied by the user (for example,
multiple polygon labels within one polygon).
<Conclusion> |