EGIS (1994), copyright EGIS Foundation.


A CONCEPTUAL MODELLING FOR THE GIS DEVELOPING

Stefano Nativi
PIN Centro Studi Ingegneria
P.za Ciardi, 25 Prato 50047, Italy
Giorgio Federici
Dipartimento di Ingegneria Civile, Università di Firenze
Via S. Martha, 3 Firenze 50100, Italy

ABSTRACT

The goal of the present work is to provide a conceptual model of the physical domain related to a general environmental GIS application. The proposed model has general validity and is not constrained to any particular implementation. It may be a first step towards fixing a general conceptual definition of GIS, and it may also be useful in connection with the semantic aspects related to the heterogeneity of geo-information. Furthermore, the model can provide a global schema of territorial information according to which heterogeneous GISs can be compared and integrated. This work consists of three main parts. The first part focuses on conceptual modelling and knowledge representation; in particular, it reports the different types of knowledge that a GIS must deal with. The second part is concerned with the formalism adopted, chosen according to the features of the geo-information conceptual model. These concepts and features are the structural and constraint rules of a scientific conceptual-modelling language for the development of information systems. This language has been used to formalise the proposed conceptual model of an environmental information system. It allows reality to be represented easily by means of entities and activities. Furthermore, this knowledge representation treats attributes and objects at the same level, extending the use of semantic abstraction. The third and final part presents the formalization of the general environmental application. The subject world of the system comprises many heterogeneous data. Some examples extracted from the system are reported; in particular, the possible semantic abstractions are presented.

INTRODUCTION

In recent years GIS applications have grown considerably, thanks to the improved performance of telematics and computer science.

A certain dispersion, tied to application-driven development, has hindered a complete theoretical approach to GIS, thus penalising it (Molenaar, 1991).

The application fields of GIS are very heterogeneous and fragmented; hence it is difficult to compare different information systems on the basis of a general criterion (Maguire, 1991).

In recent years GIS has been referred to as a new discipline based on branches of computer science such as Artificial Intelligence, Data Base theory, etc. (Molenaar, 1991), (Maguire, 1991).

The present work aims to provide a conceptual model of the physical and system domains related to a GIS. The proposed model is not constrained to any particular application or practical implementation. It may be useful for fixing a general conceptual definition of GIS and for providing a global schema of territorial information systems according to which heterogeneous GISs can be compared and integrated.

A CONCEPTUAL APPROACH FOR GISs

A systematic development of GISs needs some preliminary operations:

i. Fixing a general conceptual definition of GIS
ii. Defining the semantics of the geo-information
iii. Defining the structural aspect of the geo-information

In the literature there are many theoretical definitions of a GIS; Cowen (Cowen, 1988) outlines at least four main approaches to defining and characterising a GIS. All of the definitions share a common denominator: GISs are systems that manage territorial information. A GIS can therefore be considered a particular kind of information system.

Given that it is very important to provide a conceptual definition of a GIS, the actual problem becomes that of defining an information system. Many conceptual definitions of an information system can be found in the literature; we have adopted Wintraecken's definition (Wintraecken, 1990).

The geo-information is coded in data about the territory; these data are stored in databases which follow a precise data model. The management, the understanding and the breadth of the query space depend on how effectively the data model conceptualises the real world. GIS has adopted more recent data models (Peuquet, 1984), even though the GIS literature on the use of such techniques for data analysis and modelling is still insufficient (Healey, 1991).

In order to develop an information system which manages geo-information intelligently, it is necessary to provide a conceptual model of the system, based on knowledge representation techniques (communication is essential in a GIS) and on semantic modelling features (territorial data are heterogeneous, numerous and correlated in a complex way). The notion of a conceptual model was introduced by ANSI/SPARC (ANSI/X3/SPARC, 1975).

Some of the advantages that conceptual modelling offers during the development of a GIS are:

i. improving understanding of problems;
ii. providing a formal model by which it is easier to communicate;
iii. allowing easy upgrading and management of the system;
iv. allowing the reuse of the model for other systems.

Object-orientation is a good answer to the above issues, but the real problem is that there is no complete agreement on the definition of object-orientation. Furthermore, while Object Oriented Analysis (OOA) is an accessible and readily available technique, object-oriented databases do not yet perform very well and have not been well tested; they are based on different data models and data management languages, and standardization will be difficult to achieve in the short term (Gray et al., 1992).

For the reasons mentioned above, a currently feasible solution is to exploit the power of OOA, building on the facilities that semantic modelling and knowledge representation provide.

GISs, like all other information systems, have to satisfy some functional requirements (what the system does and which information it considers, the link between the system and reality, how the system and the real world interact) as well as non-functional ones (performance, security, cost, etc.).

As regards the functional requirements, the system has to consider and conceptualise at least three knowledge worlds (Mylopoulos, 1991): the Subject World, the System World and the Usage World. Developing a conceptual model of a GIS means conceptualising and formalising at least the first two worlds, which are bound to the system's functional requirements: the Subject World and the System World.

The Subject World of a GIS is tightly coupled with the specific application of the system, but it always represents physical reality by means of geo-information. Hence it is both important and possible to conceptualise those characteristics of the reality domain that are common to most GIS applications.

DATA MODEL

A data model can be considered a particular conceptual model whose subject is the data structure and its operations. A data model is composed of:

  M = G + O (Tsichritzis and Lochovsky, 1982) where:

    O = elementary operations;
    G = G[s] + G[r], where:

      G[s] = structure and category schema rules;
      G[r] = deductive and constraint schema rules.
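As an illustration only, the decomposition M = G + O can be sketched in Python. The class name `DataModel`, its method names and the river record are hypothetical and do not come from Tsichritzis and Lochovsky; structure rules are reduced here to attribute names and types, and constraint rules to boolean predicates.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


@dataclass
class DataModel:
    """Hypothetical rendering of M = G + O, with G = G[s] + G[r]."""
    structure_rules: Dict[str, type]                  # G[s]: attribute -> type
    constraint_rules: List[Callable[[Record], bool]]  # G[r]: predicates
    store: List[Record] = field(default_factory=list)

    def conforms(self, record: Record) -> bool:
        # G[s]: the record must have exactly the declared attributes and types.
        if set(record) != set(self.structure_rules):
            return False
        if not all(isinstance(record[name], kind)
                   for name, kind in self.structure_rules.items()):
            return False
        # G[r]: every constraint rule must hold.
        return all(rule(record) for rule in self.constraint_rules)

    # O: two elementary operations, insertion and selection.
    def insert(self, record: Record) -> None:
        if not self.conforms(record):
            raise ValueError("record violates the schema rules")
        self.store.append(record)

    def select(self, predicate: Callable[[Record], bool]) -> List[Record]:
        return [r for r in self.store if predicate(r)]


# Illustrative values only.
rivers = DataModel({"name": str, "length_km": float},
                   [lambda r: r["length_km"] > 0])
rivers.insert({"name": "Arno", "length_km": 241.0})
```

The separation keeps the schema rules (G) declarative, while the operations (O) are the only way to change the stored data.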

The classic data models are the hierarchical, network and relational models. These models are unsuitable for the conceptual modelling of data related to a complex reality (Gray et al., 1992), and geo-information may be very complex.

Semantic Data Models

These models have the same aims as conceptual modelling. A semantic model is simpler to use than a conceptual one because it introduces some assumptions, related to the hardware environment, about the physical implementation of the conceptual schema. The major contribution of a semantic model is that it provides abstract features for modelling the information structures and rules of a schema. Some facilities of semantic models are: entities, entity types, hierarchy types, aggregation, attributes and their domains, relations, the temporal domain, and deductive and constraint rules.

Object Oriented Analysis

The semantic models and the object-oriented ones share many notions; in fact, semantic models may be considered structurally object-oriented. Object-oriented models have, in addition to an object-oriented structure, an object-oriented behaviour, made possible by properties such as encapsulation and late binding. For these reasons object-oriented analysis supports object-oriented models as well as semantic ones.

Knowledge Representation

The following characteristics of the knowledge representation languages (Borgida, 1990) are very useful for the development of a GIS data model:

i. Expressive power;
ii. Easy comprehension of system knowledge;
iii. Easy access to system knowledge.

THE FORMALIZATION OF THE GIS CONCEPTUAL MODELLING

We have defined and formalised our GIS conceptual model by customising and extending a language called Telos (Mylopoulos et al., 1990), (Mylopoulos, 1991). It was developed and tested at the University of Toronto with the support of the University of Passau, and it has been used in some European research projects. Telos was created for formalising conceptual models using features of semantic modelling and of knowledge representation. Telos is not a programming language; it is intended for the formalization of a knowledge base. Its components are briefly described below so that the pseudo-code that follows can be understood.

  G[s]

The knowledge base is composed of structured objects based on two primitives: individuals and attributes. Individuals represent entities, while attributes represent relations between two entities. Individuals and attributes are managed and structured in the same way and are called propositions. Every proposition has four components, called Source, Label, Target and Interval; the first three formulate a precise relation between two propositions, while the fourth is the lifetime of the relation. For example, the proposition:

  [Arno, MorePollutedThan, Sieve, 1992]

means: "The Arno river was more polluted than the Sieve river in 1992". The individuals are particular propositions because for an individual with Label p we have:

  Source(p) = Target(p) = Label(p) = p; for example

  [Arno, Arno, Arno, AllTime].
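The four-component structure can be mirrored, purely as a sketch, by a Python named tuple. The type name `Proposition` and the helper `individual` are hypothetical; intervals are kept as plain strings, which is a simplification of the temporal component.

```python
from typing import NamedTuple


class Proposition(NamedTuple):
    source: str
    label: str
    target: str
    interval: str

    def is_individual(self) -> bool:
        # For an individual p: Source(p) = Target(p) = Label(p) = p.
        return self.source == self.label == self.target


def individual(name: str, interval: str = "AllTime") -> Proposition:
    """Build the self-referential proposition that stands for an entity."""
    return Proposition(name, name, name, interval)


arno = individual("Arno")
more_polluted = Proposition("Arno", "MorePollutedThan", "Sieve", "1992")
```

Because individuals and attributes share one structure, the same storage and query machinery can serve both.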

We can organise the propositions along three well-known dimensions: aggregation, classification and generalisation. For classification we have:

  Tokens --> propositions without instances: they are the
             concrete entities;
  SimpleClass --> the occurrences of these propositions are
             Tokens;
  MetaClass --> the occurrences of these propositions are
             SimpleClasses;
  MetametaClass --> the occurrences of these propositions are
             MetaClasses.

We have a semantic network with a distinct plane for each abstraction level. Two temporal dimensions can be considered: history time (referring to the application domain) and belief time (referring to the system knowledge).
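The classification planes can be illustrated with a small Python sketch that derives the plane of a proposition from its instances. The function names and the sample names (`River`, `WaterBodyClass`, `ThemeClass`) are hypothetical; the sketch assumes an acyclic instance relation and at most four planes.

```python
from typing import Dict, List

LEVEL_NAMES = ["Token", "SimpleClass", "MetaClass", "MetametaClass"]


def level(name: str, instances: Dict[str, List[str]]) -> int:
    """0 for a proposition without instances, else 1 + the highest instance level."""
    members = instances.get(name, [])
    if not members:
        return 0
    return 1 + max(level(member, instances) for member in members)


def plane(name: str, instances: Dict[str, List[str]]) -> str:
    """Name of the abstraction plane on which the proposition lies."""
    return LEVEL_NAMES[level(name, instances)]


# Hypothetical instance relation: class -> its occurrences.
instances = {
    "River": ["Arno", "Sieve"],
    "WaterBodyClass": ["River"],
    "ThemeClass": ["WaterBodyClass"],
}
```

With this relation, `plane("Arno", instances)` gives `"Token"` and `plane("ThemeClass", instances)` gives `"MetametaClass"`.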

  G[r]

The specification rules are expressed by means of a first-order logic language (Mylopoulos et al., 1990). Its expressions are occurrences of the class AssertionClass. The two MetaClasses labelled ConstraintRule and DeductiveRule allow the classification of the attributes that have AssertionClass as Target.

  O

The elementary, self-explanatory operations are TELL, ASK, RETELL and RETRIEVE. The most important is the TELL operation, by which every proposition can be defined; for example:

  TELL CLASS DistributedData IN EnvironmentalData, MetaClass 
    WITH ATTRIBUTES 
      SpatialAttribute 
        Georeferencing: SimpleClass; 
        MBR: SimpleClass; 
        SpatialDistribution: SimpleClass 
      Quality 
        SpatialAccuracy: SimpleClass; 
        SpatialPrecision: SimpleClass
  END
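To give a feeling for the four operations, the following Python sketch stores class definitions in a dictionary. It is not Telos and not the Telos API: the class `KnowledgeBase`, its method signatures and the exact semantics given to each operation (in particular, ASK is reduced to a membership test on the IN clause) are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# category -> {attribute label -> target class}
Attributes = Dict[str, Dict[str, str]]


class KnowledgeBase:
    """Hypothetical in-memory store mimicking TELL, RETELL, RETRIEVE and ASK."""

    def __init__(self) -> None:
        self._classes: Dict[str, Tuple[List[str], Attributes]] = {}

    def tell(self, name: str, in_classes: List[str], attributes: Attributes) -> None:
        # TELL: add a new definition.
        if name in self._classes:
            raise KeyError(f"{name} is already defined; use retell")
        self._classes[name] = (list(in_classes), attributes)

    def retell(self, name: str, in_classes: List[str], attributes: Attributes) -> None:
        # RETELL: replace an existing definition.
        if name not in self._classes:
            raise KeyError(f"{name} is not defined; use tell")
        self._classes[name] = (list(in_classes), attributes)

    def retrieve(self, name: str) -> Tuple[List[str], Attributes]:
        # RETRIEVE: return the stored definition as it was told.
        return self._classes[name]

    def ask(self, name: str, in_class: str) -> bool:
        # ASK: does the knowledge base state that `name` is IN `in_class`?
        return name in self._classes and in_class in self._classes[name][0]


kb = KnowledgeBase()
kb.tell("DistributedData", ["EnvironmentalData", "MetaClass"], {
    "SpatialAttribute": {"Georeferencing": "SimpleClass",
                         "MBR": "SimpleClass",
                         "SpatialDistribution": "SimpleClass"},
    "Quality": {"SpatialAccuracy": "SimpleClass",
                "SpatialPrecision": "SimpleClass"},
})
```

The call to `kb.tell` restates the DistributedData example above as data, so that it can later be queried with `kb.ask` or `kb.retrieve`.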

THE CONCEPTUAL MODELLING FOR A GENERAL ENVIRONMENTAL MONITORING INFORMATION SYSTEM

This example has general value: a very complex reality is conceptualised in order to monitor some physical situations; the situations managed involve both geographical and geological characteristics, with the fundamental support of environmental monitoring measurements acquired from different sources (distributed, point, static and dynamic data). The subject world that has been conceptualised may be reduced and reused for other applications, or may serve as a framework for emphasising particular aspects of reality. In particular, this conceptual model provides a very high level of integration between raster and vector structures, which is very useful for many statistical and distributed models as well as for data processing such as classification, data interpretation, etc.

Figure 1 reports the integration level: for every raster layer it is possible to derive a new layer with the same format, in which the locations belonging to a territorial object (described spatially and geometrically in a vector layer) are functions of the corresponding locations in the starting raster layer.

When considering a class of objects instead of a single one, it is only necessary to treat the object class as a subset of the set of territorial objects.

SUBJECT WORLD

First, figure 2 reports a graphical notation, a custom extension of the object-oriented analysis notation (Coad and Yourdon, 1991), by means of which we present the characterised subjects that constitute the Subject World of the system. A set of subjects, extracted from the formalization of the conceptual system, is reported both graphically and in formalised form.

Conceptual modelling offers wide opportunities to GISs. Geographical Information Systems may be implemented according to any commercial solution, but if they are built on top of a conceptual model framework such as the one outlined here, many additional possibilities become available, such as:

i. To implement metaqueries about system territorial entities and their descriptions;
ii. To implement semantic navigation along the system entities and their attributes;
iii. To implement a self-documenting and self-describing GIS;
iv. To consider the conceptual model as a global scheme and to implement a multidatabase and a distributed GIS.

Our model provides all these facilities; they are not explicitly reported in order to keep this report brief, but the semantic generalisation/specialisation structures can be obtained quite easily from the formalization.
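A metaquery is a query about the schema rather than about the data. As a rough Python sketch, two of the classes formalised in this section (DistributedData and MonodimensionalData) are restated as a dictionary and interrogated; the dictionary layout and the function `classes_with` are hypothetical and are not part of the formalization.

```python
from typing import Dict, List

# Hypothetical restatement of two class definitions from this section.
schema: Dict[str, Dict[str, Dict[str, List[str]]]] = {
    "DistributedData": {
        "attributes": {
            "SpatialAttribute": ["Georeferenzation", "MBR", "SpatialDistribution"],
            "Quality": ["SpatialAccuracy", "SpatialPrecision"],
        }
    },
    "MonodimensionalData": {
        "attributes": {
            "SpatialAttribute": ["Georeferenzation", "Location"],
            "Quality": ["SpatialAccuracy"],
        }
    },
}


def classes_with(category: str, attribute: str) -> List[str]:
    """Metaquery: which classes declare `attribute` under `category`?"""
    return sorted(
        name for name, definition in schema.items()
        if attribute in definition["attributes"].get(category, [])
    )


with_mbr = classes_with("SpatialAttribute", "MBR")
with_accuracy = classes_with("Quality", "SpatialAccuracy")
```

Here `with_mbr` contains only DistributedData, while `with_accuracy` contains both classes, which is the kind of answer a self-describing GIS could return about its own entities.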

Individuals

  TELL CLASS EnvironData IN MetametaClass
    WITH ATTRIBUTE
      Theme: MetaClass; % What?
      Quality: MetaClass; % How?
      SpatialAttribute: MetaClass; % Where?
      TimeAttribute: MetaClass % When?
  END

  TELL CLASS DistributedData IN EnvironData, MetaClass
    WITH ATTRIBUTE
      SpatialAttribute
        Georeferenzation: SimpleClass;
        MBR: SimpleClass;
        SpatialDistribution: SimpleClass
      Quality
        SpatialAccuracy: SimpleClass;
        SpatialPrecision: SimpleClass
  END

  TELL CLASS MonodimensionalData IN EnvironData, MetaClass
    WITH ATTRIBUTE
      SpatialAttribute
        Georeferenzation: SimpleClass;
        Location: SimpleClass
      Quality
        SpatialAccuracy: SimpleClass
  END

  TELL CLASS GeoreferencedMap
      IN DistributedData, EntityClass
    WITH ATTRIBUTE
      Georeferenzation
        Projection: "Geographic"
      MBR
        LambdaNordOvest: 0...;
        PhiNordOvest: 0...;
        LambdaSudEst: 0...;
        PhiSudEst: 0...
      SpatialAccuracy
        GeoreferenzationAccuracy: 0...
      ConstraintRule
        :$This.LambdaNordOvest <
           This.LambdaSudEst$
        :$This.PhiNordOvest < 
           This.PhiSudEst$
      DerivedData
        Area: $CalculateGeoArea$
        Orientation: $CalculateGeoOrientation$
  END

  TELL CLASS Arc IN EntityClass
    WITH ATTRIBUTE
      Single, Necessary, Part
        Start: Node;
        End: Node
      ConstraintRule
        :$This.Start != This.End$
      DeductiveRule
        Length: $CalculateEuclidianDistance$
  END
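The Arc class combines a constraint rule (Start and End must differ) with a deductive rule (Length is computed). A minimal Python sketch of the same idea follows; giving `Node` planar `x` and `y` coordinates and comparing nodes by coordinates are assumptions, since Node is not defined in this excerpt.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    # Assumption: a node is a point with planar coordinates.
    x: float
    y: float


@dataclass(frozen=True)
class Arc:
    start: Node
    end: Node

    def __post_init__(self) -> None:
        # ConstraintRule: This.Start != This.End
        if self.start == self.end:
            raise ValueError("constraint violated: Start must differ from End")

    @property
    def length(self) -> float:
        # DeductiveRule: Length is derived as the Euclidean distance.
        return math.hypot(self.end.x - self.start.x, self.end.y - self.start.y)


arc = Arc(Node(0.0, 0.0), Node(3.0, 4.0))
```

The constraint is checked once at construction, and the length is never stored, so it cannot disagree with the end points.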

  TELL CLASS LinearUnit IN EntityClass, CartographicUnit 
    WITH ATTRIBUTE
      Association, TopologicalRelationship 
        BranchOf: LinearUnit; 
        EdgeOf: LinearUnit; 
        EndsIn: LinearUnit; 
        Intersects: Intersection; 
        Crosses: Cross 
      GraphicAspect, Part 
        LineType: 0...;
        LineThickness: 0...
      DerivedData
        Length: $CalculateLineLength$
  END

We have defined and formalised many individuals at each abstraction level, reported in about 11 Subjects.

  TELL CLASS DTM
      IN NearStaticData, DigitalizedData, EntityClass
      ISA CompositeMap, RasterMap
    WITH ATTRIBUTE
      TimeReference
        DTMDate: Date&Time
      DigitalizedParameter
        TerritorialParameter: StructuralParameter
      DigitalizedParameterAccuracy
        TerritorialParameterAccuracy: 0...
      Single, Necessary, Association 
        CartographicModel: CartographicModel
      .....
      ....
      ConstraintRule
        :$ForAll d/DTM Exists m/CartographicModel >
          d.CartographicModel = m$
        :$ForAll d/DTM Exists m/StructuralParameter >
          d.TerritorialParameter = m$
      .....
      ....
  END
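The two constraint rules of DTM are read here as "for every DTM there exists an instance of the target class that is the value of the given attribute". Under that reading, which is an assumption about the notation, the check can be sketched in Python; the function name and the sample identifiers are hypothetical.

```python
from typing import Any, Dict, Iterable, List


def for_all_exists(instances: List[Dict[str, Any]],
                   attribute: str,
                   candidates: Iterable[Any]) -> bool:
    """True if every instance's attribute value is one of the candidates."""
    pool = list(candidates)
    return all(
        any(instance.get(attribute) == candidate for candidate in pool)
        for instance in instances
    )


# Hypothetical DTM instances and cartographic models.
dtms = [
    {"name": "dtm_a", "CartographicModel": "model_1"},
    {"name": "dtm_b", "CartographicModel": "model_2"},
]
models = ["model_1", "model_2"]

consistent = for_all_exists(dtms, "CartographicModel", models)
```

The nested `all`/`any` mirrors the ForAll/Exists quantifier order; an empty set of DTMs satisfies the rule vacuously.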

Activities

  TELL CLASS ObjectOperation IN ActivityClass
      ISA LocationOperation
    WITH ATTRIBUTE
      Input
        FirstLayer: CompositeMap;
        Object: TerritorialObject
      Output
        NewLayer: CompositeMap
      ConstraintRule
        :$This.Object  This.FirstLayer$
  END

  TELL CLASS ObjectMaximum
      IN ActivityClass ISA ObjectOperation
    WITH ATTRIBUTE
      Control
        Prototype: NewLayer =
          ObjectMaximum of FirstLayer
          WITHIN Object
  END
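The ObjectMaximum activity can be sketched in Python on a small raster. Several points are assumptions: the territorial object is given as a boolean mask already rasterised on the grid of the layer, the constraint of ObjectOperation is read as "the object lies within the layer", and locations outside the object receive a no-data value.

```python
from typing import List, Optional

Grid = List[List[float]]
Mask = List[List[bool]]


def object_maximum(first_layer: Grid, obj: Mask,
                   nodata: Optional[float] = None) -> List[List[Optional[float]]]:
    """New layer with the same format: object cells hold the object's maximum."""
    # Assumed constraint: the object mask must match the layer cell by cell.
    same_shape = len(first_layer) == len(obj) and all(
        len(row) == len(mask_row) for row, mask_row in zip(first_layer, obj)
    )
    if not same_shape:
        raise ValueError("object and layer must share the same grid")

    inside = [value
              for row, mask_row in zip(first_layer, obj)
              for value, selected in zip(row, mask_row) if selected]
    if not inside:
        raise ValueError("object covers no location of the layer")

    peak = max(inside)
    return [[peak if selected else nodata for selected in mask_row]
            for mask_row in obj]


layer = [[1.0, 5.0, 2.0],
         [7.0, 3.0, 9.0]]
river_mask = [[True, True, False],
              [False, True, False]]
new_layer = object_maximum(layer, river_mask)
```

In this example the object covers the values 1.0, 5.0 and 3.0, so every object cell of `new_layer` holds 5.0 and the remaining cells hold `None`.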

REFERENCES

ANSI/X3/SPARC (1975), Interim report of the study group on database management systems, FDT (ACM SIGMOD Bulletin).

Borgida, A. (1990), Knowledge Representation and Semantic Data Modelling: Similarities and Differences, Proc. Entity-Relationship Conference, Geneve.

Coad, P., Yourdon, E. (1991), Object Oriented Analysis, 2nd edition. Prentice Hall.

Cowen, D.J. (1988), GIS versus CAD versus DBMS: what are the differences?, Photogrammetric Engineering and Remote Sensing 54: 1551-4.

Gray, P.M.D., Kulkarni, K.G., Paton, N.W. (1992), Object-Oriented Databases: A Semantic Data Model Approach. Prentice Hall.

Healey, R.G. (1991), Database management systems. In: Maguire, D.J., Goodchild, M.F., Rhind, D.W. (eds.), Geographical Information Systems: principles and applications, Vol. 2, pp. 23-38. Longman: London.

Maguire, D.J. (1991), An overview and definition of GIS. Geographical Information Systems, Vol. 1, pp. 9-20. Longman: London.

Molenaar, M. (1991), Status and problems of geographical information systems. The necessity of a geoinformation theory, Elsevier Science Publishers B.V., pp. 85-103.

Mylopoulos, J., Borgida, A., Jarke, M., Koubarakis, M. (1990), Telos: Representing Knowledge About Information Systems, ACM Transactions on Information Systems, Vol. 8, No. 4, pp. 325-362.

Mylopoulos, J. (1991), Conceptual Modelling and Telos. DKBS-TR-91-3, University of Toronto.

Peuquet, D.J. (1984), A conceptual framework and comparison of spatial data models, Cartographica 21: 66-113.

Tsichritzis, D.C., Lochovsky, F.H. (1982), Data Models. Prentice-Hall.

Wintraecken, J.J.V.R. (1990), NIAM Information Analysis Method. Kluwer Acad. Publ. Group, Dordrecht.
