URISA (1994), p179-192, copyright Urban and Regional Information Association
The Province of Manitoba, in conjunction with LINNET Graphics International Inc., is currently developing the Manitoba Land Related Information System ("MLRIS"). A principal component of the MLRIS is the Information Utility ("IU"), which consists of a database containing shareable land related information and a mechanism to receive and distribute this information upon demand.
The IU is an organization whose function is to collect, organize, store, and distribute land related information. Raw information is collected from government departments and private agencies and then imported into the central IU database. A data directory is maintained to provide index and metadata information to potential buyers of the data. Security is provided to ensure that shareable land related information is visible and available only to authorized users, who are then charged a fee for the processing and handling of the information. Royalty payments are made to the original owners of the data upon the sale of information, giving them an incentive to keep the data current and accurate, as well as providing a return on the initial investment of capturing the data.
The main business focus of the IU is to streamline the process of sharing land related information between government departments and private agencies, primarily utility companies. The IU concept addresses issues that each of these government departments and private agencies currently faces. Issues such as inconsistent data file formats, inadequate translators, and the inability to locate and collect data easily will be addressed by the IU. By centralizing the data, providing access to distributed databases, imposing common data standards, and supplying data format translation, the IU will greatly improve the utilization of land related information within the Province of Manitoba and will permit end users to concentrate on conducting business, rather than processing information.
This paper discusses the solution that the Province of Manitoba has implemented to overcome the issues regarding the collection and distribution of land related data between Government Departments and Utility Agencies. The solution is called the Information Utility. Its mandate is to make shareable land related information accessible to Manitobans.
In this paper we will
The Manitoba Land Related Information System (MLRIS) is an agreement between Manitoba's government departments and utility companies to collect land-based information to a specified standard, and provide this data to an Information Utility (IU) to be stored and distributed on demand to authorized users. A common base map is used as the geodetic control base of all other land-based data.
In addition to providing the data-exchange facility to end users, the IU will exist as a single source for land-related data. Users won't need to search for foreign datasets, whose existence is often unknown, thus avoiding duplicate collections of land-based information. A streamlined data-exchange process allows users to concentrate on acquiring foreign data and incorporating it into analysis and decision-making processes. Collecting data according to provincially accepted standards ensures that it is accurate and of high quality.
The IU is a massive database that stores spatial data as vector, attribute, and image data. The database is managed by a spatial database manager and an Oracle relational database management system (RDBMS), along with utilities developed in-house for importing, exporting, and cataloguing the data.
Figure 1 shows the IU system architecture, which consists of a number of interconnected modules that provide data indexing, transfer, viewing, and analysis functions.
The host system is an RS/6000-based processor running AIX with a large amount of online storage. All vector and attribute data are stored online in the RDBMS controlled by the GIS engine; all image data are stored on CD-ROM.
The IU database is actually a composite of many databases, including the enterprise database, which contains the data the IU is capable of serving; an observation database containing a subset of data that end users can view directly; and a thematic database used for custom mapping and analysis.
Data is moved in and out of the enterprise database through an interchange facility, which interprets an incoming datastream and translates it to a neutral format to be stored in the enterprise database. Conversely, it can translate data destined for a foreign system to the proper exchange format.
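The two directions of the interchange facility can be sketched as a pair of lookup tables, one of readers that produce a neutral record and one of writers that consume it. This is only an illustration: the two toy formats (`csvpt`, `jsonpt`), the neutral record layout, and all function names are assumptions made for the sketch, not the IU's actual formats or code.

```python
import json

def read_csvpt(text):
    """Parse the toy 'id,x,y' text format into a neutral record."""
    ident, x, y = text.split(",")
    return {"id": ident, "x": float(x), "y": float(y)}

def write_csvpt(record):
    """Render a neutral record as 'id,x,y' text."""
    return f"{record['id']},{record['x']},{record['y']}"

def read_jsonpt(text):
    """Parse the toy JSON point format into a neutral record."""
    return json.loads(text)

def write_jsonpt(record):
    """Render a neutral record as a JSON object."""
    return json.dumps(record, sort_keys=True)

# One reader and one writer per foreign format (hypothetical registry).
READERS = {"csvpt": read_csvpt, "jsonpt": read_jsonpt}
WRITERS = {"csvpt": write_csvpt, "jsonpt": write_jsonpt}

def interchange(data, source_format, target_format):
    """Translate via the neutral record: foreign -> neutral -> foreign."""
    neutral = READERS[source_format](data)
    return WRITERS[target_format](neutral)
```

Adding a further format to such a registry means writing one reader and one writer, regardless of how many formats are already supported.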
The data-directory module acts as the catalog and administration component of the IU. Popkin Software's System Architect is used to build and store data profiles in a data dictionary. User profiles and user privileges are maintained to assist in the routing of export data and to prevent unauthorized access to proprietary or sensitive information. An embedded accounting module provides statistics required for usage billing and royalty disbursements. An e-mail subsystem enables communication with the IU system administration for dataset-retrieval requests and/or data profile and metadata exchange. Various administration modules assist the IU administrator in processing data-retrieval requests and maintaining the data directory.
A communications facility connects the IU to the outside world for dial-up access to the IU via public telephone network data lines. Once connected to the IU, the end user can access the data directory for an index to the IU database contents, as well as metadata for any available datasets. By connecting to the Observation database through an interface program called PROBE, the user can view graphics, key maps, or other data classified as "high-demand" data. This permits the user to make graphic queries on selected features or view the IU database through a visual interface.
As discussed in the following paragraphs, the structure of a spatial dataset can be quite complex, and the transfer of a spatial dataset from one GIS to another is often not a trivial task. This presents a dilemma to agencies wishing to share and exchange such datasets on a regular basis. It was primarily for this reason that the concept of the Information Utility was formed. Overcoming the difficulties in sharing spatial information would ultimately lead to better utilization of resources, better decision making, and a reduction in redundant data management activities.
Figure 2 is an example of graphically represented information, along with corresponding attribute information. The graphic component defines the object(s) in terms of spatial characteristics: shape, size, coordinates, color, line style, and the like. This information is usually stored in a proprietary file format that is viewed and manipulated with CAD or GIS software. The attribute component describes the object(s) in terms of measured characteristics. The attribute data is generally stored in tabular format in a relational database, then viewed and manipulated with SQL (Structured Query Language).
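A minimal sketch of the two components, assuming a shared feature key links them. The parcel identifier, column names, and values are hypothetical, and SQLite stands in here for whatever relational database holds the attribute table.

```python
import sqlite3

# Graphic component: geometry keyed by a feature identifier (hypothetical values).
graphics = {
    "P-001": {"type": "polygon", "coords": [(0, 0), (40, 0), (40, 30), (0, 30)]},
}

# Attribute component: a relational table queried with SQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE parcel (parcel_id TEXT PRIMARY KEY, land_use TEXT, area_m2 REAL)"
)
conn.execute("INSERT INTO parcel VALUES ('P-001', 'residential', 1200.0)")

def describe(parcel_id):
    """Combine the geometry and the attribute row that share one key."""
    row = conn.execute(
        "SELECT land_use, area_m2 FROM parcel WHERE parcel_id = ?", (parcel_id,)
    ).fetchone()
    return {"geometry": graphics[parcel_id], "land_use": row[0], "area_m2": row[1]}
```

A complete exchange has to carry both halves and the key that joins them; transferring only the geometry or only the table loses the association.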
Complete exchange of data between GISs requires that both the graphic and attribute components be transferred. Figure 2 also shows the datastream describing the graphic and attribute data being transferred. The data stream consists of:
Most GISs employ a proprietary internal file structure controlled by the GIS engine. These systems generally provide a mechanism to import and export a growing number of de facto formats, allowing for limited movement of data between foreign, albeit similar, systems. Without a correct import or export format, however, data transfer to a foreign system is cumbersome. Exchanging spatial data for which a direct exchange format isn't available involves a programming project, and more than one translation is unachievable for most data-processing shops.
If a common exchange format doesn't exist, then a customized translator must be written; see Figure 3. The number of translators required to facilitate data exchange between all formats is n*(n-1). As Figure 3 illustrates, data exchange between four systems employing different data formats would require 12 translators. A better approach is to combine existing data-exchange formats with custom-developed formats to reduce the number of translators. It's also possible to move data through a multistep translation process; see Figure 4.
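The multistep translation process can be viewed as a search for the shortest chain of translators that happen to exist. The sketch below uses format names mentioned in this paper, but which one-way translators are available is a hypothetical assumption made purely for illustration.

```python
from collections import deque

# Hypothetical set of one-way translators (source format -> target formats).
# Which translators really exist is an assumption made for this sketch.
TRANSLATORS = {
    "DXF": ["GDS"],
    "GDS": ["Vision", "DXF"],
    "Vision": ["GDS"],
}

def translation_path(source, target, translators=TRANSLATORS):
    """Return the shortest chain of formats from source to target, or None."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in translators.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None
```

With these assumed translators, `translation_path("DXF", "Vision")` passes through GDS. Each extra hop is another chance to lose information, which is one reason a multistep route is a fallback rather than a goal.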
The ideal solution to the data-exchange dilemma is a single, neutral data format, with every other format translated to and from it. As Figure 5 shows, the number of translators would be n*2, and the development effort would be shifted to the vendors of the original data format, not the end users.
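The two translator counts given in the text, n*(n-1) for direct pairwise exchange and n*2 for exchange through a neutral format, can be compared directly. The function names are ours and only restate those formulas.

```python
def pairwise_translators(n):
    """Direct translators needed so each of n formats converts to every other."""
    return n * (n - 1)

def neutral_translators(n):
    """Translators needed when each of n formats converts to and from one neutral format."""
    return 2 * n

for n in (2, 3, 4, 10):
    print(n, pairwise_translators(n), neutral_translators(n))
```

The counts are equal at three formats (six each). For four formats the neutral approach needs 8 translators against 12, and the gap widens quickly: ten formats need 20 rather than 90.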
This solution is difficult because it requires support from all developers and vendors. Progress is being made in this direction with the acceptance of industry-standard formats such as SAIF and DIGEST. For now, however, the best solution is to use existing standards where possible, combined with custom-developed translation mechanisms.
The foundation of the IU database is a basemap that is currently under construction for the Province of Manitoba. The basemap contains the reference layer captured with a high degree of accuracy to which all other data layers will be associated. The common basemap ensures that all government data and utility information is stored in the same coordinate system and map projection and allows the user to overlay the basemap with other data layers accurately. This was made possible by the development and acceptance of data collection standards by all participants of the MLRIS. In summary the IU database basemap consists of
The development and maintenance of the basemap layers are shared between government departments and participants of the MLRIS.
The IU database also contains a growing number of datasets that are deemed shareable by the source agencies. These include primarily the utilities data layers such as hydro distribution network, water and sewer, gas, and buried communication cables. A goal of the IU facility is to provide a means to construct a composite of all underground structures upon demand.
Other datasets containing environmental data such as soils, vegetation, hydrology etc. are also being collected and stored in the IU database.
Importing data into the IU database is a multistep process involving a database administrator. Two issues are important during data preparation: The data must be properly identified and described to maintain a meaningful data directory, and the source data must be reorganized into a neutral format for storage and future distribution.
Figure 6 shows the graphic representation of data to be imported and attribute information maintained by the data provider. In this example, a subdivision survey plan will be imported, along with key information (primary table) that identifies each subdivision parcel, and owner and assessment data for each parcel (indirect table). The indirect table contains information that doesn't directly describe the feature object, but contains measurements that may or may not exist for the feature objects. A primary table, on the other hand, must contain only one identifier record for each feature object.
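The distinction between the two kinds of table can be sketched as a small relational schema. The table names, column names, and sample rows are hypothetical, and SQLite is used only so the sketch is self-contained; it is not the database described in this paper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parcel (              -- primary table: one row per feature
    parcel_id TEXT PRIMARY KEY
);
CREATE TABLE assessment (          -- indirect table: zero or more rows per feature
    parcel_id TEXT REFERENCES parcel(parcel_id),
    owner TEXT,
    assessed_value REAL
);
INSERT INTO parcel VALUES ('LOT-1'), ('LOT-2');
INSERT INTO assessment VALUES ('LOT-1', 'A. Owner', 85000.0);
""")

# LEFT JOIN keeps parcels that have no indirect record at all.
rows = conn.execute("""
    SELECT p.parcel_id, a.owner
    FROM parcel p LEFT JOIN assessment a ON p.parcel_id = a.parcel_id
    ORDER BY p.parcel_id
""").fetchall()
```

The primary key enforces one identifier record per parcel, while a parcel with no row in the indirect table still appears in the query, with an empty owner.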
The import procedure begins with the preparation of the data model; see Figure 7. To create the model, enough information must be received from the data provider about the dataset(s) to be imported. Using a CASE tool, the data model is constructed, which defines the entities for which the data provider is supplying data. Data structures are built for each entity, defining all of the data elements, and metadata that describes each entity is captured. Metadata capture follows a standard that sufficiently describes the data entities within the IU database. The data model describes the features comprising the subdivision survey plan and associated attribute data.
The purpose of creating a data model is twofold: to export a schema definition in the form of data definition language (DDL) to be used later for loading data into the IU database; and to capture the information required by the data directory to inform users of the available data. When the data model is complete, the metadata and the schema definition are exported and used as input to a load procedure that populates the data directory.
The graphic-data import process involves importing a data file that describes the spatial constructs of the features. The data provider extracts from the corporate database a graphic data file, usually in a proprietary format, and sends it to the IU administrator. A translation process is determined, and format-translation parameters are fed into the translation program along with the source input file. Under the MLRIS initiative, custom translation programs were developed for exchanging data among AutoCAD (DXF), GDS, and Vision. The translation parameters that the administrator creates depend on the features described by the input dataset. For example, feature codes determined by the data provider are usually reclassified to adhere to standards developed by the IU administrator. The translation process produces a data file in a neutral format that describes the spatial orientation of each feature and its primary identifier or database key. In our example, the graphic data file consists of point, line, and centroid information describing the spatial composition of the subdivision survey plan, along with the feature code associated with the survey parcels and the parcel identifier attached to each parcel centroid.
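The reclassification of feature codes amounts to a lookup table supplied as a translation parameter. In the sketch below, both the provider codes and the standard codes are invented for illustration, as is the record layout; the actual MLRIS code lists are not given here.

```python
# Hypothetical mapping from a provider's feature codes to IU standard codes.
RECLASSIFY = {
    "LOTLINE": "CAD_PARCEL_BOUNDARY",
    "LOTCEN": "CAD_PARCEL_CENTROID",
}

def to_neutral(records, mapping=RECLASSIFY):
    """Rewrite feature codes; collect any codes the mapping does not cover."""
    translated, unmapped = [], set()
    for rec in records:
        code = rec["feature_code"]
        if code in mapping:
            # Copy the record so the provider's input is left untouched.
            translated.append({**rec, "feature_code": mapping[code]})
        else:
            unmapped.add(code)
    return translated, unmapped
```

Reporting unmapped codes instead of silently dropping them lets the administrator decide whether to extend the mapping or exclude the feature.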
Next, a load-preprocessor program is executed, which ensures that the source input file is acceptable for loading into the IU database. This process verifies that the source input file does not contain any unwanted or undefined features, that is, features not described in the data directory during the prior steps. If exceptions are found, the load process can't continue until the data model is revised to include the missing features. The preprocessor also attaches a standard header to the input file to ensure that each dataset is loaded with the same coordinate-projection information and other key database load parameters.
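The two preprocessor duties described above, rejecting features that the data directory does not define and attaching a standard header, can be sketched as follows. The record layout, the header fields, and the function name are assumptions; the actual load parameters are not specified in this paper.

```python
def preprocess(records, defined_codes, header):
    """Reject undefined feature codes, then attach a standard header."""
    found = {rec["feature_code"] for rec in records}
    undefined = sorted(found - set(defined_codes))
    if undefined:
        # The load cannot continue until the data model covers these codes.
        raise ValueError(f"feature codes missing from data directory: {undefined}")
    return {"header": dict(header), "records": list(records)}

# Placeholder header values (hypothetical); every dataset would receive the same ones.
STANDARD_HEADER = {"projection": "<projection>", "units": "<units>"}
```

Failing on the first undefined code, rather than loading a partial dataset, keeps the data directory and the database contents in agreement.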
Finally, the preprocessor program collects statistics and counts of features and creates an area-coverage index of each dataset processed. An area-coverage index is created by matching the coordinate information for each feature against a master tile grid (see Figure 8), where each tile is predetermined in size and location and varies from 100 to 10,000 sq. km. An area-coverage index file is created that identifies each tile within Manitoba for which feature data is present in the dataset being processed. The resulting dataset area-coverage index is loaded into the IU database so users can view graphically the area coverage of a dataset and retrieve data by a given tile number. The area-coverage index is also used for maintenance and retrieval of data from the database on a tile basis.
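Matching feature coordinates against a tile grid can be sketched as below. For simplicity the sketch assumes a uniform grid of 10 km by 10 km tiles (100 sq. km) with its origin at (0, 0) and tiles identified by (column, row); the real master grid has tiles of varying size, and its numbering scheme is not described here.

```python
import math

TILE_SIZE_M = 10_000  # assumption: uniform 10 km x 10 km tiles (100 sq. km)

def tile_of(x, y, origin=(0.0, 0.0), size=TILE_SIZE_M):
    """Return the (column, row) of the tile containing a coordinate."""
    col = math.floor((x - origin[0]) / size)
    row = math.floor((y - origin[1]) / size)
    return (col, row)

def coverage_index(features):
    """Collect every tile that contains at least one feature vertex."""
    tiles = set()
    for feature in features:
        for x, y in feature["coords"]:
            tiles.add(tile_of(x, y))
    return sorted(tiles)
```

Because only vertices are tested, a long line that crosses a tile without having a vertex inside it would be missed; a fuller version would also test the segments between vertices.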
The final procedure in loading information into the IU database may involve the import of attribute data. Indirect attribute information is data that describes a measurement or occurrence of data associated with a feature. This information may or may not exist for each feature and is therefore stored and maintained separately from the primary table.
The data provider typically delivers the indirect attribute data file in fixed-length record ASCII format, which is easily manipulated through SQL and reformatted in preparation for loading into the IU database. (Reformatting is necessary only where MLRIS standards haven't been maintained by the data provider, or where key information is reorganized to improve storage and retrieval.)
The corresponding schema definitions for tables to be created in the IU database are obtained from the previously constructed data model. An automatic process generates three outputs: an import data file consisting of comma-delimited records, a load-control file containing the SQL instructions for inserting records into a relational-database table, and DDL for creating the database objects necessary to store the indirect attribute information. The load process for the indirect attribute data is queued and executed in batch mode.
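The conversion from fixed-length records to comma-delimited records, together with the matching DDL, can be sketched from a single field layout. The layout, column names, and SQL types below are hypothetical stand-ins for what the data model would supply, and the load-control file is omitted because its format is not described here.

```python
import csv
import io

# Hypothetical layout: (column name, SQL type, start, end) for each fixed-width field.
LAYOUT = [
    ("parcel_id", "VARCHAR(8)", 0, 8),
    ("owner", "VARCHAR(20)", 8, 28),
    ("assessed_value", "NUMERIC", 28, 38),
]

def to_csv(lines, layout=LAYOUT):
    """Slice each fixed-length record into fields and emit comma-delimited text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for line in lines:
        writer.writerow([line[start:end].strip() for _, _, start, end in layout])
    return buf.getvalue()

def to_ddl(table, layout=LAYOUT):
    """Build a CREATE TABLE statement from the same layout."""
    cols = ", ".join(f"{name} {sqltype}" for name, sqltype, _, _ in layout)
    return f"CREATE TABLE {table} ({cols});"
```

Deriving the data file and the DDL from the same layout keeps column order and names consistent between them.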
To ensure a successful data-load procedure, the IU administrator reviews the log files maintained during the load procedure, uses the GIS engine to display the graphic data, and queries the attached database records. Once the load is verified, the data directory is updated one last time to indicate that the loaded datasets are available for distribution.
The IUACCESS module is a Windows-based application that an end user executes on a remote workstation to view the data directory describing the contents of the IU database. The user is presented with a list of data themes, which are broad categories with which individual datasets are associated. For example, the theme 'Agriculture' may contain datasets of soils information, farming practices, crop information, etc. Some of these datasets may contain graphic features, while others contain attribute data only, associated with an area feature. Metadata is created and maintained for each dataset in order to properly describe its contents. The user can select from a list of available metadata characteristics for a chosen dataset to help determine whether the dataset contains the desired information. In some situations a dataset may be stored in tiles, such as map sheets, or in multiple relational tables. In this case the user can request and retrieve data for a specific tile or table, thus minimizing the quantity and cost of data to be retrieved.
Once the user has determined which datasets to retrieve from the IU, a request file is constructed and sent via modem to the IU host system, where it is queued for processing. A request monitor function permits the user who initiated the data request to monitor the status of the processing.
In processing the dataset-retrieval request, the IU administrator must first extract the data from the IU enterprise database. This extract file is in a neutral format and must be translated to the format required by the destination system. The user profile, as stored in the data directory, contains the information and parameters required to deliver the dataset in the correct format. The database schema that corresponds to the dataset being delivered is exported from the data dictionary and integrated with the dataset. The final dataset can be delivered via modem or put on tape, diskette, or CD-ROM. The end user who receives the dataset is then able to incorporate the dataset into his own GIS environment for processing.
To satisfy the users who are interested in obtaining information about individual feature objects, a remote graphical query application has been constructed called PROBE. This application permits a user to connect to the IU application server via a modem and view the cadastral basemap for a specific area. Predefined query functions are available for the user to obtain property information such as assessment data, real estate information and Land Titles information given a property address, owner name, certificate of title number, assessment roll number or other permissible property identifier. Alternatively the user may point to a feature object and retrieve information about that object. Pan and zoom functions allow the user to view graphically any area within the database. Complex queries can be constructed that include spatial constraints such as creating a query within a given buffer zone around a known feature.
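A query with a buffer-zone constraint reduces, in its simplest form, to a distance test around a known feature. The sketch below handles point features and a circular buffer only; the record layout and function name are assumptions, and it does not represent how PROBE itself evaluates such queries.

```python
import math

def within_buffer(features, center, radius):
    """Return ids of point features no farther than radius from center."""
    cx, cy = center
    return [
        f["id"]
        for f in features
        if math.hypot(f["x"] - cx, f["y"] - cy) <= radius
    ]
```

Buffers around lines or polygons require distance-to-segment calculations, which a GIS engine normally provides.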
PROBE offers a restricted query and GIS function to the end users who don't have or need a GIS of their own but require access to land related information in this manner.
The Information Utility is Manitoba's answer to the data-exchange dilemma. By using only one data exchange format for land-related data, government officials and private businesses can concentrate on using data to make business decisions instead of struggling to convert data into workable formats.