ASPRS/ACSM (1994), copyright ASPRS/ACSM


SCENE CLASSIFICATION USING MULTISPECTRAL DATA FUSION ALGORITHMS

Laurence E. Lazofson and Thomas J. Kuzma, PhD
Battelle
505 King Avenue
Columbus, OH 43201

ABSTRACT

A variety of classification algorithms, including artificial neural networks and data clustering techniques, were successfully optimized to perform pixel-level classification of imagery in complex scenes using simultaneous multispectral measurements covering the UV, visible, near-IR, MWIR, and LWIR wavebands. Imaged scenes comprised tactical targets, buildings, roads, aircraft runways, and vegetation. Imagery was spatially coregistered with an RMS error on the order of 0.5 pixels. Algorithms implemented in the study included unsupervised maximum likelihood, Linde-Buzo-Gray, and "fuzzy" clustering algorithms along with Multilayer Perceptron and Learning Vector Quantization (LVQ) neural networks. Supervised classification of the data was also used. To further verify classification robustness, algorithms were tested on imagery recorded over broad periods of time throughout the day. Results were excellent, indicating that scene classification is achievable despite temporal signature variations. Waveband saliency analyses were performed to determine which spectral bands contained the bulk of the discriminating information for discerning objects in the scenes.

INTRODUCTION

The spectral distribution of energy reflected and/or emitted from an object is a unique characteristic that may be exploited to discriminate the object of interest from the local background. Multispectral sensor fusion techniques can be used to solve a variety of classification and discrimination problems. Battelle's portable multispectral sensor suite has been used to collect simultaneous, multispectral imagery of ground target and background object signatures over a full diurnal cycle in visible, infrared, and ultraviolet spectrally filtered wavebands. Implementing tailored data fusion algorithms, Battelle has processed some of the recorded imagery to solve classification problems including automatic detection of ground targets and location of aircraft landing zones. Imagery collected at different times throughout the day was employed to verify algorithm robustness with respect to temporal variations of spectral signatures.

Algorithms implemented in the study included unsupervised maximum likelihood, Linde-Buzo-Gray, and "fuzzy" clustering algorithms along with Multilayer Perceptron and Learning Vector Quantization (LVQ) artificial neural networks. Supervised classification of the data was also used. The algorithms were tailored to perform pixel-level classification of scene imagery. Imaged scenes comprised tactical targets, buildings, roads, aircraft runways, and vegetation. Imagery was spatially coregistered with an RMS error on the order of 0.5 pixels. Scenes classified by the data fusion algorithms were displayed with artificial color to assess algorithm performance.

Results of the study were excellent and indicated that the choice of data fusion algorithm was not critical to solving the object classification problem. Rather, the system solution lies in choosing the best set of features (wavebands) for discriminating the object of interest (e.g., a target or runway) from its background surroundings. With a set of wavebands judiciously chosen for discriminating the object of interest under established conditions, any of the mentioned algorithms can be tailored to use the multispectral information to effectively classify the imagery.

Upcoming dual-use investigations using this technology include the detection and identification of gaseous emissions for environmental monitoring and chemical/biological weapons treaty verification. Battelle plans to collect and process multispectral data of stack and fugitive gaseous emissions to apply data fusion in detecting byproduct molecular species generated during industrial processes. Also to be studied are medical imaging applications including the detection of retinal anomalies and multispectral endoscopic data fusion. Collected imagery will be used to apply sensor fusion techniques to aid in the detection of ocular abnormalities not readily discerned by visual inspection.

MULTISPECTRAL IMAGERY DATA COLLECTION AND PROCESSING

Battelle's portable sensor suite consists of two high-resolution, high-sensitivity thermal imagers and four charge-coupled-device (CCD) cameras. A variety of spectral filters and telescopic lenses accompany the sensor suite to enable rapid system reconfiguration to support many unique imaging and data collection requirements. With its collection of spectral filters, the Battelle sensor suite images over a large selection of wavebands in the visible, infrared, and ultraviolet regions of the spectrum. The CCD cameras generate 640x480 pixel images while the thermal imagers spatially sample at a coarser resolution of 207x260 pixels in the MWIR and 207x344 pixels in the LWIR.

During two data collection episodes, Battelle's multispectral sensor suite was positioned looking downward from a 110-foot tower location. Imagery was then recorded over a period of several days. Feature-level sensor fusion was later accomplished off-line by feeding coregistered multispectral pixel intensity measurements into tailored data fusion algorithms.
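
As a rough illustration of the feature-level fusion step described above, the Python sketch below stacks coregistered band images into one intensity vector per pixel. The function name stack_bands and the sample intensities are hypothetical, and the sketch assumes that every band has already been resampled onto a common pixel grid, since the thermal imagers sample more coarsely than the CCD cameras.

```python
def stack_bands(bands):
    """Turn B coregistered band images (each H x W nested lists)
    into H*W feature vectors of length B, in row-major pixel order."""
    height, width = len(bands[0]), len(bands[0][0])
    for band in bands:
        if len(band) != height or any(len(row) != width for row in band):
            raise ValueError("all bands must share one pixel grid")
    return [
        [band[r][c] for band in bands]
        for r in range(height)
        for c in range(width)
    ]


# Two hypothetical 2 x 2 bands; the intensities are invented for illustration.
visible = [[10, 12], [200, 11]]
thermal = [[50, 52], [90, 51]]
vectors = stack_bands([visible, thermal])
```

Each element of vectors is the multispectral measurement for one pixel, which is the form of input a pixel-level classifier would consume.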

In the initial multispectral data collection episode conducted in June 1992 at Wright-Patterson AFB, Ohio, Battelle recorded scene imagery of a mobile missile launcher amidst roads, buildings, trees, and grass over a full diurnal cycle. Figure 1 depicts a 35mm photograph of a scene containing the mobile missile launcher and ground clutter imaged simultaneously in six wavebands by the sensor suite. The multispectral imagery of this scene, classified by an artificial neural network, is displayed in Figure 2 with artificial color showing the different classes of objects (target, road, building, trees, and grass) identified by the network algorithm. To further verify robustness of the data fusion algorithm, it was tested on imagery recorded over broad periods of time throughout the day. Results were excellent, indicating that scene classification is achievable despite temporal signature variations.

In conjunction with the Federal Aviation Administration's Runway Detection Program, Battelle collected additional multispectral imagery in a separate measurement episode and processed the data using analysis and fusion techniques to detect a runway at Wright-Patterson AFB, discriminating it from other objects in the area using spectral characteristics. Figure 3 shows a 35mm photograph of a scene imaged in multiple wavebands consisting of a runway, roads, vegetation, and tactical targets at approximately 3 km. Figure 4 displays, in two colors, the binarized result of an unsupervised classification algorithm merging the multispectral data to segment, or detect, the runway.

DATA CLUSTERING ALGORITHMS

As a baseline, the study began with an investigation using an unsupervised maximum likelihood algorithm for clustering the multispectral data of a scene containing the mobile missile launcher. Displaying the clustered pixel classes with artificial color indicated that fusion of data from all six wavebands successfully distinguished the classes of interest. Using only the visual and near-infrared bands, the camouflage-green mobile missile launcher was difficult to discriminate from background trees. Employing data from the two thermal bands, the clustering algorithm confused the mobile missile launcher with paved road, but successfully separated vegetation from man-made objects. This may have been due to a measured 6 degrees C difference in apparent radiant temperature of vegetation in the LWIR band compared with the MWIR band, the MWIR band measuring a higher temperature. Man-made objects indicated higher apparent temperatures in the LWIR band. Combining the data from all six wavebands successfully clustered the classes of interest, distinguishing target pixels from background pixels as well as differentiating between vegetation and man-made objects. Pruning combinations of sensor inputs indicated that the green-filtered visual band and the LWIR thermal band together contained most of the key information for distinguishing the classes.
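
The pruning of sensor input combinations can be pictured as a search over waveband subsets. The Python sketch below only enumerates the candidate subsets and extracts the matching components from each pixel vector; the band labels and helper names are hypothetical, and the clustering run that would score each subset is deliberately left out because the paper does not describe its scoring procedure.

```python
from itertools import combinations

# Hypothetical labels for the six simultaneously imaged wavebands.
BANDS = ["band_1", "band_2", "band_3", "band_4", "band_5", "band_6"]


def band_subsets(bands, max_size=None):
    """Yield every non-empty subset of bands, smallest subsets first."""
    max_size = len(bands) if max_size is None else max_size
    for size in range(1, max_size + 1):
        yield from combinations(bands, size)


def select(vectors, bands, subset):
    """Keep only the components of each pixel vector named in subset."""
    indices = [bands.index(name) for name in subset]
    return [[vec[i] for i in indices] for vec in vectors]


all_subsets = list(band_subsets(BANDS))
pairs = [s for s in all_subsets if len(s) == 2]
reduced = select([[1, 2, 3, 4, 5, 6]], BANDS, ("band_2", "band_6"))
```

With six wavebands there are 63 non-empty subsets, 15 of them pairs, so an exhaustive comparison of sensor combinations is small enough to run in full.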

A version of the Linde-Buzo-Gray clustering algorithm was also used on the same multispectral scene imagery (Linde et al. 1980). The results of this clustering algorithm were similar to results obtained using the other data fusion techniques.
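
For readers unfamiliar with the technique, the following Python sketch shows the generic splitting form of Linde-Buzo-Gray codebook design: start from the centroid of all pixel vectors, split each codeword into a perturbed pair, and refine with nearest-codeword assignment and centroid updates. It is not the specific version used in the study, whose details are not given; the additive perturbation eps, the fixed number of refinement passes, and the sample data are assumptions made for illustration.

```python
def nearest(vec, codebook):
    """Index of the codeword closest to vec (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(vec, codebook[k])),
    )


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def lbg(vectors, size, eps=0.001, passes=20):
    """Grow a codebook by repeated splitting. The codebook doubles at each
    split, so the result holds the smallest power of two >= size codewords."""
    codebook = [centroid(vectors)]
    while len(codebook) < size:
        # Split every codeword into a slightly perturbed pair.
        codebook = [[x + s * eps for x in cw] for cw in codebook for s in (1, -1)]
        for _ in range(passes):
            cells = [[] for _ in codebook]
            for vec in vectors:
                cells[nearest(vec, codebook)].append(vec)
            # An empty cell keeps its previous codeword.
            codebook = [centroid(cell) if cell else cw
                        for cell, cw in zip(cells, codebook)]
    return codebook


# Invented two-band pixel vectors forming two well-separated groups.
pixels = [[10, 50], [12, 52], [11, 51], [200, 90], [202, 92], [201, 91]]
codebook = lbg(pixels, size=2)
```

Each pixel is then labeled with the index of its nearest codeword, which plays the role of its cluster or class.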

A fuzzy clustering algorithm was also applied to the multispectral scene imagery containing the mobile missile launcher. Like the other techniques, this algorithm "carved" different object regions within the multispectral feature space, except that it allowed for overlapping class possibilities among the data clusters. In other words, a specific point in the multidimensional pattern recognition feature space may have been simultaneously designated as belonging to more than one cluster or object class, with weighted possibility factors pertaining to the "degree of belonging" to each class. However, the researcher must ultimately select a defuzzifying threshold to apply when making a final classification decision.

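The "degree of belonging" and the defuzzifying threshold can be sketched as follows. The paper does not state which fuzzy clustering formulation was used, so this Python example assumes the standard fuzzy c-means membership formula with fuzzifier m; the cluster centers, the threshold value, and the function names are hypothetical.

```python
import math


def memberships(vec, centers, m=2.0):
    """Fuzzy c-means style memberships of vec in each cluster (they sum to 1)."""
    dists = [math.dist(vec, center) for center in centers]
    if 0.0 in dists:
        # The pixel coincides with a center: full membership there.
        return [1.0 if d == 0.0 else 0.0 for d in dists]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((dk / dj) ** power for dj in dists) for dk in dists]


def defuzzify(weights, threshold=0.6):
    """Return the index of the strongest class, or None when no class
    reaches the chosen defuzzifying threshold."""
    best = max(range(len(weights)), key=weights.__getitem__)
    return best if weights[best] >= threshold else None


centers = [[0.0, 0.0], [10.0, 0.0]]      # invented cluster centers
u_near = memberships([1.0, 0.0], centers)  # close to the first center
u_mid = memberships([5.0, 0.0], centers)   # halfway between the centers
```

Under these assumptions a pixel near one center is assigned to that class, while a pixel midway between two centers has equal memberships and is left unclassified by the threshold.
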
NEURAL NETWORK ARCHITECTURES AND TRAINING RESULTS

The scene employed in the initial investigation consisted of a mobile missile launcher parked on a grassy area in front of a grove of trees (Figure 1). Additional objects within the scene consisted of paved areas and buildings. To assess classification robustness, the tailored neural networks were trained and tested on near-simultaneous data from scenes imaged at different times of day. Learning coefficients, number of computational nodes, and node transfer functions were varied to optimize performance of a Multilayer Perceptron network and a modified Learning Vector Quantization (LVQ) network employing "conscience" and added training noise. Both architectures trained successfully, converging within several thousand training iterations to a 98% pixel classification accuracy on separate test data. As shown in Figure 2, trained network outputs of classified pixels from a full scene were displayed with artificial color to pictorially convey the near-perfect classification of the five classes of objects (mobile missile launcher, buildings, paved road, trees, and grass). Network architectures were developed and tailored with the NeuralWorks Professional II/PLUS software package by NeuralWare, Inc. Image pixel data were processed with the Geographic Resources Analysis Support System (GRASS), a Geographical Information System (GIS) with image processing capabilities.
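
As background on the LVQ family of networks, the Python sketch below shows one training step of the basic LVQ1 rule: the prototype nearest to a training pixel moves toward it when their class labels agree and away from it otherwise. This is the textbook rule only. It omits the "conscience" mechanism and added training noise of the modified network described above, it is not the NeuralWorks implementation, and the prototypes, labels, and learning rate are invented.

```python
def lvq1_step(prototypes, labels, sample, target, rate=0.1):
    """Apply one LVQ1 update in place and return the winning prototype index."""
    winner = min(
        range(len(prototypes)),
        key=lambda k: sum((p - x) ** 2 for p, x in zip(prototypes[k], sample)),
    )
    # Attract the winner on a label match, repel it on a mismatch.
    sign = 1.0 if labels[winner] == target else -1.0
    prototypes[winner] = [
        p + sign * rate * (x - p) for p, x in zip(prototypes[winner], sample)
    ]
    return winner


prototypes = [[0.0, 0.0], [10.0, 10.0]]  # invented two-band prototypes
labels = ["grass", "target"]
winner = lvq1_step(prototypes, labels, [2.0, 0.0], "grass")
```

Repeating this step over the labeled training pixels, usually with a decaying learning rate, places the prototypes so that nearest-prototype lookup classifies new pixels.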

The initial training data set consisted of 750 vectors (pixels), 150 pixels for each of the five classes. The corresponding test data set contained 250 vectors, 50 pixels for each class. The trained neural networks were later used to classify all 233,000 pixels imaged in a full scene, of which approximately 5600 were tactical target pixels. With only limited training, approximately 90% of the on-target pixels were correctly classified and over 99% of the background pixels were correctly classified as not being on target. The contiguity of the on-target pixels offers an advantage in that a few misclassified target pixels will not detract from the target segmentation decision. Image processing techniques, such as window averaging, were implemented to post-process the pixel-by-pixel target/nontarget classification output of the LVQ network. This post-processing served to "clean up" the few non-target pixels incorrectly classified as being on target.
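
Taking the reported figures at face value, 90% of roughly 5600 target pixels is about 5040 pixels, and 1% of the remaining 227,400 background pixels bounds the false alarms at roughly 2270 pixels before post-processing. The Python sketch below illustrates one simple form of window averaging on a binary target mask; the window radius, the decision threshold, and the sample mask are assumptions, since the paper does not give the parameters that were used.

```python
def window_average(mask, radius=1, threshold=0.5):
    """Replace each pixel of a 0/1 mask by 1 when the mean of its
    (2*radius+1)-square neighborhood, clipped at the image border,
    exceeds threshold, and by 0 otherwise."""
    height, width = len(mask), len(mask[0])
    cleaned = []
    for r in range(height):
        row = []
        for c in range(width):
            window = [
                mask[rr][cc]
                for rr in range(max(0, r - radius), min(height, r + radius + 1))
                for cc in range(max(0, c - radius), min(width, c + radius + 1))
            ]
            row.append(1 if sum(window) / len(window) > threshold else 0)
        cleaned.append(row)
    return cleaned


# Invented mask: a target block with one missed pixel in its center,
# plus one isolated false alarm in the top-right corner.
mask = [
    [0, 0, 0, 0, 1],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
cleaned = window_average(mask)
```

In this example the isolated false alarm is removed and the missed center pixel is filled in. The same averaging also trims the corners of the block, so the window size trades false-alarm suppression against loss of detail at target edges.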

SUPERVISED CLASSIFICATION

Another algorithm applied to the multispectral data set was a supervised classification algorithm. The supervised algorithm generated a pixel intensity histogram in each waveband for pixels sitting on a user-designated object (e.g., a target) within an imaged scene. Thresholds were selected near the tails of the histograms. Pixels were then classified as target pixels if the intensity values in each waveband fell within the established thresholds on the histograms.
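
The thresholding scheme described above can be sketched in Python as follows. The rule for placing the thresholds (discarding a fixed number of the lowest and highest training values in each waveband) is an assumption, as are the function names and the training intensities; the paper says only that thresholds were selected near the tails of the histograms.

```python
def band_thresholds(target_pixels, trim=1):
    """Per-band (low, high) limits taken from user-designated target pixels,
    ignoring the trim smallest and trim largest values in each band."""
    limits = []
    for values in zip(*target_pixels):
        ordered = sorted(values)
        if 2 * trim >= len(ordered):
            raise ValueError("trim removes every training value")
        limits.append((ordered[trim], ordered[-1 - trim]))
    return limits


def is_target(pixel, limits):
    """A pixel is on target only if every band falls inside its limits."""
    return all(low <= value <= high for value, (low, high) in zip(pixel, limits))


# Invented two-band training pixels taken from a designated object.
training = [[100 + i, 40 + i] for i in range(20)]
limits = band_thresholds(training, trim=2)
```

Because a pixel must pass the test in every waveband, each added band can only narrow the accepted region of feature space, which is how the extra spectral information suppresses background false alarms.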

CLASSIFICATION ROBUSTNESS FOR TEMPORAL SIGNATURE VARIATIONS

To further verify classification robustness, algorithms were tested on imagery recorded over broad periods of time throughout the day. Results were excellent, indicating that scene classification is achievable despite temporal signature variations.

WAVEBAND SALIENCY ANALYSIS

Waveband saliency analyses were performed to determine which spectral bands contained the bulk of the discriminating information for discerning objects in the scenes. Equally important, these analyses may be used to determine the optimum subset of wavebands for discriminating the problem phenomenology. Histograms and scatter plots of the multispectral data were used. For system implementation, the benefits of waveband saliency analyses and judicious pruning of features are clear: a reduction in the number of system sensors minimizes cost, weight, complexity, and processing requirements.
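
The study relied on histograms and scatter plots for this analysis. As a numerical stand-in for inspecting per-band histograms, the Python sketch below ranks wavebands by a Fisher-style separability score (squared difference of class means divided by the sum of class variances). This metric is a swapped-in illustration and not the authors' method, and the band names and intensities are invented.

```python
from statistics import mean, pvariance


def fisher_score(target_values, background_values):
    """Squared mean separation over summed variance for one waveband."""
    gap = (mean(target_values) - mean(background_values)) ** 2
    spread = pvariance(target_values) + pvariance(background_values)
    if spread == 0:
        return float("inf") if gap else 0.0
    return gap / spread


def rank_bands(target_pixels, background_pixels, names):
    """Band names ordered from most to least separable."""
    scores = {
        name: fisher_score(t, b)
        for name, t, b in zip(names, zip(*target_pixels), zip(*background_pixels))
    }
    return sorted(scores, key=scores.get, reverse=True)


# Invented two-band samples: band_1 overlaps, band_2 separates the classes.
target = [[100, 50], [102, 90], [98, 70]]
background = [[101, 10], [99, 12], [100, 11]]
ranking = rank_bands(target, background, ["band_1", "band_2"])
```

A single-band score of this kind ignores correlations between bands, so it complements rather than replaces the scatter-plot inspection and subset pruning described in the text.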

ONGOING EFFORTS

Results of the investigation are currently being implemented in a near-real-time hardware system. This system will consist of a field-portable sensor suite feeding multispectral measurements through a tailored data fusion algorithm to provide an enhanced visualization output on a flat-panel, color video display. This field-portable hardware may be used to further develop and validate sensor fusion algorithms for different applications.

REFERENCES

(1) DeRouin, Ed and Joe Brown. 1992, "Hyperspectral Data Fusion for Target Detection and Segmentation Using the Intel ETANN," 1992 Digest of Papers, Government Microcircuit Applications Conference: 39-42.

(2) Geographic Resources Analysis Support System (GRASS) 4.0 (Software), 1991, U.S. Army Construction Engineering Research Laboratory.

(3) Kuzma, Thomas J. and Laurence E. Lazofson. 1993, "Automatic Target Detection Using Multispectral Sensor Fusion Implemented with Neural Networks," 1993 Automated Mission Planning Society Symposium, San Antonio, TX.

(4) Kuzma, Thomas J. and Laurence E. Lazofson. 1993, "Scene Classification and Segmentation Using Multispectral Sensor Fusion," 1993 Meeting of the IRIS Specialty Group on Passive Sensors, Applied Physics Laboratory/Johns Hopkins University, Laurel, MD.

(5) Lazofson, Laurence E. and Thomas J. Kuzma. 1993, "Scene Classification and Segmentation Using Multispectral Sensor Fusion Implemented with Neural Networks," Sixth National Symposium on Sensor Fusion, Orlando, FL.

(6) Lazofson, Laurence E. and Thomas J. Kuzma. 1993, "Scene Classification and Segmentation Using Multispectral Sensor Fusion Implemented with Neural Networks," SPIE International Symposium on Optical Engineering and Photonics in Aerospace and Remote Sensing, Orlando, FL.

(7) Linde, Yoseph et al. 1980, "An Algorithm for Vector Quantizer Design," IEEE Transactions on Communications, Vol. COM-28, No. 1.

(8) NeuralWorks Professional II/PLUS Software, 1992, NeuralWare, Inc., Pittsburgh, PA.

(9) Rogers, Steven K. et al. 1990, An Introduction to Biological and Artificial Neural Networks. Bellingham, Washington: SPIE.

(10) Seldin, J.H. and J.N. Cederquist. 1992, "Classification of Multispectral Data: A Comparison Between Neural Network and Classical Techniques," Government Neural Network Applications Workshop: 79-83.

(11) Sims, S. Richard F. and Belur V. Dasarathy. 1992, "Automatic Target Recognition Using a Passive Multisensor Suite," Optical Engineering, 31: 2584-2593.
