Combined use of NFI sample plots and Landsat TM data to provide forest information on municipality level
Arnt Kristian Gjertsen and Stein Tomter
Norwegian Institute of Land Inventory
P.O.Box 115, Raveien 9 N-1431 AAS, Norway
Tel: +47 64 94 97 16 / Fax: +47 64 94 97 86, Email: arnt.gjertsen@nijos.no
Erkki Tomppo
Finnish Forest Research Institute
Unioninkatu 40 A FIN-00170 HELSINKI, Finland
Tel: +358 9 8570 5340 / Fax: +358 9 625 308, Email: erkki.tomppo@metla.fi
Abstract
The Norwegian Institute of Land Inventory and the Finnish Forest Research Institute have tested a non-parametric, k-nearest neighbour method for calculating forest statistics and maps based on data from the National Forest Inventory (NFI), remotely sensed data, and digital maps. The primary objective was to provide forest information for smaller areas than is possible with sample plots alone.
The test area was a 140 km² municipality in southeast Norway with a total forest area of 87 km². A total of 491 NFI sample plots within a 150 km by 150 km reference area around the municipality were used as the reference data set. Depending on the attributes being observed, the NFI plots are 250 m² or 1000 m² in size. A mid-summer Landsat Thematic Mapper (TM) image covering the municipality and the reference data set was acquired. The image was orthorectified using a digital elevation model and resampled to 25 m x 25 m pixels. An independent sample survey of the municipality was used to assess the accuracy.
TM pixels closest to the centre locations of the reference plots were tied to the plots. Further, a land type map was used to mask out forestland from other land covers. For each TM pixel inside the forest mask, the three spectrally closest reference plots were found using Euclidean distance. The plots were weighted using the reciprocal of the squared distance, and the weights were normalised making their sum equal to 1. After running the process over all TM pixels inside the forest mask, all plots in the reference data set had weights attached reflecting spectrally how representative they were for the municipality. A fundamental assumption of the method is that spectral similarity implies similarity in forest condition; therefore, the success of the method relies on the correlation between the spectral and biotic attributes.
An estimate was defined as inside the tolerance if it fell within a 95% confidence interval around the independent control estimate. Promising results were obtained for area estimates of the following attributes: dominant tree species, top height, number of conifers, total number of trees, and mean height of young forest. When forestland was grouped into five maturity classes, area and volume estimates were inaccurate; but when generalising the forestland into three development classes, area estimates and volume estimates of spruce were promising. One possible improvement of the method by using information on site index from the land type map database will be tested in the near future.
Introduction
The National Forest Inventory (NFI) of Norway is based on sample plots laid out as a regular grid with 3 km between plots in the east-west and north-south directions. Depending on the attributes being observed, the plots are 250 m² or 1000 m² in size. The inventory cycle is five years, and each year one fifth of the plots are selected to be field measured. The plots are selected such that representative statistics are provided yearly.
The permanent plots are too sparse to provide accurate statistics for each of Norway’s 20 counties. For this reason, the permanent plots are supplemented by temporary plots. One fifth of the counties are covered with temporary plots in an inventory cycle; i.e. counties are surveyed every 15th year. The NFI cannot provide accurate statistics for administrative regions smaller than counties, i.e. municipalities. A typical municipality in southeast Norway is covered by fewer than 20 permanent NFI plots. Aggregated data from forest management plans are usually not suitable or available to provide statistics or maps for municipalities.
The Norwegian Institute of Land Inventory has a defined objective to find new information applications of the NFI database. One objective is to estimate forest conditions of municipalities and other small regions, another to produce forest maps to show the pattern of the forest landscape in addition to the statistical information. It is of interest not only to quantify a certain forest type or condition inside a survey region, but also how it is distributed and located. The main idea supporting these goals is to utilise NFI plots from a wide reference area around the survey region (e.g. municipality) together with map and image data that have correlative relationships with the forest attributes of the plots. An important purpose of the image and map data is to measure how typical or similar the plots are relative to the survey region, and the idea is to express this similarity by calculating new area weights for the reference plots.
Method and material
A multi-source forest inventory (MSFI) method has been developed by Tomppo et al. that incorporates these ideas. Maps are used to mask out forest from non-forest areas in the survey region. In addition, maps may also be used to stratify the survey region and the plots in the reference area, e.g. into site classes or soil classes. Image data from earth observation satellites (e.g. Landsat Thematic Mapper) covering both the reference area and the survey region are tied to all reference plots, and image data are used to calculate area weights for the reference plots. Digital elevation models (DEM) can be used for several purposes, e.g. to correct image values for the terrain effect and to stratify the NFI sample plots into altitude zones.
Given all the necessary data sets for the inventory, MSFI employs a non-parametric, k-nearest-neighbour (k-nn) classification or estimation method based on three components: (a) a defined neighbourhood for each query pixel, (b) an algorithm that finds all the training pixels meeting the neighbourhood definition, and (c) a method to calculate an estimate based on the training pixels in the neighbourhood. In our case the neighbourhood was defined as the k spectrally closest reference plots, and the Euclidean metric was used to find the plots defining the neighbourhood. In order to calculate an estimate, weights were calculated for all plots in the neighbourhood. The weights were defined as the reciprocal of the squared Euclidean distance; thus, the closest neighbourhood plots received higher weights than the more distant plots. The weights were normalised so that their sum equalled 1. This process was repeated for all p pixels within the forest mask of the survey region; the result was a weight matrix in which the number of rows corresponded to the number of pixels in the forest mask and the number of columns to the number of reference plots. For each plot, a total weight C was calculated as the sum over all weights w produced by all p pixels for the particular plot; e.g. for plot j the weight was calculated as follows:
$$C_j = \sum_{i=1}^{p} w_{ij}$$
where $w_{ij}$ is the weight produced by pixel i for plot j, which is zero when plot j is not among the k nearest neighbours of pixel i.
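The weighting and accumulation steps described above can be sketched in Python. This is only an illustrative sketch, not the software used in the study: the function names `knn_weights` and `plot_totals`, the small constant `eps` that guards against division by zero, and the toy band values are all assumptions made for the example.

```python
def knn_weights(pixel, plots, k=3, eps=1e-12):
    """Normalised inverse squared-distance weights for the k nearest plots.

    pixel: sequence of band values; plots: list of band-value sequences.
    Returns {plot_index: weight}, with the weights summing to 1.
    """
    # Squared Euclidean distance in spectral space to every reference plot.
    d2 = [sum((a - b) ** 2 for a, b in zip(pixel, plot)) for plot in plots]
    # Indices of the k spectrally closest plots.
    nearest = sorted(range(len(plots)), key=d2.__getitem__)[:k]
    # eps avoids division by zero for an exact spectral match (our assumption).
    raw = {j: 1.0 / (d2[j] + eps) for j in nearest}
    total = sum(raw.values())
    return {j: w / total for j, w in raw.items()}


def plot_totals(pixels, plots, k=3):
    """Accumulate, for each plot, the weights it receives from all pixels."""
    C = [0.0] * len(plots)
    for pixel in pixels:
        for j, w in knn_weights(pixel, plots, k).items():
            C[j] += w
    return C


# Toy data (hypothetical): four reference plots and three forest pixels,
# each described by two spectral bands.
plots = [(10, 20), (12, 22), (30, 40), (31, 39)]
pixels = [(11, 21), (29, 41), (13, 23)]
C = plot_totals(pixels, plots, k=3)
print(C, sum(C))
```

Because each pixel distributes a total weight of 1 over its neighbours, the plot totals add up to the number of pixels processed, here 3. For a full TM scene a spatial index or vectorised distance computation would be preferable to the nested loops used in this sketch.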
The weights C for all plots in the reference area sum to the total number of pixels p, which when multiplied by the pixel area $A_p$ equals the forest area of the survey region. The weights C for the individual NFI plots can be interpreted as area weights when multiplied by $A_p$. Thus, calculation of statistical information is based on the new area weights C. To produce maps, estimates are made for each individual image pixel based on its neighbourhood as follows:
$$\hat{y}_p = \sum_{j \in I_p} w_{pj}\, y_j$$
where $\hat{y}_p$ signifies the estimated quantity for pixel p (here p denotes an individual pixel), $I_p$ the set of plots belonging to the neighbourhood of pixel p, $w_{pj}$ the weight produced by pixel p for plot j, and $y_j$ the corresponding quantity observed on plot j. Mode values are used instead of mean values for qualitative attributes.
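The per-pixel estimation described above can be sketched as follows. The text does not say how the mode is computed for qualitative attributes; taking the class with the largest summed weight is an assumption made here, as are the function names and the made-up plot values.

```python
from collections import defaultdict


def estimate_continuous(weights, values):
    """Weighted mean of a quantitative attribute over the neighbourhood."""
    return sum(w * values[j] for j, w in weights.items())


def estimate_categorical(weights, labels):
    """Weighted mode: the class with the largest summed weight (assumption)."""
    score = defaultdict(float)
    for j, w in weights.items():
        score[labels[j]] += w
    return max(score, key=score.get)


# Hypothetical neighbourhood of one pixel: plot index -> normalised weight.
weights = {0: 0.40, 3: 0.35, 7: 0.25}
# Made-up plot observations, for illustration only.
volume = {0: 100.0, 3: 200.0, 7: 50.0}          # e.g. m3 per hectare
species = {0: "spruce", 3: "pine", 7: "pine"}

print(estimate_continuous(weights, volume))
print(estimate_categorical(weights, species))
```

With these weights the two pine plots together outweigh the single, closer spruce plot, so the weighted mode differs from the class of the nearest neighbour. An unweighted mode over the k plots would be an equally defensible reading of the text.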
The test area was a 140 km² municipality in southeast Norway located about 40 km south of Oslo. The forest area is 87 km². An independent inventory based on 1019 sample plots was conducted in 1997, the same year as the experiment. This inventory served as a control survey against which the MSFI method was tested. About 650 NFI plots were precisely located using differential Global Positioning System (GPS) data. A TM image from 9 July 1997 was acquired. The image was orthorectified using a digital elevation model and resampled to 25 m x 25 m pixels. Figure 1 shows the location of the test area and some of the data sets involved. It was not possible to acquire a TM scene that covered all the GPS-located NFI plots; therefore, only 491 plots were actually used. Of these, 19% were field measured in 1994, 15% in 1995, 20% in 1996, and 46% in 1997. A large-scale land type map was used to create a forest mask for the municipality. In addition, a road map was used to remove pixels from the forest mask that were considered mixed pixels. A DEM with 100 m cell size was used in the orthorectification process, in a cosine correction for the terrain effect on illumination, and in stratification of the NFI plots into altitude zones (±125 m) during the estimation process. A DEM with higher resolution would have been preferred but was not available for the project.
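The exact form of the cosine correction used in the study is not given. As an illustration only, the commonly used textbook form scales each pixel value by the ratio of the cosine of the solar zenith angle to the cosine of the local illumination angle, with slope and aspect derived from the DEM. The function name, the lower bound `min_cos_i`, and the sun angles in the example are assumptions.

```python
import math


def cosine_correction(value, slope_deg, aspect_deg,
                      sun_zenith_deg, sun_azimuth_deg, min_cos_i=0.1):
    """Scale a pixel value to what a horizontal surface would have reflected.

    cos(i) = cos(z) cos(s) + sin(z) sin(s) cos(sun_azimuth - aspect)
    corrected = value * cos(z) / cos(i)
    """
    z = math.radians(sun_zenith_deg)
    s = math.radians(slope_deg)
    rel_az = math.radians(sun_azimuth_deg - aspect_deg)
    cos_i = (math.cos(z) * math.cos(s)
             + math.sin(z) * math.sin(s) * math.cos(rel_az))
    # Clamp weakly lit slopes to avoid extreme amplification (our assumption).
    cos_i = max(cos_i, min_cos_i)
    return value * math.cos(z) / cos_i


# Hypothetical sun position (zenith 35 degrees, azimuth 150 degrees).
flat = cosine_correction(80.0, 0.0, 0.0, 35.0, 150.0)
facing_sun = cosine_correction(80.0, 20.0, 150.0, 35.0, 150.0)
facing_away = cosine_correction(80.0, 20.0, 330.0, 35.0, 150.0)
print(flat, facing_sun, facing_away)
```

Flat terrain is left unchanged, sun-facing slopes are darkened, and slopes facing away from the sun are brightened. With a 100 m DEM and 25 m pixels, slope and aspect are much coarser than the image, which limits how well any such correction can work.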
All forest attributes were grouped into discrete class intervals; thereafter, estimates of area or volume or both were calculated for each class interval. Evaluation was performed by simple comparison of the results and by use of statistical models. From the independent control survey accurate estimates were calculated for all forest attributes, and 95% confidence intervals were estimated for each class interval. The results from MSFI were tested against these confidence intervals. If an MSFI estimate fell within its corresponding confidence interval, no significant difference was observed between the two independent estimates. For those attributes where MSFI predicted at least 75% of the class intervals within the confidence limits, the match was considered promising. In addition, estimates were calculated based on the reference data set alone without the use of the TM image and the k-nn method. The purpose was to test if MSFI was an improvement.
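The evaluation procedure can be illustrated with a short sketch: class areas are obtained by multiplying the summed plot weights C in each class by the pixel area, and the share of class intervals falling inside the control survey's 95% confidence intervals is compared with the 75% threshold. The function names, plot weights, classes and interval bounds below are made up for illustration; only the 25 m pixel size and the 75% rule come from the text.

```python
def class_areas(C, plot_class, pixel_area_ha):
    """Area per class: pixel area times the summed plot weights in that class."""
    areas = {}
    for c_j, cls in zip(C, plot_class):
        areas[cls] = areas.get(cls, 0.0) + c_j * pixel_area_ha
    return areas


def share_within_ci(estimates, ci):
    """Fraction of classes whose estimate lies inside the control 95% CI."""
    hits = sum(ci[c][0] <= est <= ci[c][1] for c, est in estimates.items())
    return hits / len(estimates)


PIXEL_AREA_HA = 25 * 25 / 10_000   # one 25 m x 25 m pixel = 0.0625 ha

# Made-up plot weights, plot classes and control intervals (hectares).
C = [400.0, 250.0, 350.0, 600.0]
plot_class = ["spruce", "pine", "spruce", "broadleaves"]
control_ci = {
    "spruce": (40.0, 50.0),
    "pine": (10.0, 20.0),
    "broadleaves": (20.0, 30.0),
}

areas = class_areas(C, plot_class, PIXEL_AREA_HA)
share = share_within_ci(areas, control_ci)
promising = share >= 0.75
print(areas, share, promising)
```

In this toy case two of the three classes fall inside their intervals, so the attribute would not count as promising under the 75% rule. Volume estimates follow the same pattern, with each plot's weight additionally multiplied by its observed volume per unit area.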
Results and discussion
The correlation coefficients between TM data and different forest attributes are presented in Table 1. As can be seen, the correlation coefficients are significant and relatively high for total volume and volume of spruce (Picea abies) and relatively low for volume of pine (Pinus sylvestris) and birch (Betula sp.). Age and maturity classes also show significant and relatively high correlation coefficients. Tokola et al. found similar values for plotwise correlation coefficients for similarly small plots (i.e. ≤ 1000 m²). However, these plotwise correlation coefficients are lower than those that have been found for standwise correlation coefficients. For instance, Ardö found correlation coefficients for volume that ranged between -0.48 and -0.79 for the six TM bands in the optical region.
In total, 28 different area and volume estimates were tested. Area estimates of five attributes gave promising results: tree species (3 classes), top height (5 classes), number of conifers (9 classes), total number of trees (9 classes), and mean height of young forest (9 classes). The method did not give promising results for the following important forest attributes: site index, maturity classes (five classes: forest under regeneration, regenerated areas and young forest, young thinning stands, advanced thinning stands, mature forest), vegetation types, and crown density. Further, volume and growth estimates of spruce, pine, broadleaves, and all tree species combined, grouped into the five maturity classes, gave inaccurate results. However, total volume of spruce was satisfactorily estimated at 744,000 m³ vs. 724,000 m³ from the control survey and 591,000 m³ from the estimation based on the reference data set alone. The total volumes of pine and broadleaves were clearly overestimated. A simplification of the maturity classes into three development classes resulted in great improvement of the area estimates (Figure 2) and volume estimates of spruce (Figure 3). In Figure 4, the area estimates for tree species are presented. The vertical lines indicate the 95% confidence intervals, and it can be seen that the MSFI estimates for all three species are within the intervals.
The MSFI map with the five maturity classes was compared with a corresponding map from the stand inventory that took place in 1996. The two maps show high overall agreement. However, the MSFI map shows much higher local variation, which may be either an artefact of the method or natural variation. This question will be investigated further.
Conclusion
There is no known theoretical method to calculate the expected accuracy of MSFI estimates; therefore, the accuracy has to be established through empirical experiments. In this study, MSFI estimates for a municipality were compared with an independent and accurate control survey.
Promising results were observed for five of the 28 tested area and volume estimates. To obtain satisfactory results for maturity classes, these had to be generalised into three development classes. Volume of spruce was inaccurately predicted when the forestland was grouped into five maturity classes, and this is related to the inaccurate area estimates. However, when forestland was generalised, area estimates and estimates of volume of spruce were strongly improved and very close to the control survey estimates. Also, the estimate of total volume of spruce was close to the control survey estimate, but the total volumes of pine and broadleaves were clearly overestimated.
A fundamental assumption of MSFI is that spectral similarity implies similarity in forest condition; therefore, the method relies heavily on correlative relationships between the biotic attributes and the spectral variables. When the relationship is weak, a good result cannot be expected. The method also relies heavily on the presence of reference plots whose forest conditions are similar to those of the query pixels in the survey area. Thus, the success of MSFI is a function of two factors: (a) correlation between spectral variables and forest attributes, and (b) representativeness of the reference plots with respect to the forest conditions in the survey region.
One key to improvement may be more extensive use of auxiliary information in defining the neighbourhood of each query pixel. The use of site index maps, if available, could be one such improvement. This will be tested in the near future.
Acknowledgements
Many thanks to Mr. Matti Katila and Ms. Helena Mäkelä at the Finnish Forest Research Institute for their help during this project.
References