Land Use Land Cover Classification Maps

Using Multispectral Classification for Thematic Information Extraction

Tools: Catalyst

Skills: Classification, Training

Creating a Land Use Land Cover (LULC) classification map using remotely sensed images of Vancouver, B.C.

Multispectral classification is a process that sorts pixels into classes based on training areas. You start by telling the algorithm that areas containing homogeneous pixel values represent certain features (snow, trees, water, urban), and the software then processes the image into coloured groups displayed as defined classes. This is very useful for creating Land Use Land Cover (LULC) maps.
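The core idea can be sketched in a few lines of Python. This is a minimal stand-in for what the software does, not the Catalyst workflow itself: the DN values, class names, and signatures below are all hypothetical, and each pixel is simply assigned the class whose mean spectrum ("signature") it sits closest to.

```python
import numpy as np

# Toy 3-band image: 4 pixels x 3 bands (hypothetical DN values).
pixels = np.array([
    [30, 90, 20],   # vegetation-like spectrum
    [10, 15, 60],   # water-like spectrum
    [32, 88, 22],
    [12, 14, 58],
], dtype=float)

# Per-class mean spectra ("signatures") derived from training areas (hypothetical).
signatures = {
    "trees": np.array([31.0, 89.0, 21.0]),
    "water": np.array([11.0, 15.0, 59.0]),
}

# Assign each pixel the class whose signature is nearest (Euclidean distance).
names = list(signatures)
means = np.stack([signatures[n] for n in names])            # (classes, bands)
dists = np.linalg.norm(pixels[:, None, :] - means, axis=2)  # (pixels, classes)
labels = [names[i] for i in dists.argmin(axis=1)]
print(labels)   # ['trees', 'water', 'trees', 'water']
```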

For this lab, my goal was to create a land use land cover classification map and export the classes as a shapefile along with an attribute table.


I loaded the ETM+ bands 5, 4, and 2 into Red, Green, and Blue respectively and created an empty channel to set up my training areas in. Training involves drawing polygons around groups of similar pixels of known land cover types. I was fortunate to have grown up in Vancouver, so understanding the various types of land cover in the region was no issue for me.


You can see from the image on the left where I have selected groups of pixels and coloured them into classes. For example, the dark blue squares on the Capilano Reservoir are areas I know are clear water. Brown squares are coniferous trees, orange are low-density urban areas, and cyan are highly built-up urban areas. In each case, the Training Editor function in the software sets the pixels in these polygons to a new value, e.g. blue = 1, brown = 2, and so on. Later, the algorithm will “learn” that all pixels with spectral values similar to a training class should be reclassified into that class.

Note: On my first attempt at this image, my polygons were too large, as I would later discover. Smaller, more numerous training areas are better.

Setting up the training editor

I decided on 10 different classes of land cover, as you can see in the image to the left. The number in the value column will be assigned to all pixels within each polygon that I draw. For example, if I draw a polygon around an area that I know to be grass and the Digital Number (DN) values of the pixels in that polygon are x, y, and z, the program will find all pixels in the whole image with the values x, y, and z and assign them the value 1 and the colour green (telling us that all of these areas are grass).
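The value-assignment behaviour described above is essentially a lookup table from DN values to class numbers. A minimal sketch, with entirely hypothetical DN values and class numbers:

```python
import numpy as np

# Single-band toy image of DN values (hypothetical).
image = np.array([
    [40, 40, 75],
    [75, 40, 90],
])

# Training polygons told us: DN 40 belongs to grass (value 1),
# DNs 75 and 90 to water (value 2). 0 would mean unclassified.
dn_to_class = {40: 1, 75: 2, 90: 2}

# Build a lookup table over the 8-bit DN range and apply it to every pixel at once.
lut = np.zeros(256, dtype=np.uint8)
for dn, cls in dn_to_class.items():
    lut[dn] = cls
classified = lut[image]
print(classified)
# [[1 1 2]
#  [2 1 2]]
```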


TRAINING SITE EVALUATION

After my first round of training I evaluated the training sites by viewing the scatter plot (left). Each ellipse represents one of the 9 training areas and the pixel values contained within it; the x in the middle is the mean value. The signature separability values are not good: there is too much overlap between the classes. For example, there will be confusion as to whether pixels are paved surfaces or sediment-laden water. At this point I returned to my training polygons and refined them, making them smaller and more numerous to get a larger sample size.

Note: This process should be repeated until you achieve the desired result. In order to stay within my allotted time frame, I could only do one round of retraining.

Here you can see the results of my refined training values. By creating smaller, more numerous training sites I achieved higher signature separability, most noticeable where the Channel 4 value of 60 meets the Channel 3 value of 75. There is still some overlap in these areas, which makes sense: distinguishing between paved surfaces and the built-up urban environment is very difficult. Revisiting the classes and possibly combining these two would have been a suitable solution.

Histograms showing the unimodal distribution of DN values for grass and deciduous trees

Testing signature separability: a zero indicates complete overlap between the signatures of two classes; a two indicates complete separation between them; anything in between is partial overlap. Invalid signatures should be corrected, either by collecting additional training pixels or by merging them with other signatures. In the example above we can see poor separability between the paved surfaces class and the built-up environment class, as well as between the deciduous trees class and the urban agriculture class. Both are to be expected.
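One common separability measure on this 0-to-2 scale is the Jeffries-Matusita distance, computed from each class's mean vector and covariance matrix. Below is a sketch with made-up two-band signatures (I can't confirm which measure Catalyst uses internally): a well-separated pair scores near 2, while two classes with nearly identical means, like paved surfaces and sediment-laden water, score much lower.

```python
import numpy as np

def jeffries_matusita(m1, c1, m2, c2):
    """JM distance between two Gaussian class signatures, scaled 0..2."""
    c = (c1 + c2) / 2.0
    diff = m1 - m2
    term1 = diff @ np.linalg.inv(c) @ diff / 8.0
    term2 = 0.5 * np.log(np.linalg.det(c) /
                         np.sqrt(np.linalg.det(c1) * np.linalg.det(c2)))
    b = term1 + term2                    # Bhattacharyya distance
    return 2.0 * (1.0 - np.exp(-b))

# Hypothetical 2-band (mean, covariance) signatures.
water = (np.array([10.0, 60.0]), np.eye(2) * 4.0)
trees = (np.array([80.0, 20.0]), np.eye(2) * 4.0)
paved = (np.array([12.0, 58.0]), np.eye(2) * 4.0)

print(round(jeffries_matusita(*water, *trees), 3))   # 2.0   (complete separation)
print(round(jeffries_matusita(*water, *paved), 3))   # 0.442 (heavy overlap)
```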

SIGNATURE SEPARABILITY


PREVIEWING THE CLASSIFICATION

Results of running the Maximum Likelihood Classification. Too many of the areas were classified as grass. I only wish Vancouver had that much grass.
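Maximum likelihood assigns each pixel to the class whose Gaussian model (mean plus covariance from training) makes it most probable. A minimal sketch with hypothetical signatures; the second example hints at one way over-classification like the grass result can happen, since a class with a very wide (high-variance) signature can win pixels far from its mean:

```python
import numpy as np

def ml_classify(pixel, classes):
    """Pick the class maximizing the Gaussian log-likelihood (equal priors)."""
    best, best_g = None, -np.inf
    for name, (mean, cov) in classes.items():
        diff = pixel - mean
        g = -np.log(np.linalg.det(cov)) - diff @ np.linalg.inv(cov) @ diff
        if g > best_g:
            best, best_g = name, g
    return best

# Hypothetical 2-band signatures; "grass" has a wide, high-variance signature.
classes = {
    "grass": (np.array([50.0, 50.0]), np.eye(2) * 400.0),
    "paved": (np.array([70.0, 30.0]), np.eye(2) * 4.0),
}
print(ml_classify(np.array([68.0, 32.0]), classes))  # 'paved'
print(ml_classify(np.array([60.0, 42.0]), classes))  # 'grass' - the wide class wins
```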

Results after running the Minimum Distance Classification. The urban areas and paved areas are well defined (gray and violet). The amount of grass classification has been greatly reduced, however, there is too much coniferous forest (brown) in the suburban areas.

Results of the Parallelepiped with MLC Tiebreaker Classification. Built-up urban areas and paved streets are well defined. The agricultural area in Richmond (south) is visible. Coniferous forests are limited to Stanley Park, the North Shore mountains, and Pacific Spirit Park. Deciduous trees are found on cutlines, ski slopes, and river valleys. This is the best classification of the three. Under normal circumstances I would return to the training phase and continue adjusting my training areas until the statistical analysis met the requirements, but for the purpose of this example I moved on to post-classification.
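The parallelepiped test checks whether a pixel falls inside each class's "box" (mean plus or minus a few standard deviations per band); when a pixel lands in more than one box, a tiebreaker decides. A sketch under stated assumptions: the signatures are invented, and the fallback here uses a simplified distance-in-standard-deviations rule as a stand-in for the full maximum likelihood tiebreaker:

```python
import numpy as np

def classify(pixel, classes, k=2.0):
    """Parallelepiped test first; fall back to a likelihood-style rule on ties."""
    # A pixel is "in" a class box if every band lies within mean +/- k*stddev.
    hits = [n for n, (m, s) in classes.items()
            if np.all(np.abs(pixel - m) <= k * s)]
    if len(hits) == 1:
        return hits[0]
    # Tie (or no box hit): choose the class nearest in stddev units,
    # a simplified stand-in for the maximum likelihood tiebreaker.
    candidates = hits or list(classes)
    return min(candidates,
               key=lambda n: np.sum(((pixel - classes[n][0]) / classes[n][1]) ** 2))

# Hypothetical per-class (mean, stddev) for two overlapping urban classes.
classes = {
    "paved": (np.array([70.0, 30.0]), np.array([3.0, 3.0])),
    "urban": (np.array([78.0, 38.0]), np.array([4.0, 4.0])),
}
print(classify(np.array([69.0, 29.0]), classes))  # 'paved' (only one box hit)
print(classify(np.array([74.0, 34.0]), classes))  # 'urban' (in both boxes; tiebreaker decides)
```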


Post Classification

Once happy with the classification, I continued on with post-classification. A 5x5 Mode Filter was applied, which takes the most frequently occurring value within the kernel and generalizes the classes to roughly 2 hectares. This gave the result a smoothing effect in preparation for exporting the classes as a polygon shapefile for use in other projects or software.
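A mode filter is straightforward to sketch: for each pixel, take the 5x5 neighbourhood and keep the most frequent class. This toy version (a stand-in for the Catalyst filter, on an invented classified map) shows how an isolated misclassified pixel gets smoothed away:

```python
import numpy as np

def mode_filter(classified, size=5):
    """Replace each pixel with the most frequent class in a size x size window."""
    pad = size // 2
    padded = np.pad(classified, pad, mode="edge")
    out = np.empty_like(classified)
    h, w = classified.shape
    for i in range(h):
        for j in range(w):
            window = padded[i:i + size, j:j + size].ravel()
            out[i, j] = np.bincount(window).argmax()
    return out

# Toy classified map: a lone "speckle" pixel of class 2 amid class 1.
cls = np.ones((7, 7), dtype=np.int64)
cls[3, 3] = 2
smoothed = mode_filter(cls, size=5)
print(smoothed[3, 3])   # 1 - the isolated pixel is smoothed away
```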

Using the AREAREPORT tool in Catalyst, I summed the area of each class. This gives a numeric value for how much land is dedicated to each use.
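The arithmetic behind such an area report is just pixel counts times pixel area. For Landsat ETM+ the pixels are 30 m x 30 m, i.e. 900 m² each; the tiny classified raster below is hypothetical:

```python
import numpy as np

# Toy classified raster; Landsat ETM+ pixels are 30 m x 30 m = 900 m^2.
classified = np.array([
    [1, 1, 2],
    [2, 1, 3],
])
PIXEL_AREA_M2 = 30 * 30

# Count pixels per class value, then convert to hectares (10,000 m^2 per ha).
counts = np.bincount(classified.ravel())
for cls, n in enumerate(counts):
    if n:
        print(f"class {cls}: {n * PIXEL_AREA_M2 / 10_000:.2f} ha")
# class 1: 0.27 ha
# class 2: 0.18 ha
# class 3: 0.09 ha
```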

The last step was to export the classes as polygons and assign attributes to them. Using the RAS2POLY tool, vectors are created that can then be used in popular GIS software such as ArcGIS Pro or QGIS.


In reflection, the process of creating a LULC thematic map involved several steps. First I defined a supervised classification scheme, then created the training sites. Next I evaluated those training sites by previewing the different algorithms and refined some of my training polygons. I then performed the classification and created my outputs in post-classification.

The overall output was not bad. Given more time, I would go back and refine my training sites by adding more, smaller sites. I would also reconsider my classes depending on the purpose of the LULC map. For example, if we were trying to focus on vegetation cover, I could concentrate on the training sites for urban agriculture, deciduous trees, and grass cover. This map worked well in its differentiation of roads and built-up urban areas, but could use more refinement in the North Shore Mountains, where exposed rock is present but has been classified as built-up urban environment.



Course content by Josh MacDougall

