The classification of the vegetation is based on the spectral images. Therefore, beforehand in ArcGIS selected training areas determine the spectral signatures as explained in this tutorial.
Using the GUI, or the main R-script will allow you to run the classification without any problems. If you do wish to adapt any detail within the script the following tutorials will help you with that. When running them separately, make sure you have the following packages installed:
- leaflet
- gwidgets
- gwidgetstcltk
- raster
- rgdal
- rLiDAR
You can install these packages with the command: install.package(packagename)
. Before getting started, you should also make sure that your working directory is set to the proper location. You can do this with the following command: setwd("D:/mainfoldername/subfolder/...").
In order to classify the vegetation in the study area, an areal photograph needs to be added and one training dataset has to be imported. Adding a second year and a LiDAR dataset of the same area is optional.
Firstly, the image is processed. It is aggregated to a lower resolution to speed up the computation time. If the aggregation resolution should be finer, this can be changed in process_raster.R. The original grid cells are 0.25 x 0.25 m.
Next the Normalized Differentiated Vegetation Index (NDVI) is calculated. This is done because the NIR band of the electromagnetic spectrum is very deterministic for plant varieties due to differences in the absorption spectrum. Therefore, calculating the NDVI, which is a function of the NIR and the red band, will improve the classification.
process_raster <- function(raster) {
if (raster@file@nbands==4)
{
raster <- aggregate(raster, 4, fun = mean)
raster@crs <- CRS("+init=epsg:28992")
names(raster) <- c("blue", "green", "red", "NIR")
ndvi <- overlay(raster$NIR, raster$red, fun=function(x,y){(x-y)/(x+y)})
raster <- addLayer(raster, ndvi)
names(raster)[length(names(raster))]<-"ndvi"
}
return(raster)
}
For more detailed classification LiDAR data can be included. If this is done, the following script will be included. The grid raster of the extent file is used for processing of the information of the point cloud data. In the AHN data some parts are determined as ground elevation. This part will not be used for calculation. For each of the cells the standard deviation, the amount of points, and maximum elevation of the points are calculated and added to the extent raster as bands. This information will be used in the classification.
addLIDAR<- function(raster,Lidar){
prj_string_RD <- CRS("+init=epsg:28992")
points_df <- SpatialPointsDataFrame(Lidar[,1:2],data=data.frame(Lidar[,3]),proj4string =prj_string_RD)
points_count<-rasterize(points_df,raster,fun="count")
points_max<-rasterize(points_df,raster,fun="max")
points_sd<-rasterize(points_df,raster,fun=function(x,...)sd(x))
points_count[is.na(points_count)] <- 0
points_max[is.na(points_max)] <- 0
points_sd[is.na(points_sd)] <- 0
points_count<-mask(points_count,raster[[1]])
points_max<-mask(points_max,raster[[1]])
points_sd<-mask(points_sd,raster[[1]])
raster_LIDAR<-brick(list(raster,points_count[[2]],points_max[[2]],points_sd[[2]]))
names(raster_LIDAR)[(length(names(raster_LIDAR))-2):length(names(raster_LIDAR))]<-c("LIDAR_counts","LIDAR_max","LIDAR_sd")
return(raster_LIDAR)
}
The training pixel as created earlier have distinguishable spectra due to their vegetation or land cover. Every single pixel has a specific signature determined by all used bands. These deterministic properties as assigned to a class are used to assign each a class to each of the pixels, as described here. The algorithm below does this assigning, using a random forest.
generate_RFModel <- function(covs, trainingPoly, nclasses) {
nbands<-nlayers(covs)
classes <- rasterize(trainingPoly, covs, field='code')
covmasked <- mask(covs, classes)
names(classes) <- "class"
trainingbrick <- addLayer(covmasked, classes)
valuetable <- getValues(trainingbrick)
valuetable <- na.omit(valuetable)
valuetable <- as.data.frame(valuetable)
valuetable$class <- factor(valuetable$class, levels = c(1:nclasses))
modelRF <- randomForest(x=valuetable[ ,1:nbands], y=valuetable$class,
importance = TRUE)
return(modelRF)
}
With information of the model of the random forest classification, a vegetation classification map of the entire study area is predicted and returned as the output from the following script.
train_exist <- function(image, RFmodel) {
names(image) <- c("blue", "green", "red", "NIR")
ndvi <- overlay(image$NIR, image$red, fun=function(x,y){(x-y)/(x+y)})
covs <- addLayer(image, ndvi)
predLC <- predict(covs, model=RFmodel, na.rm=TRUE)
return(predLC)
}
For avoid errors caused by shadows the maps are smoothed with a moving window. For example, on grassy areas which are not completely smooth spectral signatures of shadows in some pixel may be classified as shrubs or trees. To avoid this, we average the map according by removing those pixels which are surrounded by pixels of one other class. The file result is in this case the result of out classification that is to be filtered.
filter <- focal(result, w=matrix(1,3,3), fun=modal)
The classification should result in a vegetation map similar to the following one. In this case, you can see the difference between the image with and without filter.