PCA using raster datasets in R

空扰寡人 submitted on 2019-12-01 17:34:37
user2414840

Answer to my own question: I ended up doing something slightly different. Rather than using every raster cell as input (a very large dataset), I took a random sample of cells, ran the PCA on that sample, and saved the fitted model so that I could predict component values for every grid cell. It may not be the best solution, but it works:

library(raster)
rasters <- stack(myRasters)

sr <- sampleRandom(rasters, 5000) # sample 5000 random grid cells

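# run PCA on the random sample; scale. = TRUE standardizes each layer,
# which is equivalent to a PCA on the correlation matrix
# retx = FALSE means don't store the PCA scores in the model object
pca <- prcomp(sr, scale. = TRUE, retx = FALSE)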

# write PCA model to file
# (dput writes an R text representation, despite the .csv extension; reload it with dget)
dput(pca, file=paste("./climate/", name, "/", name, "_pca.csv", sep=""))

x <- predict(rasters, pca, index=1:6) # create new rasters based on PCA predictions
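The save-and-reload step can be checked without any raster data. This is a minimal base-R sketch, assuming a made-up matrix in place of the sampled cells and a temporary file in place of the real path (both hypothetical): it writes the model with dput, reads it back with dget, and confirms that both copies give the same predictions.

```r
set.seed(42)

# Hypothetical stand-in for sampleRandom() output: 5000 cells, 6 layers
sr <- matrix(rnorm(5000 * 6), ncol = 6,
             dimnames = list(NULL, paste0("layer", 1:6)))

# Fit the PCA without storing scores, as in the answer above
pca <- prcomp(sr, scale. = TRUE, retx = FALSE)

# Round trip through a text file (temporary path used for illustration)
model_file <- tempfile(fileext = ".R")
dput(pca, file = model_file)
pca_restored <- dget(model_file)

# Predict scores for a few cells with both copies of the model
new_cells <- sr[1:10, ]
scores_a <- predict(pca, new_cells)
scores_b <- predict(pca_restored, new_cells)

# dput stores about 15 significant digits, so compare with a tolerance
isTRUE(all.equal(scores_a, scores_b))
```

If the text format is not needed, saveRDS and readRDS are a common alternative for storing a fitted model.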

The RStoolbox package provides a rasterPCA function: http://bleutner.github.io/RStoolbox/rstbx-docu/rasterPCA.html

For example:

library('raster')
library('RStoolbox')
rasters <- stack(myRasters)

pca1 <- rasterPCA(rasters)
pca2 <- rasterPCA(rasters, nSamples = 5000)  # sample 5000 random grid cells
pca3 <- rasterPCA(rasters, norm = FALSE)  # without normalization

The above method is not working simply because prcomp does not know how to deal with a raster object. It only knows how to deal with vectors, and coercing to vector does not work, hence the error.

What you need to do is read each of your files into a vector, and put each of the rasters in a column of a matrix. Each row will then be a time series of values at a single spatial location, and each column will be all the pixels at a certain time step. Note that the exact spatial coordinates are not needed in this approach. This matrix serves as the input of prcomp.

Reading the files can be done using readGDAL (from the rgdal package), with as.data.frame casting the spatial data to a data.frame.
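The matrix layout described above can be sketched in base R. The grids here are made-up random matrices standing in for rasters read from file (an assumption for illustration, as are the grid size and number of time steps); the point is only the shape of the input that prcomp expects.

```r
set.seed(1)

# Hypothetical stand-ins for rasters: 8 time steps, each a 20 x 30 grid
grids <- lapply(1:8, function(t) matrix(rnorm(20 * 30), nrow = 20))

# Flatten each grid to a vector and bind the vectors as columns:
# rows = pixels (spatial locations), columns = time steps
m <- sapply(grids, as.vector)
dim(m)  # 600 rows (pixels), 8 columns (time steps)

# The matrix is now valid input for prcomp
pca <- prcomp(m, scale. = TRUE)

# Scores are one value per pixel, so they can be folded back onto the grid
pc1 <- matrix(pca$x[, 1], nrow = 20)
```

Because as.vector and matrix both work column by column, the folded scores line up with the original pixel positions.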

Here is a working solution:

library(raster)
filename <- system.file("external/rlogo.grd", package="raster")
r1 <- stack(filename)
# r1[] extracts all cell values as a matrix (one column per layer)
pca <- princomp(r1[], cor = TRUE)
res <- predict(pca, r1[])

Display result:

r2 <- raster(filename) 
r2[]<-res[,1]
plot(r2)

Yet another option would be to extract the values from the raster stack, i.e.:

rasters <- stack(my_rasters)
values <- getValues(rasters)
pca <- prcomp(values, scale = TRUE)

Here is another approach that expands on the getValues approach proposed by @Daniel. The result is a raster stack. The index (idx) references non-NA positions so that NA values are accounted for.

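library(raster)
r <- stack(system.file("external/rlogo.grd", package="raster"))
r.val <- getValues(r)
# cells (rows) with no NA in any layer; princomp cannot handle NA input
idx <- which(complete.cases(r.val))
pca <- princomp(r.val[idx, ], cor = TRUE)

ncomp <- 2 # first two principal components
r.pca <- r[[1:ncomp]]
for (i in 1:ncomp) {
  layer <- r[[i]]
  layer[] <- NA                  # start from an all-NA layer
  layer[idx] <- pca$scores[, i]  # fill in scores at the non-NA cells
  r.pca[[i]] <- layer
}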

plot(r.pca)
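The index bookkeeping can be checked without any raster files. This is a minimal base-R sketch, assuming a made-up matrix in place of getValues(r) with two cells masked by hand (hypothetical data): the PCA runs only on complete rows, and the scores are written back into a full-length matrix so that masked cells stay NA.

```r
set.seed(7)

# Hypothetical stand-in for getValues(r): 200 cells, 3 layers
values <- matrix(rnorm(200 * 3), ncol = 3)
values[c(5, 17), 2] <- NA   # pretend two cells are masked in one layer

# Keep only cells with a value in every layer
ok <- complete.cases(values)
pca <- princomp(values[ok, ], cor = TRUE)

# Write scores back at their original cell positions; masked cells stay NA
scores <- matrix(NA_real_, nrow = nrow(values), ncol = ncol(values))
scores[ok, ] <- pca$scores
```

Each column of scores has one entry per cell, in cell order, which is the form needed to assign it back to a raster layer.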