spatial filtering by proximity in R

↘锁芯ラ 提交于 2019-12-03 21:16:32

Following Josh O'Brien's advice, I looked at spatstat's rMaternI function, and came up with the following. It seems to work pretty well.

The distance is in map units. It would be nice to incorporate one of R's distance functions that always returns distances in meters, rather than input units, but I couldn't figure that out...

require(spatstat)
require(maptools)
occ <- readShapeSpatial('occurrence_example.shp')

filterByProximity <- function(occ, dist) {
    pts <- as.ppp.SpatialPoints(occ)
    d <- nndist(pts)
    z <- which(d > dist)
    return(occ[z,])
}

occ2 <- filterByProximity(occ,dist=0.2)
plot(occ)
plot(occ2,add=T,col='blue',pch=20)

I've written a new version of this function that no longer really follows rMaternII. The input can either be a SpatialPoints, SpatialPointsDataFrame or matrix object.

Seems to work well, but suggestions welcome!

filterByProximity <- function(xy, dist, mapUnits = F) {
    #xy can be either a SpatialPoints or SPDF object, or a matrix
    #dist is in km if mapUnits=F, in mapUnits otherwise
    if (!mapUnits) {
        d <- spDists(xy,longlat=T)
    }
    if (mapUnits) {
        d <- spDists(xy,longlat=F)
    }
    diag(d) <- NA
    close <- (d <= dist)
    diag(close) <- NA
    closePts <- which(close,arr.ind=T)
    discard <- matrix(nrow=2,ncol=2)
    if (nrow(closePts) > 0) {
            while (nrow(closePts) > 0) {
                if ((!paste(closePts[1,1],closePts[1,2],sep='_') %in% paste(discard[,1],discard[,2],sep='_')) & (!paste(closePts[1,2],closePts[1,1],sep='_') %in% paste(discard[,1],discard[,2],sep='_'))) {
                discard <- rbind(discard, closePts[1,])
                closePts <- closePts[-union(which(closePts[,1] == closePts[1,1]), which(closePts[,2] == closePts[1,1])),]
                }
            }
        discard <- discard[complete.cases(discard),]
        return(xy[-discard[,1],])
    }
    if (nrow(closePts) == 0) {
        return(xy)
    }
}

Let's test it:

require(rgeos)
require(sp)
pts <- readWKT("MULTIPOINT ((3.5 2), (1 1), (2 2), (4.5 3), (4.5 4.5), (5 5), (1 5))")

pts2 <- filterByProximity(pts,dist=2, mapUnits=T)

plot(pts)
axis(1)
axis(2)
apply(as.data.frame(pts),1,function(x) plot(gBuffer(SpatialPoints(coords=matrix(c(x[1],x[2]),nrow=1)),width=2),add=T))
plot(pts2,add=T,col='blue',pch=20,cex=2)

There is also an R package called spThin that performs spatial thinning on point data. It was developed for reducing the effects of sampling bias for species distribution models, and does multiple iterations for optimization. The function is quite easy to implement---the vignette can be found here. There is also a paper in Ecography with details about the technique.

Rather than removing data points, you might consider spatial declustering. This involves giving points in clusters a lower weight than outlying points. The two simplest ways to do this involve a polygonal segmentation, like a Voronoi diagram, or some arbitrary grid. Both methods will weight points in each region according to the area of the region.

For example, if we take the points in your test (1,1),(2,2),(4.5,4.5),(5,5),(1,5) and apply a regular 2-by-2 mesh, where each cell is three units on a side, then the five points fall into three cells. The points ((1,1),(2,2)) falling into the cell [0,3]X[0,3] would each have weights 1/( no. of points in current cell TIMES tot. no. of occupied cells ) = 1 / ( 2 * 3 ). The same thing goes for the points ((4.5,4.5),(5,5)) in the cell (3,6]X(3,6]. The "outlier", (1,5) would have a weight 1 / ( 1 * 3 ). The nice thing about this technique is that it is a quick way to generate a density based weighting scheme.

A polygonal segmentation involves drawing a polygon around each point and using the area of that polygon to calculate the weight. Generally, the polygons completely cover the entire region, and the weights are calculated as the inverse of the area of each polygon. A Voronoi diagram is usually used for this, but polygonal segmentations may be calculated using other techniques, or may be specified by hand.

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!