Assign polygon to data point in R dataframe

Question

I have two data frames:

points contains a series of points with x, y coordinates.
poly contains coordinates of two polygons (I have over 100 in reality, but keeping it simple here).

I want to add an extra column called Area to the points data frame, containing the name of the polygon each point falls in.

poly <- data.frame(
pol= c("P1", "P1","P1","P1","P1","P2","P2","P2","P2", "P2"),
x=c(4360, 7273, 7759, 4440, 4360, 8720,11959, 11440,8200, 8720),
y=c(1009, 9900,28559,28430,1009,9870,9740,28500,28040,9870))

points <- data.frame(
       object = c("P1", "P1","P1","P2","P2","P2"),
       timestamp= c(1485670023468,1485670023970, 1485670024565, 1485670025756,1485670045062, 1485670047366),
       x=c(6000, 6000, 6050, 10000, 10300, 8000),
       y=c(10000, 20000,2000,5000,20000,2000))

plot(poly$x, poly$y, type = 'l')
text(points$x, points$y, labels=points$object )

So in this example the first two rows should have Area = "P1", while the last point should be blank because it is not contained in any polygon.

I have tried using the function in.out but haven't been able to build my data frame as I described.

Any help is much appreciated!

Answer 1:

Although this uses a for loop, it is quite fast in practice.

library(mgcv)

x <- split(poly$x, poly$pol)
y <- split(poly$y, poly$pol)

todo <- seq_len(nrow(points))
Area <- rep.int("", nrow(points))
pol <- names(x)

# loop through polygons
for (i in seq_along(x)) {
  # stop early once every point has been allocated
  if (length(todo) == 0L) break
  # the vertices of i-th polygon
  bnd <- cbind(x[[i]], y[[i]])
  # points still to allocate
  xy <- with(points, cbind(x[todo], y[todo]))
  inbnd <- in.out(bnd, xy)
  # allocation
  Area[todo[inbnd]] <- pol[i]
  # update 'todo'
  todo <- todo[!inbnd]
}

points$Area <- Area

Two reasons for its efficiency:

The for loop runs over the polygons, not the points. With 100 polygons and 100000 points to allocate, the loop has only 100 iterations, not 100000, and each iteration exploits the vectorized C code behind in.out.
It works progressively. Once a point has been allocated, it is excluded from later iterations. The todo variable tracks the points still to be allocated, so the working set shrinks as the loop proceeds.
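To illustrate the vectorization point, here is a minimal sketch of a single in.out call that tests all six sample points against polygon P1 at once. The coordinates are copied from the question; the variable name inside is only for illustration.

```r
library(mgcv)

# Closed boundary of polygon P1 (first vertex repeated at the end)
bnd <- cbind(c(4360, 7273, 7759, 4440, 4360),
             c(1009, 9900, 28559, 28430, 1009))

# All six sample points as a two-column matrix
xy <- cbind(c(6000, 6000, 6050, 10000, 10300, 8000),
            c(10000, 20000, 2000, 5000, 20000, 2000))

# One call returns a logical vector with one entry per point
inside <- in.out(bnd, xy)

# Row indices of the points that fall inside P1
which(inside)
```

For the sample data this should flag only the first two points, which matches the result the question asks for.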

Answer 2:

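You could use the function point.in.polygon from package sp. Note that this checks each point only against the polygon named in its object column:

library(sp)

points$Area <- apply(points, 1, function(p) {
  own <- poly$pol == p[1]  # vertices of the polygon named in 'object'
  ifelse(point.in.polygon(p[3], p[4], poly$x[own], poly$y[own]), p[1], NA)
})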

gives you

  object   timestamp     x     y Area
1     P1 1.48567e+12  6000 10000   P1
2     P1 1.48567e+12  6000 20000   P1
3     P1 1.48567e+12  6050  2000 <NA>
4     P2 1.48567e+12 10000  5000 <NA>
5     P2 1.48567e+12 10300 20000   P2
6     P2 1.48567e+12  8000  2000 <NA>
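The one-liner above tests each point only against the polygon named in its object column. If the goal is to find whichever polygon contains a point, one possible variant (a sketch, not part of the original answer) loops over the polygons and passes all points to point.in.polygon in one call. It assumes the polygons do not overlap, so the first match wins; the sample data is copied from the question.

```r
library(sp)

poly <- data.frame(
  pol = c("P1", "P1", "P1", "P1", "P1", "P2", "P2", "P2", "P2", "P2"),
  x = c(4360, 7273, 7759, 4440, 4360, 8720, 11959, 11440, 8200, 8720),
  y = c(1009, 9900, 28559, 28430, 1009, 9870, 9740, 28500, 28040, 9870),
  stringsAsFactors = FALSE)

points <- data.frame(
  object = c("P1", "P1", "P1", "P2", "P2", "P2"),
  x = c(6000, 6000, 6050, 10000, 10300, 8000),
  y = c(10000, 20000, 2000, 5000, 20000, 2000),
  stringsAsFactors = FALSE)

# NA marks points that are not inside any polygon
Area <- rep(NA_character_, nrow(points))

for (nm in unique(poly$pol)) {
  v <- poly[poly$pol == nm, ]
  # point.in.polygon returns 0 for points strictly outside the polygon;
  # non-zero codes mean inside, on an edge, or on a vertex
  hit <- point.in.polygon(points$x, points$y, v$x, v$y) > 0
  # assumption: polygons do not overlap, so keep the first match only
  Area[hit & is.na(Area)] <- nm
}

points$Area <- Area
points
```

Unlike the apply version, this does not rely on the object column, so it also works when that column is missing or does not match the polygon names.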

Source: https://stackoverflow.com/questions/44020974/assign-polygon-to-data-point-in-r-dataframe

Tags

polygon

mgcv