Write an efficient loop to compare GPS coordinates

Posted by China☆狼群 on 2020-08-08 07:11:11

Question


I want to go through a dataframe of GPS coordinates and remove all coordinates that are too close to each other.

pick the first row
  calculate the distance between the selected row and the next row
  if the distance is < mindist and the current row is not the last row, continue to the next row
  else select the current row (leave it in the dataframe) and, if the selected row is not the last row,
   repeat from the beginning

The result should be a dataframe with GPS points that are at least mindist away from each other.
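
The steps above can be sketched as a single pass that remembers the last kept row. This is only a minimal sketch under some assumptions: the data frame has lon and lat columns in degrees, mindist is in metres, and the names haversine_m and thin_track are hypothetical (they do not come from the question). Distances use a base R haversine formula so the example has no dependencies; geodist::geodist_vec(..., paired = TRUE) could be swapped in. Each row is compared only with the most recently kept row, so consecutive kept points are at least mindist apart, but points from different parts of a track are never compared with each other. Unlike the pseudocode, this sketch does not force the last row to be kept.

```r
# Haversine great-circle distance in metres (hypothetical base R helper).
haversine_m <- function(lon1, lat1, lon2, lat2, radius = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Keep the first row, then keep each later row only if it lies at least
# `mindist` metres from the most recently kept row.
thin_track <- function(node_coords, mindist = 50) {
  n <- nrow(node_coords)
  if (n < 2) return(node_coords)
  keep <- logical(n)
  keep[1] <- TRUE
  last <- 1                      # index of the most recently kept row
  for (i in 2:n) {
    d <- haversine_m(node_coords$lon[last], node_coords$lat[last],
                     node_coords$lon[i], node_coords$lat[i])
    if (d >= mindist) {
      keep[i] <- TRUE
      last <- i
    }
  }
  node_coords[keep, , drop = FALSE]
}
```

A call such as thin_track(node_coords, mindist = 50) returns the thinned data frame. The loop makes one distance calculation per row, so the cost grows linearly with the number of rows.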

One approach was:

node_distances <- function(node_coords)
{
  n <- nrow(node_coords)
  from <- 1:(n - 1)
  to <- 2:n
  return(c(0, geodist::geodist_vec(node_coords[from, ]$lon, node_coords[from, ]$lat,
                                   node_coords[to, ]$lon, node_coords[to, ]$lat,
                                   paired = TRUE, measure = "geodesic")))
}
distances %>% filter(dist < mindist)

But this approach only compares each row with the next one, so it creates big gaps in the file.

I started writing nested loops, but this is a bad decision: it does not work and it is slow:

node_distances_hack <- function(node_coords)
{
  n <- nrow(node_coords)
  for(i in 1:n) {
    print(node_coords[i,])
    a<-i
    distance_c<-0
    mindist<-50
    while(distance_c<mindist || a >= n){
      distance_c<-geodist::geodist_vec(node_coords[i,]$lat,node_coords[i,]$lon,node_coords[a,]$lat,node_coords[a,]$lon, paired = TRUE, measure = "cheap")
      a<-a+1
      }
  }
}

What is the better approach? Thank you in advance, BR


Answer 1:


You can do this without a loop at all: geodist:::geodist_xy_vec returns a matrix of the distances between every pair of points. Consider this function:

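remove_close <- function(df, CLOSE = 10000)
{
  # Pairwise distance matrix (in metres) between all points
  dist_mat <- geodist:::geodist_xy_vec(df$lon, df$lat, df$lon, df$lat, "cheap")
  # Stop each point from matching itself
  diag(dist_mat) <- CLOSE + 1
  # Row/column indices of every pair closer than CLOSE
  clashes <- which(dist_mat < CLOSE, arr.ind = TRUE)
  # Sort each pair so (i, j) and (j, i) collapse, then drop the higher index
  duplicates <- unique(t(apply(clashes, 1, sort)))[, 2]
  df[-duplicates, ]
}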

library(ggplot2)

set.seed(69)

df <- data.frame(lat  = runif(1000, 51, 54),
                 lon = runif(1000, 8, 13))

ggplot(df, aes(lon, lat)) + geom_point()


ggplot(remove_close(df), aes(lon, lat)) + geom_point()

Created on 2020-07-22 by the reprex package (v0.3.0)
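
The remove_close function above drops the higher-indexed row of every close pair, so a point can be removed because of a neighbour that is itself removed. As a possible variant of the same pairwise-matrix idea, the sketch below keeps a point whenever it is at least mindist from every point already kept. It is not the answer's method: the names haversine_m and remove_close_greedy are hypothetical, the distance is a base R haversine formula rather than geodist, and the columns are assumed to be lon and lat in degrees with mindist in metres.

```r
# Haversine great-circle distance in metres (hypothetical base R helper).
haversine_m <- function(lon1, lat1, lon2, lat2, radius = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Keep row i only if it is at least `mindist` metres from every row
# that has already been kept.
remove_close_greedy <- function(df, mindist = 10000) {
  n <- nrow(df)
  if (n < 2) return(df)
  # n x n matrix of pairwise distances in metres
  idx <- seq_len(n)
  dist_mat <- outer(idx, idx, function(i, j)
    haversine_m(df$lon[i], df$lat[i], df$lon[j], df$lat[j]))
  keep <- logical(n)
  keep[1] <- TRUE
  for (i in 2:n) {
    # keep[i] is still FALSE here, so only earlier kept rows are checked
    keep[i] <- all(dist_mat[i, keep] >= mindist)
  }
  df[keep, , drop = FALSE]
}
```

With the df defined above, remove_close_greedy(df) can be plotted in the same way as remove_close(df). The full distance matrix needs memory proportional to the square of the number of rows, so this sketch suits a few thousand points rather than very large files.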




Answer 2:


Another option, using the df provided by @Allan Cameron, is fuzzyjoin. First, identify the locations that are close to each other, then remove them from the data frame. The example below uses a distance of 1 km.

library(dplyr)
library(fuzzyjoin)

df <- data.frame(latitude  = runif(1000, 51, 54),
             longitude = runif(1000, 8, 13))


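# Self-join: pair every point with all points within 1 km, then drop the self-matches
close <- df %>% fuzzyjoin::geo_left_join(df, max_dist = 1, unit = "km") %>%
  filter(!(longitude.x == longitude.y & latitude.x == latitude.y)) %>%
  rename(longitude = longitude.x, latitude = latitude.x) %>%
  select(longitude, latitude)


# Remove every point that has at least one neighbour within 1 km
df %>%
  anti_join(close, by = c("longitude", "latitude"))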


Source: https://stackoverflow.com/questions/63034689/write-an-efficient-loop-to-compare-gps-coordinates
