Calculating the distance between points in different data frames

心在旅途 2020-12-11 09:52

I am trying to find the distance between points in two different data frames given that they have the same value in one of their columns.

I figure the first step is

2 answers
  •  一个人的身影
    2020-12-11 10:34

    Without a reproducible example, all I can do is offer you a general solution.

    I like data.table and the syntax here will look very simple. Check out the Getting Started vignettes for more on the package.

    I'm going to create two data.tables that match your general description first:

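    library(data.table)
    set.seed(1734)  # make the random draws reproducible
    # A holds the x coordinate and B the y coordinate for each Name
    A <- data.table(Name = 1:10, x = rnorm(10), key = "Name")
    B <- data.table(Name = 1:10, y = rnorm(10), key = "Name")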
    

    Now we want to merge A and B by Name (a keyed merge needs a key set, which I've conveniently done already), then use the respective x and y coordinates to calculate the (Euclidean) distance, which here is the distance of the point (x, y) from the origin. Doing so is simple:

    A[B, distance := sqrt(x^2 + y^2)]
    

    The distance you seek is now stored in the data.table A under the column distance. If you don't want to store the distance and just want the output, use A[B, sqrt(x^2 + y^2)] instead.
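
    If each data frame instead stores both coordinates of a point (an assumption, since the question does not say), the same join can give the distance between the matching points. Below is a minimal sketch with hypothetical tables P and Q; inside the join, the i. prefix refers to the columns of the table passed in as the first argument (Q here).

    ```r
    library(data.table)
    set.seed(1734)

    # Hypothetical tables: each holds a full (x, y) coordinate per Name
    P <- data.table(Name = 1:10, x = rnorm(10), y = rnorm(10), key = "Name")
    Q <- data.table(Name = 1:10, x = rnorm(10), y = rnorm(10), key = "Name")

    # Join on the shared key; x and y come from P, i.x and i.y from Q
    P[Q, distance := sqrt((x - i.x)^2 + (y - i.y)^2)]
    ```

    The new column distance in P then holds, for each Name, the Euclidean distance between the point in P and the point in Q.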

    To start from scratch if A and B are already stored as data.frames, it's not much more complicated:

    setDT(A, key = "Name")[setDT(B, key = "Name"), distance := sqrt(x^2 + y^2)]
    

    Here the convenient setDT function converts A and B (in-line) to data.tables by reference, simultaneously declaring the key to be Name for both*.

    *It may not be strictly necessary to set the key of B, but I think it is good practice to do so. Also, at the time of writing, the key argument of setDT was available only in the development version of data.table (1.9.5+); with an older CRAN version, use setkey(setDT(A), Name), etc.
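
    The fallback mentioned in the footnote can be sketched as follows. The two data.frames are hypothetical stand-ins for your data, built the same way as the example tables above; only the conversion step differs.

    ```r
    library(data.table)
    set.seed(1734)

    # Hypothetical data.frames standing in for the real data
    A <- data.frame(Name = 1:10, x = rnorm(10))
    B <- data.frame(Name = 1:10, y = rnorm(10))

    # Convert by reference, then set the key in a separate call
    setkey(setDT(A), Name)
    setkey(setDT(B), Name)

    # Same keyed join as before
    A[B, distance := sqrt(x^2 + y^2)]
    ```

    After this runs, A and B are data.tables keyed by Name, and A carries the extra distance column.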
