Listing number of observations by location

Submitted by 我怕爱的太早我们不能终老 on 2019-12-02 11:58:19

This is base R and the code is untested, but you should get the idea.

I'm basically counting, for each restaurant, how many rows satisfy the circle inequality (x - xcentre)^2 + (y - ycentre)^2 <= R^2 centred on that restaurant, excluding the restaurant itself, and storing that count as the value in a new column. Note that the radius in my example is 200, but yours will be different: because your x, y are latitude and longitude, you will have to convert the radius of 200 metres to an angle by multiplying it by 2*pi radians / circumference of the earth, or by 360 degrees / circumference of the earth.
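
The radius conversion described above can be sketched as follows. This is only an illustration: the circumference value is an assumption (an approximate equatorial figure that is not in the original answer), and the earth is treated as a sphere.

```r
# Assumption: approximate equatorial circumference of the earth in metres
# (this value is not from the original answer).
earth_circumference_m <- 40075017

radius_m <- 200

# Angle subtended by 200 m along a great circle, in degrees and in radians
radius_deg <- radius_m * 360 / earth_circumference_m
radius_rad <- radius_m * 2 * pi / earth_circumference_m

radius_deg
radius_rad
```

Keep in mind that a degree of longitude covers less ground away from the equator (roughly by a factor of cos(latitude)), so using a single radius in degrees for both axes is only a rough approximation.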

df <- data.frame(
  latitude = runif(n=10,min=0,max=1000),
  longitude = runif(n=10,min=0,max=1000)
  )

for (i in seq_len(nrow(df)))
{
  # circle's centre
  xcentre <- df[i,'latitude']
  ycentre <- df[i,'longitude']

  # checking how many restaurants lie within 200 m of the above centre, noofcloserest column will contain this value
  df[i,'noofcloserest'] <- sum(
    (df[,'latitude'] - xcentre)^2 + 
      (df[,'longitude'] - ycentre)^2 
    <= 200^2
  ) - 1

  # logging part for deeper analysis
  cat(i,': ')
  # this prints the true/false vector for which row is within the radius, and which row isn't
  cat((df[,'latitude'] - xcentre)^2 + 
    (df[,'longitude'] - ycentre)^2 
  <= 200^2)

  cat('\n')

}

Output -

1 : TRUE FALSE FALSE TRUE FALSE TRUE FALSE FALSE FALSE FALSE
2 : FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
3 : FALSE FALSE TRUE FALSE TRUE FALSE FALSE FALSE FALSE FALSE
4 : TRUE FALSE FALSE TRUE FALSE TRUE FALSE FALSE FALSE FALSE
5 : FALSE FALSE TRUE FALSE TRUE FALSE FALSE TRUE FALSE FALSE
6 : TRUE FALSE FALSE TRUE FALSE TRUE FALSE FALSE FALSE FALSE
7 : FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE TRUE
8 : FALSE FALSE FALSE FALSE TRUE FALSE FALSE TRUE FALSE FALSE
9 : FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE
10 : FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE TRUE
> df
    latitude longitude noofcloserest
1  189.38878 270.25004             2
2  402.36853 879.26657             0
3  747.46417 581.66627             1
4  291.64303 157.75450             2
5  830.10699 736.19586             2
6  299.06803 157.76147             2
7  725.68360  58.53049             1
8  893.31904 772.46217             1
9   45.47875 701.82201             0
10 645.44772 226.95042             1

What that output means is that, for the coordinates at row 1, three rows are within 200 m: row 1 itself, and rows 4 and 6. Subtracting 1 for the row itself gives the 2 shown in noofcloserest.
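
The same count can probably be obtained without an explicit loop. The following is a minimal sketch, not part of the original answer, that uses base R's dist() to build the full matrix of pairwise Euclidean distances. The coordinates are copied from the printed data frame above so that the result can be compared with the noofcloserest column.

```r
# Coordinates copied from the printed data frame above
df <- data.frame(
  latitude  = c(189.38878, 402.36853, 747.46417, 291.64303, 830.10699,
                299.06803, 725.68360, 893.31904, 45.47875, 645.44772),
  longitude = c(270.25004, 879.26657, 581.66627, 157.75450, 736.19586,
                157.76147, 58.53049, 772.46217, 701.82201, 226.95042)
)

# Pairwise Euclidean distances as a full symmetric matrix (diagonal is 0)
d <- as.matrix(dist(df[, c("latitude", "longitude")]))

# Count the rows within the radius, then subtract 1 for the row itself
df$noofcloserest <- rowSums(d <= 200) - 1
df$noofcloserest
```

The distance matrix has one entry per pair of rows, so this trades memory for brevity; for a very large number of restaurants the loop may be the safer choice.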

One approach would be to compute the distance matrix, and then to figure out the ones that are sufficiently close (here I demonstrate being within 20 kilometers so the numbers aren't all 0):

# Load the fields library
library(fields)

# Create a simple data frame to demonstrate (each row is a restaurant). The rdist.earth function
# we're about to call takes as input something where the first column is longitude and the second
# column is latitude.
df = data.frame(longitude=c(-111.9269, -111.8983, -112.1863, -112.0739, -112.2766, -112.0692),
                latitude=c(33.46337, 33.62146, 33.65387, 33.44990, 33.56626, 33.48585))

# Let's compute the distance between each restaurant.
distances = rdist.earth(df, miles = FALSE)
distances

#          [,1]     [,2]     [,3]         [,4]     [,5]         [,6]
# [1,]  0.00000 17.79813 32.07533 1.373515e+01 34.41932 1.344867e+01
# [2,] 17.79813  0.00000 26.93558 2.510519e+01 35.61413 2.189270e+01
# [3,] 32.07533 26.93558  0.00000 2.498676e+01 12.85352 2.162964e+01
# [4,] 13.73515 25.10519 24.98676 1.344145e-04 22.84310 4.025824e+00
# [5,] 34.41932 35.61413 12.85352 2.284310e+01  0.00000 2.122719e+01
# [6,] 13.44867 21.89270 21.62964 4.025824e+00 21.22719 9.504539e-05

# Compute the number of restaurants within 20 kilometers of the restaurant in each row.
df$num.close = colSums(distances <= 20) - 1
df$num.close
# [1] 3 1 1 2 1 2
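
If installing the fields package is not an option, the same idea can be sketched in base R with the haversine formula for great-circle distance. This is an illustration rather than a replacement for rdist.earth: the helper name haversine_km and the mean earth radius of 6371 km are assumptions made here, and the distances may differ slightly from the matrix above because rdist.earth may assume a different earth radius.

```r
# Same example restaurants as above
df <- data.frame(
  longitude = c(-111.9269, -111.8983, -112.1863, -112.0739, -112.2766, -112.0692),
  latitude  = c(33.46337, 33.62146, 33.65387, 33.44990, 33.56626, 33.48585)
)

# Hypothetical helper: great-circle distance in kilometres (haversine formula).
# Assumption: a spherical earth with a mean radius of 6371 km.
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin guards against rounding pushing sqrt(a) slightly above 1
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Distance between every pair of rows
idx <- seq_len(nrow(df))
distances <- outer(idx, idx, function(i, j) {
  haversine_km(df$longitude[i], df$latitude[i],
               df$longitude[j], df$latitude[j])
})

# Number of other restaurants within 20 kilometres of each row
num_close <- rowSums(distances <= 20) - 1
num_close
```

For these six points none of the pairwise distances sits close to the 20 km threshold, so the counts should agree with the num.close values shown above.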