Question
I'm new to geospatial stats and can't figure out a simple question:
I have two datasets with spatial coordinates. One has coordinates of hospitals and clinics in a particular district. The other has coordinates of all households in that district.
Here's some mock data:
hospital_coord <- data.frame(longitude = c(80.15998, 72.89125, 77.65032, 77.60599),
                             latitude = c(12.90524, 19.08120, 12.97238, 12.90927))
people_coord <- data.frame(longitude = c(72.89537, 77.65094, 73.95325, 72.96746,
                                         77.65058, 77.66715, 77.64214, 77.58415,
                                         77.76180, 76.65470, 76.65480, 76.65490, 76.65500, 76.65560, 76.65560),
                           latitude = c(19.07726, 13.03902, 18.50330, 19.16764,
                                        12.90871, 13.01693, 13.00954, 12.92079,
                                        13.02212, 12.81447, 12.81457, 12.81467, 12.81477, 12.81487, 12.81497))
I would like to calculate the following:
- What percentage of households live more than 2 kilometres from the nearest clinic/hospital
- Create a column in the dataframe indicating which households are within or outside the 2km distance
Answer 1:
I think this does what you want, using the more recent sf package rather than geosphere from the linked question. The approach is as follows:
- Convert the latitude/longitude points into geometry objects using st_as_sf
- Set the coordinate reference system to a standard long/lat one, since the data is in long/lat (this is WGS84, EPSG:4326)
- Use st_distance to compute the distance between each person and each hospital as a units table, in metres
- Convert that units table into a regular tbl, because units objects are a pain to deal with, and check which pairs are separated by more than 2km
- Use mutate_at to check, for each row, whether each hospital is less than 2km away (T) or more than 2km away (F)
- Finally, use pmap and any to check each row and see if at least one hospital is within 2km!
It looks like only the first patient is within 2km of a hospital.
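As a quick, package-free sanity check on that claim, the distances can be approximated with the haversine formula in base R. This is only a sketch: haversine_m is a hypothetical helper (not part of sf or geosphere), and it assumes a spherical Earth of radius 6371 km, so its values will differ slightly from what st_distance returns.

```r
# Haversine great-circle distance in metres; inputs are in decimal degrees.
# Assumption: spherical Earth with radius 6,371,000 m (an approximation).
haversine_m <- function(lon1, lat1, lon2, lat2, radius = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Mock data from the question
hospital_lon <- c(80.15998, 72.89125, 77.65032, 77.60599)
hospital_lat <- c(12.90524, 19.08120, 12.97238, 12.90927)
people_lon <- c(72.89537, 77.65094, 73.95325, 72.96746, 77.65058,
                77.66715, 77.64214, 77.58415, 77.76180, 76.65470,
                76.65480, 76.65490, 76.65500, 76.65560, 76.65560)
people_lat <- c(19.07726, 13.03902, 18.50330, 19.16764, 12.90871,
                13.01693, 13.00954, 12.92079, 13.02212, 12.81447,
                12.81457, 12.81467, 12.81477, 12.81487, 12.81497)

# 15 x 4 matrix: one row per household, one column per hospital
dist_m <- outer(
  seq_along(people_lon), seq_along(hospital_lon),
  function(i, j) haversine_m(people_lon[i], people_lat[i],
                             hospital_lon[j], hospital_lat[j])
)

# Distance from each household to its nearest hospital
nearest_m <- apply(dist_m, 1, min)

# Row numbers of households with a hospital within 2 km
which(nearest_m <= 2000)
```

For the mock data this should pick out household 1 only, which is about 0.6km from the second hospital; the next closest household is roughly 2.7km from its nearest hospital.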
library(tidyverse)
library(sf)

hospital <- tibble(
  longitude = c(80.15998, 72.89125, 77.65032, 77.60599),
  latitude = c(12.90524, 19.08120, 12.97238, 12.90927)
)
people <- tibble(
  longitude = c(72.89537, 77.65094, 73.95325, 72.96746, 77.65058,
                77.66715, 77.64214, 77.58415, 77.76180, 76.65470,
                76.65480, 76.65490, 76.65500, 76.65560, 76.65560),
  latitude = c(19.07726, 13.03902, 18.50330, 19.16764, 12.90871,
               13.01693, 13.00954, 12.92079, 13.02212, 12.81447,
               12.81457, 12.81467, 12.81477, 12.81487, 12.81497)
)

# Turn the coordinate columns into point geometries in WGS84 (EPSG:4326)
hospital_sf <- hospital %>%
  st_as_sf(coords = c("longitude", "latitude")) %>%
  st_set_crs(4326)
people_sf <- people %>%
  st_as_sf(coords = c("longitude", "latitude")) %>%
  st_set_crs(4326)

# Rows are people, columns V1..V4 are hospitals; distances are in metres.
# (Recent tibble versions may warn about the auto-generated column names.)
distances <- st_distance(people_sf, hospital_sf) %>%
  as_tibble() %>%
  mutate_at(vars(V1:V4), as.numeric) %>%
  # TRUE when this hospital is less than 2km from this person
  mutate_at(vars(V1:V4), function(x) x < 2000) %>%
  # TRUE when at least one hospital is within 2km
  mutate(within_2km = pmap_lgl(., function(V1, V2, V3, V4) any(V1, V2, V3, V4)))

# A tibble: 15 x 5
   V1    V2    V3    V4    within_2km
   <lgl> <lgl> <lgl> <lgl> <lgl>
 1 F     T     F     F     T
 2 F     F     F     F     F
 3 F     F     F     F     F
 4 F     F     F     F     F
 5 F     F     F     F     F
 6 F     F     F     F     F
 7 F     F     F     F     F
 8 F     F     F     F     F
 9 F     F     F     F     F
10 F     F     F     F     F
11 F     F     F     F     F
12 F     F     F     F     F
13 F     F     F     F     F
14 F     F     F     F     F
15 F     F     F     F     F
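The table flags each person/hospital pair, but the question also asks for a column on the household data and a percentage. One possible way to get both is sketched below, under some assumptions: it rebuilds the mock data as plain data frames so it runs on its own, takes the row-wise minimum of the st_distance matrix, and uses variable names (dist_m, nearest_m, pct_outside) that are illustrative only. Households exactly 2km away are counted as within range here, which is a choice rather than something the question specifies.

```r
library(sf)

# Mock data from the question
hospital <- data.frame(
  longitude = c(80.15998, 72.89125, 77.65032, 77.60599),
  latitude = c(12.90524, 19.08120, 12.97238, 12.90927)
)
people <- data.frame(
  longitude = c(72.89537, 77.65094, 73.95325, 72.96746, 77.65058,
                77.66715, 77.64214, 77.58415, 77.76180, 76.65470,
                76.65480, 76.65490, 76.65500, 76.65560, 76.65560),
  latitude = c(19.07726, 13.03902, 18.50330, 19.16764, 12.90871,
               13.01693, 13.00954, 12.92079, 13.02212, 12.81447,
               12.81457, 12.81467, 12.81477, 12.81487, 12.81497)
)

# Point geometries in WGS84 (EPSG:4326)
hospital_sf <- st_as_sf(hospital, coords = c("longitude", "latitude"), crs = 4326)
people_sf <- st_as_sf(people, coords = c("longitude", "latitude"), crs = 4326)

# Distance matrix in metres: rows are households, columns are hospitals.
# as.numeric() drops the units class (and the dims), so rebuild the matrix.
d <- st_distance(people_sf, hospital_sf)
dist_m <- matrix(as.numeric(d), nrow = nrow(d))

# Distance to the nearest hospital, and the within/outside flag
people$nearest_m <- apply(dist_m, 1, min)
people$within_2km <- people$nearest_m <= 2000

# Percentage of households more than 2km from the nearest hospital
pct_outside <- 100 * mean(!people$within_2km)
pct_outside
```

With the mock data, 14 of the 15 households are more than 2km from any hospital, so pct_outside should come out at about 93.3.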
Source: https://stackoverflow.com/questions/48681930/calculating-the-number-of-people-who-live-within-or-outside-a-certain-distance-f