I'm looking to do the equivalent of the ArcPy Generate Near Table using Geopandas / Shapely. I'm very new to Geopandas and Shapely and have developed a methodology that wo
I will use two sample datasets in geopandas with different dimensions to demonstrate.
import geopandas as gpd
# note: gpd.datasets was deprecated in geopandas 0.14 and removed in 1.0
# read geodata for the five NYC boroughs
gdf_nyc = gpd.read_file(gpd.datasets.get_path('nybb'))
# read geodata for international cities
gdf_cities = gpd.read_file(gpd.datasets.get_path('naturalearth_cities'))
# reproject to EPSG:3857 (Web Mercator), whose units are meters
gdf_nyc.to_crs(epsg=3857, inplace=True)
gdf_cities.to_crs(epsg=3857, inplace=True)
We can simply apply a lambda function to the GeoSeries. For example, to get the minimal distance between each NYC borough (polygon) and its nearest international city (point), we can do the following:
gdf_nyc.geometry.apply(lambda x: gdf_cities.distance(x).min())
This will give us:
0 384422.953323
1 416185.725507
2 412520.308816
3 419511.323677
4 440292.945096
Name: geometry, dtype: float64
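The lambda above returns only the distance, while a near table usually also records which feature was nearest. Here is a minimal sketch of that idea, assuming small synthetic stand-ins for the two layers (the names polys, pts and near_table are hypothetical and not part of the original example):

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Point, box

# Hypothetical stand-ins for the boroughs (polygons) and cities (points),
# already in a projected CRS whose units are meters.
polys = gpd.GeoDataFrame(
    {"name": ["A", "B"]},
    geometry=[box(0, 0, 10, 10), box(100, 0, 110, 10)],
    crs="EPSG:3857",
)
pts = gpd.GeoDataFrame(
    {"city": ["p0", "p1", "p2"]},
    geometry=[Point(20, 5), Point(95, 5), Point(50, 50)],
    crs="EPSG:3857",
)

rows = []
for in_idx, geom in polys.geometry.items():
    d = pts.distance(geom)      # distance from this polygon to every point
    near_idx = d.idxmin()       # index label of the closest point
    rows.append(
        {"in_idx": in_idx, "near_idx": near_idx, "near_dist": d.loc[near_idx]}
    )

# One row per input feature: its index, the nearest feature, the distance.
near_table = pd.DataFrame(rows)
print(near_table)
```

Swapping polys and pts gives the table in the other direction, the same way the two lambda calls in this answer mirror each other.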
Similarly, to get the minimal distance between each international city and its nearest NYC borough, we can do the following:
gdf_cities.geometry.apply(lambda x: gdf_nyc.distance(x).min())
This will give us:
0 9.592104e+06
1 9.601345e+06
2 9.316354e+06
3 8.996945e+06
4 2.614927e+07
...
197 1.177410e+07
198 2.377188e+07
199 8.559704e+06
200 8.902146e+06
201 2.034579e+07
Name: geometry, Length: 202, dtype: float64
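As an alternative to the per-row lambda, geopandas 0.10 and later provide gpd.sjoin_nearest, which uses a spatial index and can be noticeably faster on larger layers. This is a different technique from the apply approach shown above, sketched here on hypothetical synthetic data (polys, pts and the near_dist column name are illustrative assumptions):

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical layers in a projected CRS (units: meters).
polys = gpd.GeoDataFrame(
    {"name": ["A", "B"]},
    geometry=[box(0, 0, 10, 10), box(100, 0, 110, 10)],
    crs="EPSG:3857",
)
pts = gpd.GeoDataFrame(
    {"city": ["p0", "p1", "p2"]},
    geometry=[Point(20, 5), Point(95, 5), Point(50, 50)],
    crs="EPSG:3857",
)

# For every point, attach the attributes of its nearest polygon and
# store the distance in a new column.
joined = gpd.sjoin_nearest(pts, polys, how="left", distance_col="near_dist")
joined = joined.sort_index()
print(joined[["city", "name", "near_dist"]])
```

If two features are equally near, sjoin_nearest returns one row per tie, so the result can have more rows than the left layer.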
Notes:
1. The example reprojects to epsg:3857, so the distances are in meters. If you use a geographic (lon/lat based) CRS, the result will be in degrees. Convert your projection first, before anything else such as getting the centroids of your polygons.
2. The .distance() method also makes sense when you want the distance between, let's say, a point and a line. In other words, the .distance() method can calculate the distance between any two geo-objects.
3. If you have more than one geometry column in a GeoDataFrame, make sure to apply the lambda function to the desired GeoSeries and also call the .distance() method from the desired GeoSeries. In the example, I called the method from the GeoDataFrame directly because both of them only have one GeoSeries column.
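The notes above can be checked on toy data. The following is a rough sketch under the assumption of hand-made geometries (lonlat, line, gdf and target are hypothetical names), showing the unit change after reprojection, a point-to-line distance, and the effect of choosing one geometry column over another:

```python
import geopandas as gpd
from shapely.geometry import LineString, Point, box

# CRS and units: one degree of longitude along the equator.
lonlat = gpd.GeoSeries([Point(0, 0), Point(1, 0)], crs="EPSG:4326")
deg = lonlat.distance(lonlat.iloc[1]).iloc[0]   # in degrees (geopandas warns)
merc = lonlat.to_crs(epsg=3857)
meters = merc.distance(merc.iloc[1]).iloc[0]    # in meters, roughly 111 km

# .distance() works between any two geometries, e.g. a point and a line.
line = LineString([(0, 0), (2, 0)])
d_line = Point(1, 3).distance(line)             # shortest distance to the line

# Several geometry columns: call .distance() on the column you mean.
gdf = gpd.GeoDataFrame(geometry=[box(0, 0, 2, 2)], crs="EPSG:3857")
gdf["centroid"] = gdf.centroid
target = Point(5, 1)
d_poly = gdf["geometry"].distance(target).iloc[0]   # from the polygon edge
d_cent = gdf["centroid"].distance(target).iloc[0]   # from the centroid

print(deg, meters, d_line, d_poly, d_cent)
```

The polygon and centroid distances differ, which is why it matters which GeoSeries the method is called on.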