How to find the closest match based on 2 keys from one dataframe to another?

一向 2020-12-17 00:05

I have 2 dataframes I'm working with. One has a bunch of locations and coordinates (longitude, latitude). The other is a weather data set with data from weather stations al

2 Answers
  •  無奈伤痛
    2020-12-17 00:38

    Let's say you have a distance function dist that you want to minimize:

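    import numpy as np

    def dist(lat1, long1, lat2, long2):
        # Sum the absolute differences so opposite-signed offsets cannot cancel
        return np.abs(lat1 - lat2) + np.abs(long1 - long2)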
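
The `dist` function above compares raw degrees, which is usually good enough for ranking nearby stations but ignores that a degree of longitude shrinks away from the equator. If real distances matter, one possible replacement is the haversine (great-circle) formula. The sketch below is not part of the original answer: the name `haversine_km` and the constant `EARTH_RADIUS_KM` are assumptions made for illustration, and the Earth is treated as a perfect sphere.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius (spherical approximation)

def haversine_km(lat1, long1, lat2, long2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(long2 - long1)
    # Haversine term; clamp to 1.0 to guard against floating-point overshoot
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    a = min(1.0, a)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
```

It takes the same four arguments as `dist`, so it should be usable as a drop-in replacement in the `apply` calls that follow.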
    

    For a given location, you can find the nearest station as follows:

    lat = 39.463744
    long = -76.119411
    weather.apply(
        lambda row: dist(lat, long, row['Latitude'], row['Longitude']), 
        axis=1)
    

    This will calculate the distance to all weather stations. Using idxmin you can find the closest station name:

    distances = weather.apply(
        lambda row: dist(lat, long, row['Latitude'], row['Longitude']), 
        axis=1)
    weather.loc[distances.idxmin(), 'StationName']
    

    Let's put all this in a function:

    def find_station(lat, long):
        distances = weather.apply(
            lambda row: dist(lat, long, row['Latitude'], row['Longitude']), 
            axis=1)
        return weather.loc[distances.idxmin(), 'StationName']
    

    You can now get all the nearest stations by applying it to the locations dataframe:

    locations.apply(
        lambda row: find_station(row['Latitude'], row['Longitude']), 
        axis=1)
    

    Output:

    0         WALTHAM
    1         WALTHAM
    2    PORTST.LUCIE
    3         WALTHAM
    4    PORTST.LUCIE
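
The nested `apply` calls above run a Python-level loop over every location and every station, which can get slow on large frames. A vectorized alternative is sketched below, assuming the same column names (`Latitude`, `Longitude`, `StationName`). It uses NumPy broadcasting with a haversine distance rather than the `dist` function from the answer; the function name `nearest_station_names` and the sample data are made up for illustration.

```python
import numpy as np
import pandas as pd

def nearest_station_names(locations, weather):
    """Return a Series holding the nearest StationName for each row of `locations`."""
    # Column vectors for locations, row vectors for stations, so that
    # broadcasting produces an (n_locations, n_stations) grid
    loc_lat = np.radians(locations['Latitude'].to_numpy())[:, None]
    loc_lon = np.radians(locations['Longitude'].to_numpy())[:, None]
    st_lat = np.radians(weather['Latitude'].to_numpy())[None, :]
    st_lon = np.radians(weather['Longitude'].to_numpy())[None, :]

    # Haversine distance for every (location, station) pair, in km
    a = (np.sin((st_lat - loc_lat) / 2) ** 2
         + np.cos(loc_lat) * np.cos(st_lat) * np.sin((st_lon - loc_lon) / 2) ** 2)
    dist_km = 2 * 6371.0 * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))

    # Position of the closest station for each location
    nearest = dist_km.argmin(axis=1)
    names = weather['StationName'].to_numpy()[nearest]
    return pd.Series(names, index=locations.index)

# Hypothetical sample data, not taken from the question
weather = pd.DataFrame({
    'StationName': ['NORTH', 'SOUTH'],
    'Latitude': [42.0, 27.0],
    'Longitude': [-71.0, -80.0],
})
locations = pd.DataFrame({
    'Latitude': [39.463744, 28.5],
    'Longitude': [-76.119411, -81.4],
})

# Store the match as a new column so it can be used as a merge key later
locations['StationName'] = nearest_station_names(locations, weather)
```

The distance grid holds one value per location and station pair, so memory grows with the product of the two frame sizes. For very large inputs a spatial index would likely be a better fit.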
    
