Fastest way to merge pandas dataframe on ranges

小蘑菇 2020-12-09 12:25

I have a dataframe A

    ip_address
0   13
1   5
2   20
3   11
.. ........

and another dataframe B



        
3 Answers
  •  感动是毒
    2020-12-09 12:54

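    IntervalIndex is available as of pandas 0.20.0, and the solution by @JohnGalt that uses it is excellent.

    Prior to that version, the following solution works: it expands each country's range into one row per IP address, then joins on the address. This is only practical when the ranges are small, since the expanded frame holds one row for every address covered.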

    import pandas as pd

    # Expand each [lowerbound, upperbound] range in dfb into one row per address.
    df_ip = pd.concat([pd.DataFrame(
        {'ip_address': range(row['lowerbound_ip_address'], row['upperbound_ip_address'] + 1),
         'country': row['country']})
        for _, row in dfb.iterrows()]).set_index('ip_address')
    >>> dfa.set_index('ip_address').join(df_ip)
                  country
    ip_address           
    13              China
    5           Australia
    20              China
    11              China
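
For comparison, here is a minimal sketch of the IntervalIndex approach mentioned at the top of this answer. It is not @JohnGalt's exact code. The contents of dfb are hypothetical, since dataframe B is not shown in full in the question; the bounds below were chosen only so that the result matches the output above. The sketch also assumes the ranges do not overlap, which `get_indexer` requires.

```python
import pandas as pd

# Hypothetical data: dataframe B is not shown in full in the question,
# so these bounds and countries are made up for illustration.
dfa = pd.DataFrame({'ip_address': [13, 5, 20, 11]})
dfb = pd.DataFrame({
    'lowerbound_ip_address': [0, 11],
    'upperbound_ip_address': [10, 20],
    'country': ['Australia', 'China'],
})

# One closed interval [lower, upper] per row of dfb.
intervals = pd.IntervalIndex.from_arrays(
    dfb['lowerbound_ip_address'], dfb['upperbound_ip_address'], closed='both')

# For each address, the position of the interval containing it,
# or -1 when no interval matches.
pos = intervals.get_indexer(dfa['ip_address'].to_numpy())

# Map positions back to countries; unmatched addresses become NaN.
country = pd.Series(dfb['country'].to_numpy()[pos], index=dfa.index)
dfa['country'] = country.where(pos >= 0)

print(dfa)
```

Unlike the expansion above, this never materializes one row per address, so memory use depends on the number of ranges rather than their width.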
    
