How to match multiple columns in pandas DataFrame for an “interval”?

南旧 2021-01-02 10:23

I have the following pandas DataFrame:

import pandas as pd
df = pd.read_csv('filename.csv')
print(df)

order    start    end    value    
1        1342            


        
1 answer
  • 2021-01-02 11:18

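    You can use merge followed by boolean indexing. This scales poorly for large DataFrames, though, because the merge on order builds every pair of df and key_df rows that share an order value before the filter is applied: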

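    # Pair every row of df with every key_df row that has the same order
    df1 = pd.merge(df, key_df, on='order', how='outer', suffixes=('','_key'))
    # Keep pairs where start <= start_key and end <= end_key
    df1 = df1[(df1.start <= df1.start_key) & (df1.end <= df1.end_key)]
    print(df1)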
        order  start   end      value  start_key  end_key   value_key
    3       1   1342  1357  category1     1345.0   1392.0  category29
    4       1   1342  1357  category1     1371.0   1383.0  category31
    5       1   1342  1357  category1     1471.0   1501.0  category31
    11      1   1459  1489  category7     1471.0   1501.0  category31
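
Because the question's sample data is cut off, here is a minimal self-contained sketch of the merge-then-filter step. The df rows and the three key intervals are copied from the printed output above; key_df as a whole is an assumption, since the real one is not shown.

```python
import pandas as pd

# Hypothetical sample frames: values taken from the printed output,
# the real key_df is not visible in the question.
df = pd.DataFrame({
    "order": [1, 1, 1, 15],
    "start": [1342, 1459, 1572, 792],
    "end":   [1357, 1489, 1601, 813],
    "value": ["category1", "category7", "category23", "category13"],
})
key_df = pd.DataFrame({
    "order": [1, 1, 1],
    "start": [1345, 1371, 1471],
    "end":   [1392, 1383, 1501],
    "value": ["category29", "category31", "category31"],
})

# Every df row is paired with every key_df row of the same order;
# orders present in only one frame get NaN in the other frame's columns.
merged = pd.merge(df, key_df, on="order", how="outer", suffixes=("", "_key"))

# Comparisons against NaN are False, so rows without a key drop out here.
mask = (merged["start"] <= merged["start_key"]) & (merged["end"] <= merged["end_key"])
matched = merged[mask]
print(matched)
```

On this toy input the merge produces 10 rows and the filter keeps 4 of them, which illustrates why the intermediate frame can become much larger than the final result.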
    

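    EDIT (after a comment), keeping the rows of df that have no match:

    df1 = pd.merge(df, key_df, on='order', how='outer', suffixes=('','_key'))
    df1 = df1[(df1.start <= df1.start_key) & (df1.end <= df1.end_key)]
    # Left-merge the matches back onto df; unmatched rows get NaN in the key columns
    df1 = pd.merge(df, df1, on=['order','start','end', 'value'], how='left')
    print(df1)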
       order  start   end       value  start_key  end_key   value_key
    0      1   1342  1357   category1     1345.0   1392.0  category29
    1      1   1342  1357   category1     1371.0   1383.0  category31
    2      1   1342  1357   category1     1471.0   1501.0  category31
    3      1   1459  1489   category7     1471.0   1501.0  category31
    4      1   1572  1601  category23        NaN      NaN         NaN
    5      1   1587  1599   category2        NaN      NaN         NaN
    6      1   1591  1639   category1        NaN      NaN         NaN
    7     15    792   813  category13        NaN      NaN         NaN
    8     15    892   913   category5        NaN      NaN         NaN
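
The scaling concern comes from the merge holding all same-order pairs in memory at once. One possible workaround, which is not part of the original answer, is to apply the same filter one order group at a time so that only one group's pairs exist at any moment. The helper name match_per_order and the sample frames below are hypothetical, and the approach helps only when no single order dominates the data.

```python
import pandas as pd

def match_per_order(df, key_df):
    """Apply the start/end filter one `order` group at a time (hypothetical helper)."""
    pieces = []
    for order, left in df.groupby("order", sort=False):
        right = key_df[key_df["order"] == order]
        if right.empty:
            continue  # no key intervals for this order
        # Pairs are built for this group only
        pairs = pd.merge(left, right, on="order", suffixes=("", "_key"))
        keep = (pairs["start"] <= pairs["start_key"]) & (pairs["end"] <= pairs["end_key"])
        if keep.any():
            pieces.append(pairs[keep])
    if not pieces:
        # Nothing matched anywhere: return df with empty key columns
        return df.assign(start_key=float("nan"), end_key=float("nan"), value_key=None)
    matched = pd.concat(pieces, ignore_index=True)
    # Left-merge back so rows of df without a match are kept with NaN keys
    return pd.merge(df, matched, on=["order", "start", "end", "value"], how="left")

# Hypothetical sample frames (values borrowed from the printed output)
df = pd.DataFrame({
    "order": [1, 1, 1, 15],
    "start": [1342, 1459, 1572, 792],
    "end":   [1357, 1489, 1601, 813],
    "value": ["category1", "category7", "category23", "category13"],
})
key_df = pd.DataFrame({
    "order": [1, 1, 1],
    "start": [1345, 1371, 1471],
    "end":   [1392, 1383, 1501],
    "value": ["category29", "category31", "category31"],
})

result = match_per_order(df, key_df)
print(result)
```

Like the EDIT above, the final left merge assumes that the combination of order, start, end and value identifies a row of df; duplicated rows would be multiplied.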
    