Check if column value is in other columns in pandas

终归单人心 2020-12-17 18:01

I have the following DataFrame in pandas:

  target   A       B      C
0 cat      bridge  cat    brush
1 brush    dog     cat    shoe
2 bridge   cat     shoe   bridge


        
5 answers
  • 2020-12-17 18:16

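    You can apply a function to each row that counts how many values in the row equal the value in the 'target' column. Because the count includes the 'target' cell itself, a count greater than 1 means another column matches:

    df["exist"] = df.apply(lambda row: row.value_counts()[row["target"]] > 1, axis=1)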
    

    For a DataFrame that looks like:

       b  c target
    0  3  a      a
    1  3  4      2
    2  3  4      2
    3  3  4      2
    4  3  4      4
    

    The output will be:

       b  c target  exist
    0  3  a      a   True
    1  3  4      2  False
    2  3  4      2  False
    3  3  4      2  False
    4  3  4      4   True
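A minimal sketch of the same row-wise count applied to the question's data. It assumes the empty cell in row 2 of column C is "bridge", as the outputs in the later answers suggest; the variable name `exist` is only illustrative.

```python
import pandas as pd

# The question's DataFrame; row 2 of column C is assumed to be "bridge".
df = pd.DataFrame({
    "target": ["cat", "brush", "bridge"],
    "A": ["bridge", "dog", "cat"],
    "B": ["cat", "cat", "shoe"],
    "C": ["brush", "shoe", "bridge"],
})

# Count occurrences of each value in the row, then look up the target's count.
# The target cell counts itself once, so > 1 means some other column matches.
exist = df.apply(lambda row: row.value_counts()[row["target"]] > 1, axis=1)
print(exist.tolist())
```

Since `apply` with `axis=1` runs a Python function per row, this is likely to be slower on large frames than the vectorized `eq` approaches in the other answers.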
    
  • 2020-12-17 18:21

    One-hot encoding approach, using pd.get_dummies:

    In [165]: x = pd.get_dummies(df.drop(columns='target'), prefix='', prefix_sep='')
    
    In [166]: x
    Out[166]:
       bridge  cat  dog  cat  shoe  bridge  brush  shoe
    0       1    0    0    1     0       0      1     0
    1       0    0    1    1     0       0      0     1
    2       0    1    0    0     1       1      0     0
    
    In [167]: x[df['target']].eq(1).any(axis=1)
    Out[167]:
    0    True
    1    True
    2    True
    dtype: bool
    

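    Explanation: x[df['target']] selects every column whose name appears anywhere in the 'target' column, so the any check is not restricted to each row's own target. Row 1 is reported as True because of the cat column, even though its target, brush, does not occur in that row: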

    In [168]: x[df['target']]
    Out[168]:
       cat  cat  brush  bridge  bridge
    0    0    1      1       1       0
    1    0    1      0       0       0
    2    1    0      0       0       1
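To make the difference between this column-wise lookup and a per-row comparison concrete, here is a small sketch that computes both on the question's data. It assumes row 2 of column C is "bridge"; the names `others`, `column_wise`, and `row_wise` are illustrative only.

```python
import pandas as pd

# The question's DataFrame; row 2 of column C is assumed to be "bridge".
df = pd.DataFrame({
    "target": ["cat", "brush", "bridge"],
    "A": ["bridge", "dog", "cat"],
    "B": ["cat", "cat", "shoe"],
    "C": ["brush", "shoe", "bridge"],
})

others = df.drop(columns="target")
x = pd.get_dummies(others, prefix="", prefix_sep="")

# Selects all dummy columns named after ANY target value, for every row.
column_wise = x[df["target"].tolist()].eq(1).any(axis=1)

# Compares each row only with its own target value.
row_wise = others.eq(df["target"], axis=0).any(axis=1)

print(column_wise.tolist())  # [True, True, True]
print(row_wise.tolist())     # [True, False, True]
```

The two results differ for row 1, so the one-hot lookup as written answers "does this row contain any value that is a target somewhere?" rather than "does this row contain its own target?".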
    
  • 2020-12-17 18:23

    Another approach, using the Index.difference method to select every column except 'target':

    matches = df[df.columns.difference(['target'])].eq(df['target'], axis = 0)
    
    #       A      B      C
    #0  False   True  False
    #1  False  False  False
    #2  False  False   True
    
    # Check if at least one match:
    matches.any(axis = 1)
    
    #Out[30]: 
    #0     True
    #1    False
    #2     True
    

    In case you want to see which columns match the target, here is a possible solution:

    # Index each row's column labels with its own boolean mask.
    matches.apply(lambda x: ", ".join(x.index[x.to_numpy()]), axis = 1)
    
    Out[53]: 
    0    B
    1     
    2    C
    dtype: object
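The steps of this answer can be put together as one runnable sketch. It assumes the question's data with row 2 of column C set to "bridge"; `has_match` and `matched_columns` are illustrative names.

```python
import pandas as pd

# The question's DataFrame; row 2 of column C is assumed to be "bridge".
df = pd.DataFrame({
    "target": ["cat", "brush", "bridge"],
    "A": ["bridge", "dog", "cat"],
    "B": ["cat", "cat", "shoe"],
    "C": ["brush", "shoe", "bridge"],
})

# Every column except "target", compared row by row with the target value.
matches = df[df.columns.difference(["target"])].eq(df["target"], axis=0)

# True where at least one column equals the row's target.
has_match = matches.any(axis=1)

# Comma-separated names of the matching columns (empty string if none).
matched_columns = matches.apply(
    lambda row: ", ".join(row.index[row.to_numpy()]), axis=1
)

print(has_match.tolist())        # [True, False, True]
print(matched_columns.tolist())  # ['B', '', 'C']
```

Note that `Index.difference` returns the remaining labels in sorted order, which may differ from the original column order in other frames.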
    
  • 2020-12-17 18:26

    You can use drop, isin and any.

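    • drop the target column to get a DataFrame with only the A, B, C columns
    • use isin with the target column; when given a Series, DataFrame.isin aligns on the index, so each row is compared with its own target value
    • use any to check whether each row has at least one hit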

    That's it.

    df["exists"] = df.drop(columns="target").isin(df["target"]).any(axis=1)
    print(df)
    
        target  A       B       C       exists
    0   cat     bridge  cat     brush   True
    1   brush   dog     cat     shoe    False
    2   bridge  cat     shoe    bridge  True
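A short sketch of how the argument type changes what `isin` checks: a Series is aligned on the index (row by row), while a plain list is treated as a set of allowed values for every cell. It assumes the question's data with row 2 of column C set to "bridge"; `aligned` and `anywhere` are illustrative names.

```python
import pandas as pd

# The question's DataFrame; row 2 of column C is assumed to be "bridge".
df = pd.DataFrame({
    "target": ["cat", "brush", "bridge"],
    "A": ["bridge", "dog", "cat"],
    "B": ["cat", "cat", "shoe"],
    "C": ["brush", "shoe", "bridge"],
})

others = df.drop(columns="target")

# Series argument: each row is compared with the target at the same index.
aligned = others.isin(df["target"]).any(axis=1)

# List argument: a cell matches if it equals any target value at all.
anywhere = others.isin(df["target"].tolist()).any(axis=1)

print(aligned.tolist())   # [True, False, True]
print(anywhere.tolist())  # [True, True, True]
```

For the question as asked (does a row contain its own target?), the Series form is the one that gives the per-row answer.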
    
  • 2020-12-17 18:30

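    You can use eq to compare row by row, with pop to extract and remove the 'target' column. Note that pop modifies df in place, so each snippet below assumes it starts from the original df: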

    mask = df.eq(df.pop('target'), axis=0)
    print (mask)
           A      B      C
    0  False   True  False
    1  False  False  False
    2  False  False   True
    

    Then, to check whether each row has at least one True, add any:

    mask = df.eq(df.pop('target'), axis=0).any(axis=1)
    print (mask)
    0     True
    1    False
    2     True
    dtype: bool
    
    df['new'] = df.eq(df.pop('target'), axis=0).any(axis=1)
    print (df)
            A     B       C    new
    0  bridge   cat   brush   True
    1     dog   cat    shoe  False
    2     cat  shoe  bridge   True
    

    But if you need to check against all values in the 'target' column, rather than only the value in the same row, use isin with a list:

    mask = df.isin(df.pop('target').values.tolist())
    print (mask)
           A      B      C
    0   True   True   True
    1  False   True  False
    2   True  False   True
    

    And to check whether all values in a row are True, add all:

    df['new'] = df.isin(df.pop('target').values.tolist()).all(axis=1)
    print (df)
            A     B       C    new
    0  bridge   cat   brush   True
    1     dog   cat    shoe  False
    2     cat  shoe  bridge  False
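Because `pop` removes the column from the frame it is called on, the snippets above cannot be run one after another on the same `df`. One way around that, sketched here under the assumption that row 2 of column C is "bridge", is to pop from a copy; the names `work`, `any_match`, and `all_in_targets` are illustrative.

```python
import pandas as pd

# The question's DataFrame; row 2 of column C is assumed to be "bridge".
df = pd.DataFrame({
    "target": ["cat", "brush", "bridge"],
    "A": ["bridge", "dog", "cat"],
    "B": ["cat", "cat", "shoe"],
    "C": ["brush", "shoe", "bridge"],
})

# Work on a copy so the original df keeps its "target" column.
work = df.copy()
target = work.pop("target")

# Row-wise: does any of A, B, C equal this row's target?
any_match = work.eq(target, axis=0).any(axis=1)

# Against all targets: is every cell in the row one of the target values?
all_in_targets = work.isin(target.tolist()).all(axis=1)

print(any_match.tolist())       # [True, False, True]
print(all_in_targets.tolist())  # [True, False, False]
```

Keeping the popped Series in a variable also lets both checks reuse it without touching `df` again.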
    