Error: The truth value of a Series is ambiguous - Python pandas

时光取名叫无心 2020-12-06 17:58

I know this question has been asked before. However, when I try to use an if statement, I get this error. I looked at this link, but it did not help.

3 answers
  • 2020-12-06 18:18

    Here is a small demo that shows why this is happening:

    In [131]: df = pd.DataFrame(np.random.randint(0,20,(5,2)), columns=list('AB'))
    
    In [132]: df
    Out[132]:
        A   B
    0   3  11
    1   0  16
    2  16   1
    3   2  11
    4  18  15
    
    In [133]: res = df['A'] > 10
    
    In [134]: res
    Out[134]:
    0    False
    1    False
    2     True
    3    False
    4     True
    Name: A, dtype: bool
    

    When we try to check whether such a Series is True, pandas doesn't know what to do, because the Series holds several boolean values rather than one:

    In [135]: if res:
         ...:     print(df)
         ...:
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    ...
    skipped
    ...
    ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
    

    Workarounds:

    We have to decide how to reduce the Series of boolean values to a single value. For example, the condition can be True only if all values are True:

    In [136]: res.all()
    Out[136]: False
    

    Or it can be True when at least one value is True:

    In [137]: res.any()
    Out[137]: True
    
    In [138]: if res.any():
         ...:     print(df)
         ...:
        A   B
    0   3  11
    1   0  16
    2  16   1
    3   2  11
    4  18  15
    
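The reductions named in the error message can be sketched side by side. This is a minimal illustration, not part of the original answer: the data is hard-coded to mirror the demo above (the original used random values), and the variable names are made up for the example.

```python
import pandas as pd

# Hypothetical fixed data, mirroring the random demo frame above.
df = pd.DataFrame({"A": [3, 0, 16, 2, 18], "B": [11, 16, 1, 11, 15]})
res = df["A"] > 10                # boolean Series, one value per row

any_above = bool(res.any())       # at least one row matches
all_above = bool(res.all())       # every row matches
is_empty = res.empty              # the Series has no rows at all

# A one-element Series can be unwrapped into a plain scalar with .item().
single = (df["A"].iloc[[2]] > 10).item()

# Using the whole Series as a condition raises ValueError.
try:
    bool(res)                     # equivalent to `if res:`
    raised = False
except ValueError:
    raised = True

print(any_above, all_above, is_empty, single, raised)
```

Which reduction is right depends on what the condition is supposed to mean, so it is worth choosing any() or all() deliberately rather than picking whichever silences the error.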
  • 2020-12-06 18:22

    The comparison returns a boolean Series rather than a single value, so you need to reduce it to one boolean with any() or all(). For example:

         if (df[col] == ' this is any string or list').any():
             return df.loc[df[col] == temp].index.values.astype(int)[0]
    
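The fragment above can be turned into a self-contained function. This is only a sketch under assumptions: the helper name first_match_index, the column name, and the sample data are hypothetical and do not come from the original answer.

```python
import pandas as pd

def first_match_index(df, col, value):
    """Return the index label of the first row where df[col] == value, else None."""
    mask = df[col] == value              # boolean Series, one entry per row
    if mask.any():                       # reduce to a single bool before `if`
        return df.index[mask.to_numpy()][0]
    return None

# Hypothetical sample data for illustration.
df = pd.DataFrame({"name": ["a", "b", "c", "b"]})
found = first_match_index(df, "name", "b")
missing = first_match_index(df, "name", "z")
print(found, missing)
```

Checking mask.any() first avoids an IndexError on the [0] lookup when nothing matches.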
  • 2020-12-06 18:27

    Currently, you're comparing the entire Series at once. To test a single value from the Series, use something along the lines of:

    for i in dfs:
        if i['var1'].iloc[0] < 3.000:
            print(i)
    

    To compare each element individually, you can iterate with Series.items() (named Series.iteritems() in older pandas; that name was removed in pandas 2.0), like so:

    for i in dfs:
        for _, v in i['var1'].items():
            if v < 3.000:
                print(v)
    

    In most cases, the better solution is to select a subset of the DataFrame with a boolean mask and work with that, like so:

    for i in dfs:
        subset = i[i['var1'] < 3.000]
        # do something with the subset
    

    On large DataFrames, vectorized Series operations are much faster than iterating over individual values. For more detail, see the pandas documentation on selection.
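The subset approach can be sketched end to end as follows. The list dfs and its contents are hypothetical stand-ins for the asker's data, which is not shown in the thread.

```python
import pandas as pd

# Hypothetical list of DataFrames standing in for `dfs`.
dfs = [
    pd.DataFrame({"var1": [1.5, 3.2, 2.9]}),
    pd.DataFrame({"var1": [4.0, 5.1]}),
]

subsets = []
for frame in dfs:
    mask = frame["var1"] < 3.0        # boolean Series, evaluated for all rows at once
    subset = frame[mask]              # keep only the rows where the mask is True
    if not subset.empty:              # .empty is an unambiguous single bool
        subsets.append(subset)

kept_values = [s["var1"].tolist() for s in subsets]
print(kept_values)
```

The mask is never used directly as an if condition here. Only subset.empty is, and that is a plain boolean, so the ambiguity error cannot occur.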
