Ignoring NaNs with str.contains

花落未央 2020-11-27 11:23

I want to find rows that contain a string, like so:

DF[DF.col.str.contains("foo")]

However, this fails because some elements are NaN.

6 Answers
  •  悲&欢浪女
    2020-11-27 11:46

    There's a flag for that:

    In [11]: df = pd.DataFrame([["foo1"], ["foo2"], ["bar"], [np.nan]], columns=['a'])
    
    In [12]: df.a.str.contains("foo")
    Out[12]:
    0     True
    1     True
    2    False
    3      NaN
    Name: a, dtype: object
    
    In [13]: df.a.str.contains("foo", na=False)
    Out[13]:
    0     True
    1     True
    2    False
    3    False
    Name: a, dtype: bool
    

    See the str.contains docs:

    na : default NaN, fill value for missing values.


    So you can do the following:

    In [21]: df.loc[df.a.str.contains("foo", na=False)]
    Out[21]:
          a
    0  foo1
    1  foo2
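
For reference, the same approach as a standalone script rather than an IPython session. This is only a minimal sketch: the names `mask`, `matches`, and `non_matches` are illustrative and not from the answer above. It also shows one consequence worth keeping in mind: because `na=False` counts missing values as non-matches, negating the mask with `~` keeps the NaN rows.

```python
import numpy as np
import pandas as pd

# Same sample data as in the answer: three strings and one missing value.
df = pd.DataFrame({"a": ["foo1", "foo2", "bar", np.nan]})

# na=False treats missing values as non-matches, so the mask is purely boolean
# and can be used directly for indexing.
mask = df["a"].str.contains("foo", na=False)

# Rows whose value contains "foo".
matches = df[mask]

# Rows that do not contain "foo"; the NaN row is included here,
# because it was counted as a non-match.
non_matches = df[~mask]

print(matches)
print(non_matches)
```

If the NaN rows should be excluded from both results, one option is to call `df.dropna(subset=["a"])` before filtering.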
    
