Why does testing `NaN == NaN` not work for dropping rows from a pandas DataFrame?

风格不统一 posted on 2019-12-05 05:16:00

You should use isnull and notnull to test for NaN (they handle pandas dtypes more robustly than the numpy equivalents); see "values considered missing" in the docs.
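As a rough illustration of the difference, the sketch below rebuilds a frame like the one in this answer (the construction of `df` is an assumption, since the original only shows its printed form) and compares an `==` test against `isnull`/`notnull`:

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the example frame: four strings and two NaNs.
df = pd.DataFrame({"comments": ["VP", "VP", "VP", "TEST", np.nan, np.nan]})

# Comparing with == is False on every row, including the NaN rows,
# so it cannot be used to select or drop missing values.
eq_mask = df["comments"] == np.nan

# isnull flags the missing entries; notnull is its complement.
null_mask = df["comments"].isnull()
kept = df[df["comments"].notnull()]

print(eq_mask.any())
print(null_mask.tolist())
print(kept)
```

Filtering with the `notnull` mask should give the same rows as the `dropna` calls shown in this answer.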

Calling the Series method dropna on a column won't modify the original DataFrame, but it returns what you want:

In [11]: df
Out[11]:
  comments
0       VP
1       VP
2       VP
3     TEST
4      NaN
5      NaN

In [12]: df.comments.dropna()
Out[12]:
0      VP
1      VP
2      VP
3    TEST
Name: comments, dtype: object

The dropna DataFrame method has a subset argument (to drop rows which have NaNs in specific columns):

In [13]: df.dropna(subset=['comments'])
Out[13]:
  comments
0       VP
1       VP
2       VP
3     TEST

In [14]: df = df.dropna(subset=['comments'])

You need to test for NaN with the math.isnan() function (or numpy.isnan). NaN cannot be checked with the equality operator, because it never compares equal to anything, including itself.

>>> from math import isnan
>>> a = float('NaN')
>>> a
nan
>>> a == 'NaN'
False
>>> isnan(a)
True
>>> a == float('NaN')
False

Output of help(isnan):

isnan(...)
    isnan(x) -> bool

    Check if float x is not a number (NaN).
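One practical wrinkle is that math.isnan raises a TypeError for non-numeric input such as strings, so mixed data needs a guard. The following is only a sketch: `is_nan` is a hypothetical helper name, and the sample list is made up for illustration.

```python
import math

def is_nan(x):
    """Hypothetical helper: True only for a float NaN, False for anything else."""
    # The isinstance guard avoids the TypeError math.isnan raises on strings.
    return isinstance(x, float) and math.isnan(x)

nan = float("nan")

# NaN never compares equal, not even to itself.
print(nan == nan)

# Made-up mixed data: drop only the NaN entry.
values = ["VP", "TEST", nan, 1.5, None]
cleaned = [v for v in values if not is_nan(v)]
print(cleaned)
```

Note that this helper treats None as not NaN, whereas the pandas isnull function counts both None and NaN as missing.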