Find empty or NaN entry in Pandas Dataframe

不思量自难忘° 2020-12-05 04:17

I am trying to search through a Pandas Dataframe to find where it has a missing entry or a NaN entry.

Here is a dataframe that I am working with:

cl_         


        
8 Answers
  • 2020-12-05 05:07

    Check whether the columns contain NaN using .isnull(), check for empty strings using .eq(''), then combine the two results with the bitwise OR operator |.

    Sum along axis 0 to find the columns with missing data, then sum along axis 1 to find the index locations of the rows with missing data.

    missing_cols, missing_rows = (
        (df2.isnull().sum(x) | df2.eq('').sum(x))
        .loc[lambda x: x.gt(0)].index
        for x in (0, 1)
    )
    
    >>> df2.loc[missing_rows, missing_cols]
             A2       A3
    2            1.10035
    5 -0.508501         
    6       NaN      NaN
    7       NaN      NaN
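
The same idea can also be written with a single boolean mask and .any() instead of sums. The sketch below is only an illustration: it reuses the name df2, but the data and the column names A1 to A3 are made up and are not the asker's actual DataFrame.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; not the DataFrame from the question.
df2 = pd.DataFrame(
    {
        "A1": ["a", "b", "c", "d"],
        "A2": ["x", "", "y", np.nan],
        "A3": [1.0, 2.0, np.nan, np.nan],
    }
)

# True wherever a cell is NaN/None or an empty string.
mask = df2.isnull() | df2.eq("")

# Columns and rows that contain at least one missing cell.
missing_cols = mask.columns[mask.any(axis=0)]
missing_rows = mask.index[mask.any(axis=1)]

print(df2.loc[missing_rows, missing_cols])
```

Keeping the mask around may be convenient if the individual cell positions are needed later, since it has the same shape as the original frame.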
    
  • 2020-12-05 05:08

    np.where(pd.isnull(df)) returns the row and column indices where the value is NaN:

    In [152]: import numpy as np
    In [153]: import pandas as pd
    In [154]: np.where(pd.isnull(df))
    Out[154]: (array([2, 5, 6, 6, 7, 7]), array([7, 7, 6, 7, 6, 7]))
    
    In [155]: df.iloc[2,7]
    Out[155]: nan
    
    In [160]: [df.iloc[i,j] for i,j in zip(*np.where(pd.isnull(df)))]
    Out[160]: [nan, nan, nan, nan, nan, nan]
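
If row and column labels are more useful than positions, one option is to index df.index and df.columns with the arrays that np.where returns. A minimal sketch, assuming a small made-up frame (the labels r0 to r2 and the columns a and b are hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; not the DataFrame from the question.
df = pd.DataFrame(
    {"a": [1.0, np.nan, 3.0], "b": ["x", "y", None]},
    index=["r0", "r1", "r2"],
)

# Positional (row, column) indices of the missing cells.
rows, cols = np.where(pd.isnull(df))

# Translate positions into (row label, column label) pairs.
missing = list(zip(df.index[rows], df.columns[cols]))
print(missing)
```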
    

    Empty strings can be located with applymap:

    In [182]: np.where(df.applymap(lambda x: x == ''))
    Out[182]: (array([5]), array([7]))
    

    Note that applymap calls a Python function once for every cell of the DataFrame, which can be slow for a large DataFrame. It would be better to arrange for all blank cells to contain NaN instead, so that pd.isnull alone finds them. (In pandas 2.1 and later, applymap is deprecated in favor of DataFrame.map.)
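
One way to arrange that, sketched here on made-up data, is to replace empty strings with NaN before searching. The frame and its column names are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; not the DataFrame from the question.
df = pd.DataFrame({"a": ["x", "", "z"], "b": [1.0, np.nan, 3.0]})

# Turn empty strings into NaN so one vectorized isnull() covers both cases.
cleaned = df.replace("", np.nan)

# Positional (row, column) indices of every missing cell.
rows, cols = np.where(pd.isnull(cleaned))
print(list(zip(rows.tolist(), cols.tolist())))
```

This only catches strings that are exactly empty; cells holding whitespace would need separate handling.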
