Pandas: drop columns with all NaN's

こ雲淡風輕ζ 提交于 2019-12-30 04:37:08

问题


I realize that dropping NaNs from a dataframe is as easy as df.dropna but for some reason that isn't working on mine and I'm not sure why.

Here is my original dataframe:

fish_frame1:                       0   1   2         3   4       5   6          7
0               #0915-8 NaN NaN       NaN NaN     NaN NaN        NaN
1                   NaN NaN NaN  LIVE WGT NaN  AMOUNT NaN      TOTAL
2               GBW COD NaN NaN     2,280 NaN   $0.60 NaN  $1,368.00
3               POLLOCK NaN NaN     1,611 NaN   $0.01 NaN     $16.11
4                 WHAKE NaN NaN       441 NaN   $0.70 NaN    $308.70
5           GBE HADDOCK NaN NaN     2,788 NaN   $0.01 NaN     $27.88
6           GBW HADDOCK NaN NaN    16,667 NaN   $0.01 NaN    $166.67
7               REDFISH NaN NaN       932 NaN   $0.01 NaN      $9.32
8    GB WINTER FLOUNDER NaN NaN       145 NaN   $0.25 NaN     $36.25
9   GOM WINTER FLOUNDER NaN NaN    25,070 NaN   $0.35 NaN  $8,774.50
10        GB YELLOWTAIL NaN NaN        26 NaN   $1.75 NaN     $45.50

The code that follows is an attempt to drop all NaNs as well as any columns with more than 3 NaNs (either one, or both, should work I think):

fish_frame.dropna()
fish_frame.dropna(thresh=len(fish_frame) - 3, axis=1)

This produces:

fish_frame1 after dropna:                       0   1   2         3   4       5   6          7
0               #0915-8 NaN NaN       NaN NaN     NaN NaN        NaN
1                   NaN NaN NaN  LIVE WGT NaN  AMOUNT NaN      TOTAL
2               GBW COD NaN NaN     2,280 NaN   $0.60 NaN  $1,368.00
3               POLLOCK NaN NaN     1,611 NaN   $0.01 NaN     $16.11
4                 WHAKE NaN NaN       441 NaN   $0.70 NaN    $308.70
5           GBE HADDOCK NaN NaN     2,788 NaN   $0.01 NaN     $27.88
6           GBW HADDOCK NaN NaN    16,667 NaN   $0.01 NaN    $166.67
7               REDFISH NaN NaN       932 NaN   $0.01 NaN      $9.32
8    GB WINTER FLOUNDER NaN NaN       145 NaN   $0.25 NaN     $36.25
9   GOM WINTER FLOUNDER NaN NaN    25,070 NaN   $0.35 NaN  $8,774.50
10        GB YELLOWTAIL NaN NaN        26 NaN   $1.75 NaN     $45.50

I am a novice with Pandas so I'm not sure if this isn't working because I'm doing something wrong or I'm misunderstanding something or misusing a command. Any help is appreciated thanks.


回答1:


From the dropna docstring:

    # drop the columns where all elements are NaN:

    >>> df.dropna(axis=1, how='all')
         A    B  D
    0  NaN  2.0  0
    1  3.0  4.0  1
    2  NaN  NaN  5



回答2:


dropna() drops the null values and returns a dataFrame. Assign it back to the original dataFrame.

fish_frame = fish_frame.dropna(axis = 1, how = 'all')

Referring to your code:

fish_frame.dropna(thresh=len(fish_frame) - 3, axis=1)

This would drop columns with 7 or more NaN's (assuming len(df) = 10), if you want to drop columns with more than 3 Nan's like you've mentioned, thresh should be equal to 3.




回答3:


dropna() by default returns a dataframe (defaults to inplace=False behavior) and thus needs to be assigned to a new dataframe for it to stay in your code.

So for example,

fish_frame = fish_frame.dropna()

As to why your dropna is returning an empty dataframe, I'd recommend you look at the "how" argument in the dropna method (https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.dropna.html). Also bear in mind, axis=0 corresponds to columns, and axis=1 corresponds to rows.

So to remove columns with all "NAs", axis=0, how="any" should do the trick:

fish_frame = fish_frame.dropna(axis=0, how="any")

Finally, the "thresh" argument designates explicitly how many NA's are necessary for a drop to occur. So

fish_frame = fish_frame.dropna(axis=0, thresh=3, how="any") 

should work fine and dandy to remove any column with three NA's.

Also, as Corley pointed out, how="any" is the default and is thus not necessary.



来源:https://stackoverflow.com/questions/45147100/pandas-drop-columns-with-all-nans

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!