How to fill NaT and NaN values separately

℡╲_俬逩灬. 提交于 2020-08-08 04:32:31

问题


My dataframe contains both NaT and NaN values

    Date/Time_entry      Entry      Date/Time_exit       Exit   
0   2015-11-11 10:52:00  19.9900    2015-11-11 11:30:00  20.350 
1   2015-11-11 11:36:00  20.4300    2015-11-11 11:38:00  20.565 
2   2015-11-11 11:44:00  21.0000    NaT                  NaN        
3   2009-04-20 10:28:00  13.7788    2009-04-20 10:46:00  13.700

I want to fill NaT with dates and NaN with numbers. Fillna(4) method replaces both NaT and NaN with 4. Is it possible to differentiate between NaT and NaN somehow?

My current workaround is to df[column].fillna()


回答1:


Since NaTs pertain to datetime columns, you can exclude them when applying your filling operation.

u = df.select_dtypes(exclude=['datetime'])
df[u.columns] = u.fillna(4)
df

      Date/Time_entry    Entry      Date/Time_exit    Exit
0 2015-11-11 10:52:00  19.9900 2015-11-11 11:30:00  20.350
1 2015-11-11 11:36:00  20.4300 2015-11-11 11:38:00  20.565
2 2015-11-11 11:44:00  21.0000                 NaT   4.000
3 2009-04-20 10:28:00  13.7788 2009-04-20 10:46:00  13.700

Similarly, to fill NaT values only, change "exclude" to "include" in the code above.

u = df.select_dtypes(include=['datetime'])
df[u.columns] = u.fillna(pd.to_datetime('today'))
df

      Date/Time_entry    Entry             Date/Time_exit    Exit
0 2015-11-11 10:52:00  19.9900 2015-11-11 11:30:00.000000  20.350
1 2015-11-11 11:36:00  20.4300 2015-11-11 11:38:00.000000  20.565
2 2015-11-11 11:44:00  21.0000 2019-02-17 16:11:09.407466   4.000
3 2009-04-20 10:28:00  13.7788 2009-04-20 10:46:00.000000  13.700



回答2:


Try something like this, using pandas.DataFrame.select_dtypes:

>>> import pandas as pd, datetime, numpy as np
>>> df = pd.DataFrame({'a': [datetime.datetime.now(), np.nan], 'b': [5, np.nan], 'c': [1, 2]})
>>> df
                           a    b  c
0 2019-02-17 18:06:15.231557  5.0  1
1                        NaT  NaN  2
>>> fill_dt = datetime.datetime.now()
>>> fill_value = 4
>>> dt_filled_df = df.select_dtypes('datetime').fillna(fill_dt)
>>> dt_filled_df
                           a
0 2019-02-17 18:06:15.231557
1 2019-02-17 18:06:36.040404
>>> value_filled_df = df.select_dtypes('int').fillna(fill_value)
>>> value_filled_df
   c
0  1
1  2
>>> dt_filled_df.columns = [col + '_notnull' for col in dt_filled_df]
>>> value_filled_df.columns = [col + '_notnull' for col in value_filled_df]
>>> df = df.join(value_filled_df)
>>> df = df.join(dt_filled_df)
>>> df
                           a    b  c  c_notnull                  a_notnull
0 2019-02-17 18:06:15.231557  5.0  1          1 2019-02-17 18:06:15.231557
1                        NaT  NaN  2          2 2019-02-17 18:06:36.040404


来源:https://stackoverflow.com/questions/54738858/how-to-fill-nat-and-nan-values-separately

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!