Python Pandas replace NaN in one column with value from corresponding row of second column

Submitted by 无人久伴 on 2019-12-18 10:09:01

Question


I am working with this Pandas DataFrame in Python 2.7.

File    heat    Farheit Temp_Rating
   1    YesQ    75      N/A
   1    NoR     115     N/A
   1    YesA    63      N/A
   1    NoT     83      41
   1    NoY     100     80
   1    YesZ    56      12
   2    YesQ    111     N/A
   2    NoR     60      N/A
   2    YesA    19      N/A
   2    NoT     106     77
   2    NoY     45      21
   2    YesZ    40      54
   3    YesQ    84      N/A
   3    NoR     67      N/A
   3    YesA    94      N/A
   3    NoT     68      39
   3    NoY     63      46
   3    YesZ    34      81

I need to replace all NaNs in the Temp_Rating column with the value from the Farheit column.

This is what I need:

File        heat    Observation
   1        YesQ    75
   1        NoR     115
   1        YesA    63
   1        NoT     41
   1        NoY     80
   1        YesZ    12
   2        YesQ    111
   2        NoR     60
   2        YesA    19
   2        NoT     77
   2        NoY     21
   2        YesZ    54
   3        YesQ    84
   3        NoR     67
   3        YesA    94
   3        NoT     39
   3        NoY     46
   3        YesZ    81

With a Boolean selection I can pick out only one of these columns at a time. The problem is that when I then try to join the two selections, I cannot preserve the original row order.

How can I find only the Temp_Rating rows that contain NaN and replace each one with the value in the same row of the Farheit column?


Answer 1:


Assuming your DataFrame is in df:

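# Fill each NaN in Temp_Rating with the Farheit value from the same row.
df['Temp_Rating'] = df['Temp_Rating'].fillna(df['Farheit'])
# Drop the source column, then rename the remaining three columns.
del df['Farheit']
df.columns = 'File heat Observations'.split()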

This first replaces every NaN in Temp_Rating with the value from the same row of df.Farheit (fillna aligns the two columns on the index), then deletes the 'Farheit' column, and finally renames the remaining columns to File, heat, and Observations.
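
As a minimal, self-contained sketch of this approach, the snippet below rebuilds the first six rows of the question's table and runs the three steps on it. Building the frame by hand is an assumption, since the question does not show how df was created. It assigns the result of fillna back to the column.

```python
import numpy as np
import pandas as pd

# First six rows of the question's table; "N/A" is represented as np.nan.
df = pd.DataFrame({
    "File": [1, 1, 1, 1, 1, 1],
    "heat": ["YesQ", "NoR", "YesA", "NoT", "NoY", "YesZ"],
    "Farheit": [75, 115, 63, 83, 100, 56],
    "Temp_Rating": [np.nan, np.nan, np.nan, 41, 80, 12],
})

# fillna with a Series aligns on the index, so each NaN is filled
# from the same row of Farheit.
df["Temp_Rating"] = df["Temp_Rating"].fillna(df["Farheit"])

# Drop the now-redundant column and rename what is left.
df = df.drop(columns="Farheit")
df.columns = ["File", "heat", "Observations"]

print(df)
```

The filled column stays float64 because it originally held NaN; append .astype(int) to the fillna result if integer values are wanted.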




Answer 2:


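The solutions mentioned above did not work for me. The method I used, where 'foo' is the column containing NaNs and 'bar' is the column supplying the replacement values, was: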

df.loc[df['foo'].isnull(),'foo'] = df['bar']
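
Applied to the question's column names, the same mask-and-assign pattern might look like the sketch below. The four-row frame is a hypothetical stand-in for df. The assignment relies on .loc aligning the right-hand Series on the index, so only the masked rows receive new values.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's DataFrame.
df = pd.DataFrame({
    "Farheit": [75.0, 115.0, 83.0, 100.0],
    "Temp_Rating": [np.nan, np.nan, 41.0, 80.0],
})

# Boolean mask marking the rows where Temp_Rating is missing.
mask = df["Temp_Rating"].isnull()

# Only the masked rows are overwritten; values come from the same index labels.
df.loc[mask, "Temp_Rating"] = df["Farheit"]

print(df)
```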



Answer 3:


Another way to solve this problem:

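import pandas as pd
import numpy as np

# Rows without a Temp reading use np.nan explicitly.
ts_df = pd.DataFrame([[1, "YesQ", 75, np.nan], [1, "NoR", 115, np.nan], [1, "NoT", 63, 13], [2, "YesT", 43, 71]],
                     columns=['File', 'heat', 'Farheit', 'Temp'])


def fx(x):
    # Fall back to Farheit when Temp is missing in this row.
    if pd.isnull(x['Temp']):
        return x['Farheit']
    else:
        return x['Temp']

print(1, ts_df)
# Apply fx to each row (axis=1) and overwrite Temp with the result.
ts_df['Temp'] = ts_df.apply(fx, axis=1)

print(2, ts_df)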

returns:

(1,    File  heat  Farheit  Temp
0     1  YesQ       75   NaN
1     1   NoR      115   NaN
2     1   NoT       63  13.0
3     2  YesT       43  71.0)
(2,    File  heat  Farheit   Temp
0     1  YesQ       75   75.0
1     1   NoR      115  115.0
2     1   NoT       63   13.0
3     2  YesT       43   71.0)
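
Calling a Python function once per row with apply(axis=1) can be slow on large frames. As a hedged alternative that is not part of the answer above, the same rule can be written column-wise with Series.where, which keeps Temp where it is present and takes Farheit elsewhere. The frame below mirrors ts_df from the answer.

```python
import numpy as np
import pandas as pd

ts_df = pd.DataFrame(
    [[1, "YesQ", 75, np.nan],
     [1, "NoR", 115, np.nan],
     [1, "NoT", 63, 13],
     [2, "YesT", 43, 71]],
    columns=["File", "heat", "Farheit", "Temp"],
)

# Keep Temp where it is not null; otherwise take Farheit from the same row.
ts_df["Temp"] = ts_df["Temp"].where(ts_df["Temp"].notnull(), ts_df["Farheit"])

print(ts_df)
```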


Source: https://stackoverflow.com/questions/29177498/python-pandas-replace-nan-in-one-column-with-value-from-corresponding-row-of-sec
