Pandas Dataframe with NA values throwing ValueError

Submitted by ☆樱花仙子☆ on 2019-12-12 03:18:46

Question


I have a pandas DataFrame that looks like this:

df.head(2)
Out[25]: 
                    CompanyName Region MachineType
recvd_dttm                                        
2014-07-13 12:40:40    Company1     NA    Machine1
2014-07-13 15:31:39    Company2     NA    Machine2

I am first taking data in a certain date range, then trying to get data that is in the Region NA and is MachineType Machine1.

However, I keep getting this error: ValueError: Length mismatch: Expected axis has 4 elements, new values have 3 elements

This code worked until I added the region column and used this line: df = df[(df['Region']=='NA') & (df['CallType']=='Optia')]

Because at first the data for NA (NorthAmerica) was being read in as NaN, I used keep_default_na=False in my read_csv command.
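
As a rough illustration of what that flag changes, the sketch below parses a small made-up CSV twice. The CSV text and the names default_df and kept_df are assumptions for the example, not the asker's actual file. Passing na_values=[''] together with keep_default_na=False is one way to keep the literal string NA intact while still treating empty fields as NaN.

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for the real file (third row has an empty Region).
csv_text = (
    "recvd_dttm,CompanyName,Region,MachineType\n"
    "2014-07-13 12:40:40,Company1,NA,Machine1\n"
    "2014-07-13 15:31:39,Company2,NA,Machine2\n"
    "2014-07-14 09:00:00,Company3,,Machine1\n"
)

# Default parsing: the string "NA" is on pandas' default missing-value list,
# so every Region value here is read as NaN.
default_df = pd.read_csv(
    io.StringIO(csv_text), index_col="recvd_dttm", parse_dates=True
)

# With keep_default_na=False and na_values=[""], only empty fields become NaN;
# "NA" survives as an ordinary string.
kept_df = pd.read_csv(
    io.StringIO(csv_text),
    index_col="recvd_dttm",
    parse_dates=True,
    keep_default_na=False,
    na_values=[""],
)

print(default_df["Region"].isna().sum())
print(kept_df["Region"].tolist())
```

If some columns should still use the default missing-value markers, na_values also accepts a dict mapping column names to their own marker lists.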

I then build a pivot table this way:

result = df.groupby([lambda idx: idx.month, 'CompanyName']).agg(len).reset_index()
result.columns = ['Month', 'CompanyName', 'NumberCalls']

pivot_table = result.pivot(index='Month', columns='CompanyName', values='NumberCalls').fillna(0)

The error is raised at the result.columns line, though I wouldn't be surprised if the fillna(0) call is also misbehaving, since some other NA values in the data were actually supposed to be NaN, not North America.

How do I fix the ValueError and avoid NA confusion?


Answer 1:


Here's what you can do to replace the NaN in one column only:

import pandas as pd
import numpy as np

# Read the sample rows copied from the question (they must be on the clipboard)
df = pd.read_clipboard()
print(df)

# I created a test column; output:
#            recvd_dttm CompanyName  Region MachineType  Test
# 2014-07-13   12:40:40    Company1     NaN    Machine1   NaN
# 2014-07-13   15:31:39    Company2     NaN    Machine2   NaN

# Replace NaN in the Region column only; the Test column is left untouched
df['Region'] = df['Region'].replace(np.nan, 'NorthAm')
print(df)

# Output:
#            recvd_dttm CompanyName   Region MachineType  Test
# 2014-07-13   12:40:40    Company1  NorthAm    Machine1   NaN
# 2014-07-13   15:31:39    Company2  NorthAm    Machine2   NaN
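
The replacement above deals with the NaN values only. As for the ValueError, a plausible explanation is that agg(len) returns one count column for every column that is not a grouping key, so adding Region makes reset_index() produce four columns where three names are assigned. The sketch below uses a small made-up frame (the data is an assumption, not the asker's) and counts rows with size() instead, which gives a single count per group however many columns the frame has.

```python
import pandas as pd

# Hypothetical stand-in for the asker's frame, indexed by timestamp.
df = pd.DataFrame(
    {
        "CompanyName": ["Company1", "Company2", "Company1"],
        "Region": ["NA", "NA", "NA"],
        "MachineType": ["Machine1", "Machine2", "Machine1"],
    },
    index=pd.to_datetime(
        ["2014-07-13 12:40:40", "2014-07-13 15:31:39", "2014-08-01 08:00:00"]
    ),
)
df.index.name = "recvd_dttm"

# The original approach: one count column each for Region and MachineType,
# plus the two group keys, so four columns come back after reset_index().
wide = df.groupby([lambda idx: idx.month, "CompanyName"]).agg(len).reset_index()
print(wide.shape[1])

# size() yields exactly one count per (month, company) group.
result = (
    df.groupby([df.index.month, "CompanyName"])
    .size()
    .reset_index(name="NumberCalls")
)
result.columns = ["Month", "CompanyName", "NumberCalls"]

# Months in which a company has no calls are filled with 0.
pivot_table = result.pivot(
    index="Month", columns="CompanyName", values="NumberCalls"
).fillna(0)
print(pivot_table)
```

Because the number of columns in result no longer depends on the shape of df, the three-name assignment keeps working if more columns are added later.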


Source: https://stackoverflow.com/questions/31543959/pandas-dataframe-with-na-values-throwing-valueerror
