Difference between str() and astype(str)?

Posted by ↘锁芯ラ on 2019-12-11 05:15:06

Question


I want to save the dataframe df to the .h5 file MainDataFile.h5:

df.to_hdf("c:/Temp/MainDataFile.h5", "MainData", mode="w", format="table", data_columns=['_FirstDayOfPeriod', 'Category', 'ChannelId'])

and I get the following error:

*** Exception: cannot find the correct atom type -> > [dtype->object,items->Index(['Libellé_Article', 'Libellé_segment'], dtype='object')]

If I modify the column 'Libellé_Article' in this way:

df['Libellé_Article'] = str(df['Libellé_Article'])

the error goes away, whereas I still get the error message when doing:

df['Libellé_Article'] = df['Libellé_Article'].astype(str)

The problem is that using str() blows up my RAM.

Any ideas?


Answer 1:


str(df['Libellé_Article']) converts the contents of the entire column into a single string. You end up with one very big string, and that is what blows up your RAM.

For example:

>>> import pandas as pd
>>> df = pd.DataFrame([1,2,3], columns=['A'])
>>> df['A']
0    1
1    2
2    3
Name: A, dtype: int64

>>> str(df['A'])
'0    1\n1    2\n2    3\nName: A, dtype: int64'
>>> df['A'].astype(str)
0    1
1    2
2    3
Name: A, dtype: object

So if you want to convert every value in the column to a string, use .astype(str) rather than str().
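
To see what the assignment from the question actually stores, here is a minimal sketch. It uses a small made-up frame with a column 'A' as a stand-in for 'Libellé_Article'; the column names as_str_builtin and as_astype are only for illustration. It relies on the fact that assigning a single string to a DataFrame column puts that same string in every row.

```python
import pandas as pd

# Hypothetical stand-in for the asker's DataFrame.
df = pd.DataFrame({"A": [1, 2, 3]})

# str() on a Series returns ONE Python string: the printed form of the column.
whole_repr = str(df["A"])

# Assigning a scalar string to a column broadcasts it to every row.
df["as_str_builtin"] = whole_repr

# astype(str) converts each value separately.
df["as_astype"] = df["A"].astype(str)

print(df["as_astype"].tolist())        # ['1', '2', '3']
print(df["as_str_builtin"].nunique())  # 1 (every row holds the same text)
print(df["as_str_builtin"].iloc[0] == whole_repr)  # True
```

So after df[col] = str(df[col]) the column no longer contains the original values at all, only repeated copies of the column's printed representation.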



Source: https://stackoverflow.com/questions/30095172/difference-between-str-and-astypestr
