AttributeError: 'float' object has no attribute 'split'

Submitted by 二次信任 on 2019-12-31 22:01:08

Question


I am calling this line:

lang_modifiers = [keyw.strip() for keyw in row["language_modifiers"].split("|") if not isinstance(row["language_modifiers"], float)]

This seems to work where row["language_modifiers"] is a word (atlas method, central), but not when it comes up as nan.

I thought my if not isinstance(row["language_modifiers"], float) would catch the cases where the value comes up as nan, but it does not.

Background: row["language_modifiers"] is a cell in a tsv file, and comes up as nan when that cell was empty in the tsv being parsed.


Answer 1:


You are right, such errors are mostly caused by NaN values representing empty cells. It is common to filter out such rows before applying further operations, using this idiom on your dataframe df:

df_new = df[df['ColumnName'].notnull()]
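As for why the isinstance check in the question never fires: in a list comprehension the iterable expression is evaluated before the if filter runs, so .split("|") is already called on the NaN float by the time the condition would be tested. A minimal sketch of checking the value first, and of the notnull() idiom above, follows. The sample data and the helper name split_modifiers are made up for illustration and are not from the original post.

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for the parsed TSV file.
df = pd.DataFrame(
    {"language_modifiers": ["atlas method | central", np.nan, "central"]}
)


def split_modifiers(value):
    """Return the stripped keywords, or an empty list for a missing cell."""
    if pd.isna(value):  # test the value *before* calling .split on it
        return []
    return [keyw.strip() for keyw in value.split("|")]


# Option 1: keep every row and guard each cell individually.
per_row = [split_modifiers(v) for v in df["language_modifiers"]]

# Option 2: drop rows with a missing value first, then split the rest.
df_new = df[df["language_modifiers"].notnull()]
filtered = [split_modifiers(v) for v in df_new["language_modifiers"]]
```

Option 1 keeps the result aligned with the original rows, while option 2 silently shortens it, so which one fits depends on whether the empty rows matter downstream.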

Alternatively, it may be handier to use the fillna() method to impute (replace) null values with a default. For example, every null or NaN value can be replaced with the average value of its column:

housing['LotArea'] = housing['LotArea'].fillna(housing['LotArea'].mean())

or with a fixed value such as the empty string "" or another default:

housing['GarageCond'] = housing['GarageCond'].fillna("")
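Applied to the column from the question, the empty-string approach might look like the sketch below. The sample data is hypothetical. Note that "".split("|") returns [""], so empty pieces are filtered out here to get an empty list for a blank cell.

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for the parsed TSV file.
df = pd.DataFrame({"language_modifiers": ["atlas method|central ", np.nan]})

# Replace NaN with "" so every cell is a string and .split is always valid.
filled = df["language_modifiers"].fillna("")

# Split each cell on "|", strip whitespace, and skip empty pieces.
lang_modifiers = [
    [keyw.strip() for keyw in cell.split("|") if keyw.strip()]
    for cell in filled
]
```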



Answer 2:


You might also use df = df.dropna(thresh=n), where n is the tolerance: a row is kept only if it has at least n non-NA values.

Mind you, this approach removes the rows that fall below the threshold.

For example, if you have a dataframe with 5 columns, df.dropna(thresh=5) drops any row that does not have 5 valid (non-NA) values.

In your case you might only want to keep valid rows; if so, you can set the threshold to the number of columns you have.
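A small sketch of how the threshold behaves, using a made-up three-column dataframe (the column names and values are illustrative only):

```python
import numpy as np
import pandas as pd

# Hypothetical frame: row 0 has 3 valid values, row 1 has 1, row 2 has 2.
df = pd.DataFrame(
    {
        "a": [1.0, np.nan, 3.0],
        "b": ["x", np.nan, np.nan],
        "c": [10.0, 20.0, 30.0],
    }
)

# Keep only fully populated rows: threshold equals the number of columns.
complete = df.dropna(thresh=len(df.columns))

# Keep rows with at least two valid values.
at_least_two = df.dropna(thresh=2)
```

Setting thresh to the column count gives the same rows as a plain df.dropna(), which drops any row containing at least one missing value.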

pandas documentation on dropna



Source: https://stackoverflow.com/questions/42224700/attributeerror-float-object-has-no-attribute-split
