Question
I have a pandas DataFrame that was created by importing a CSV file. I want to replace blank values with NaN. Some of these blank values are empty strings and some contain a variable number of spaces: '', ' ', '  ', etc.
Using the suggestion from this thread I have
df.replace(r'\s+', np.nan, regex=True, inplace = True)
which does replace all the strings that only contain spaces, but also replaces every string that has a space in it, which is not what I want.
How do I replace only strings with just spaces and empty strings?
Answer 1:
If you are reading a CSV file and want to convert all empty strings to NaN while reading the file itself, you can use the option skipinitialspace=True.
Example code
pd.read_csv('Sample.csv', skipinitialspace=True)
This removes any whitespace that appears immediately after a delimiter, so fields that contain only spaces become empty and are read as NaN.
From the documentation http://pandas.pydata.org/pandas-docs/stable/io.html
Note: This option also strips leading whitespace from valid data. If you need to retain leading whitespace for any reason, this option is not a good choice.
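As a rough illustration of the behaviour described in this answer, here is a minimal sketch that reads a small in-memory CSV instead of a file. The column names and sample rows are made up for the example, and `io.StringIO` simply stands in for 'Sample.csv'.

```python
import io

import pandas as pd

# Hypothetical sample data: the city field in the first row holds only spaces,
# the name field in the last row is empty, and " paris" has a leading space.
csv_text = (
    "name,city,age\n"
    "alice,   ,30\n"
    "bob smith, paris,41\n"
    ",london,25\n"
)

# skipinitialspace=True skips spaces that directly follow a delimiter,
# so a spaces-only field is parsed as empty and ends up as NaN.
df = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

print(df)
print(df.isna())
```

Spaces inside a value, such as the one in "bob smith", are left alone; only the spaces right after a delimiter are skipped, which is also why " paris" loses its leading space.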
Answer 2:
Anchor the pattern with ^ and $ so that it only matches when the entire string is empty or consists solely of whitespace:
df.replace(r'^\s*$', np.nan, regex=True, inplace = True)
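To check what the anchored pattern does, here is a small self-contained sketch on a made-up DataFrame (the column names and values are assumptions for illustration only). It returns a new DataFrame rather than using inplace=True, which is just a choice made for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical data mixing empty strings, spaces-only strings,
# and a normal value that happens to contain a space.
df = pd.DataFrame({
    "name": ["alice", "", "  ", "bob smith"],
    "city": [" ", "new york", "", "paris"],
})

# ^\s*$ matches only when the whole string is empty or all whitespace,
# so "bob smith" and "new york" are left untouched.
cleaned = df.replace(r"^\s*$", np.nan, regex=True)

print(cleaned)
print(cleaned.isna())
```

Because \s* also matches zero characters, the same pattern covers both the empty string and strings made up of any number of spaces.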
Source: https://stackoverflow.com/questions/40711900/replacing-empty-strings-with-nan-in-pandas