Question
I have a DataFrame df like this:
import numpy as np
import pandas as pd

df = pd.DataFrame([
    {'Name': 'Chris', 'Item Purchased': 'Sponge', 'Cost': 22.50},
    {'Name': 'Kevyn', 'Item Purchased': 'Kitty Litter', 'Cost': '.........'},
    {'Name': 'Filip', 'Item Purchased': 'Spoon', 'Cost': '...'}],
    index=['Store 1', 'Store 1', 'Store 2'])
I want to replace the runs of dots that stand in for missing values in the 'Cost' column with np.nan. So far I have tried:
df['Cost']=df['Cost'].str.replace("\.\.+", np.nan)
and
df['Cost']=re.sub('\.\.+',np.nan,df['Cost'])
but neither of them seems to work properly. Please help.
Answer 1:
Use DataFrame.replace with the regex=True switch.
df = df.replace(r'\.+', np.nan, regex=True)
df
         Cost Item Purchased   Name
Store 1  22.5         Sponge  Chris
Store 1   NaN   Kitty Litter  Kevyn
Store 2   NaN          Spoon  Filip
The pattern \.+ specifies one or more dots. You could also use [.]+ as a pattern to the same effect.
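One caveat worth keeping in mind: with a non-string replacement such as np.nan, the regex only has to match somewhere in a cell for the whole cell to be replaced, and the call above is applied to every column. A slightly more defensive variant, sketched below, anchors the pattern with ^ and $ and limits it to the 'Cost' column. The variable names cost_anchored and cost_numeric are only illustrative, and the pd.to_numeric line is an alternative not mentioned in the answer; it assumes that any non-numeric entry in 'Cost' should become NaN.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [
        {'Name': 'Chris', 'Item Purchased': 'Sponge', 'Cost': 22.50},
        {'Name': 'Kevyn', 'Item Purchased': 'Kitty Litter', 'Cost': '.........'},
        {'Name': 'Filip', 'Item Purchased': 'Spoon', 'Cost': '...'},
    ],
    index=['Store 1', 'Store 1', 'Store 2'],
)

# Anchored pattern: only cells made up entirely of dots are replaced,
# and only the 'Cost' column is touched.
cost_anchored = df['Cost'].replace(r'^\.+$', np.nan, regex=True).astype(float)

# Alternative (an assumption, not part of the original answer):
# coerce anything that cannot be parsed as a number to NaN.
cost_numeric = pd.to_numeric(df['Cost'], errors='coerce')

df['Cost'] = cost_anchored
print(df)
```

Either way the column ends up with a float dtype, so numeric operations such as df['Cost'].mean() skip the missing entries.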
Source: https://stackoverflow.com/questions/47131780/replace-dots-in-a-float-column-with-nan-in-python