Question
Currently I am using the following code to make replacements, which is a little cumbersome:
df1['CompanyA'] = df1['CompanyA'].str.replace('.','')
df1['CompanyA'] = df1['CompanyA'].str.replace('-','')
df1['CompanyA'] = df1['CompanyA'].str.replace(',','')
df1['CompanyA'] = df1['CompanyA'].str.replace('ltd','limited')
df1['CompanyA'] = df1['CompanyA'].str.replace('&','and')
df1['Address1A'] = df1['Address1A'].str.replace('.','')
df1['Address1A'] = df1['Address1A'].str.replace('-','')
df1['Address1A'] = df1['Address1A'].str.replace('&','and')
df1['Address1A'] = df1['Address1A'].str.replace(r'\brd\b', 'road', regex=True)
df1['Address2A'] = df1['Address2A'].str.replace('.','')
df1['Address2A'] = df1['Address2A'].str.replace('-','')
df1['Address2A'] = df1['Address2A'].str.replace('&','and')
df1['Address2A'] = df1['Address2A'].str.replace(r'\brd\b', 'road', regex=True)
To make changes on the fly easier, my ideal scenario would be something like:
df1['CompanyA'] = df1['CompanyA'].str.replace(('&','and'), ('.', ''), ('-','')....)
df1['Address1A'] = df1['Address1A'].str.replace(('&','and'), ('.', ''), ('-','')....)
df1['Address2A'] = df1['Address2A'].str.replace(('&','and'), ('.', ''), ('-','')....)
That way I could just add or change what I want to replace for a particular column without having to adjust multiple lines of code.
Is this possible at all?
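Answer 1:
You can create a dictionary and pass it to replace(), so you don't need to chain or repeat the call. Series.replace() matches whole cell values unless regex=True is passed, and with regex=True the keys are regular expressions, so a literal '.' must be escaped.
replacers = {',': '', r'\.': '', '-': '', 'ltd': 'limited'}  # etc.
df1['CompanyA'] = df1['CompanyA'].replace(replacers, regex=True)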
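A minimal sketch of the dictionary approach on made-up data (the sample strings are hypothetical, not taken from the question). It assumes you want substring replacement, which in Series.replace() requires regex=True; re.escape is used here so the keys can be written as plain text and '.' stays a literal dot.

```python
import re
import pandas as pd

# Hypothetical sample data standing in for df1['CompanyA'].
s = pd.Series(["acme, ltd.", "smith-jones ltd"])

replacers = {",": "", ".": "", "-": "", "ltd": "limited"}

# Without regex=True the keys are compared against whole cell values,
# so nothing in this sample changes.
unchanged = s.replace(replacers)

# With regex=True the keys are patterns; re.escape keeps '.' literal.
escaped = {re.escape(k): v for k, v in replacers.items()}
cleaned = s.replace(escaped, regex=True)

print(cleaned.tolist())
```

Keeping the dictionary in plain text and escaping it in one place means new pairs can be added without thinking about regex metacharacters.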
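Answer 2:
You could chain the replacements. Every link needs the .str accessor, because a plain .replace() on a Series matches whole values rather than substrings:
df1['CompanyA'] = df1['CompanyA'].str.replace('.', '', regex=False).str.replace('-', '', regex=False).str.replace(',', '', regex=False).str.replace('ltd', 'limited', regex=False).str.replace('&', 'and', regex=False)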
...
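If chaining is the preferred style, the chain can probably be shortened by deleting all the single punctuation characters with one character-class pattern. This is a sketch on hypothetical sample strings, not the answerer's code; the pattern r"[.,\-]" is an assumption about which characters should be removed.

```python
import pandas as pd

# Hypothetical sample data standing in for df1['CompanyA'].
s = pd.Series(["a.b-c, ltd", "x & y"])

cleaned = (
    s.str.replace(r"[.,\-]", "", regex=True)      # delete '.', ',' and '-' in one pass
     .str.replace("ltd", "limited", regex=False)  # literal substring replacement
     .str.replace("&", "and", regex=False)
)

print(cleaned.tolist())
```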
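Answer 3:
DataFrame.replace() also accepts nested dictionaries. With regex=True the keys are treated as regular expressions, so escape the '.'; the result is returned rather than applied in place, so assign it back:
df1 = df1.replace({'CompanyA': {'&': 'and', r'\.': '', '-': ''}}, regex=True)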
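The nested-dictionary form extends naturally to several columns with different rules each, which is close to what the question asks for. The sketch below uses a tiny hypothetical frame; the names `common` and `rules` are made up for illustration.

```python
import pandas as pd

# Hypothetical sample frame using two of the question's column names.
df1 = pd.DataFrame({
    "CompanyA": ["a.b & co ltd"],
    "Address1A": ["12 high rd."],
})

# Rules shared by every column; keys are regular expressions.
common = {r"\.": "", "-": "", "&": "and"}

# Per-column rules: outer keys are column names, inner dicts are pattern -> replacement.
rules = {
    "CompanyA": {**common, ",": "", "ltd": "limited"},
    "Address1A": {**common, r"\brd\b": "road"},
}

df1 = df1.replace(rules, regex=True)
print(df1)
```

Changing what gets replaced in a column then means editing one dictionary entry rather than a line of code per replacement.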
Answer 4:
You can loop over a dictionary that maps each string to its replacement:
to_replace = {'.': '',
              ',': '',
              'foo': 'bar'
              }
for k, v in to_replace.items():
    # regex=False treats '.' as a literal dot, not "any character"
    df1['CompanyA'] = df1['CompanyA'].str.replace(k, v, regex=False)
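The loop makes one pass over the column per pair. As an alternative sketch (a different technique from the loop above, not the answerer's method), all keys can be joined into one compiled pattern and looked up in a callback, so the column is scanned once. The sample data is hypothetical.

```python
import re
import pandas as pd

to_replace = {".": "", ",": "", "ltd": "limited", "&": "and"}

# Longest keys first, so a longer key wins over a shorter one sharing its prefix.
keys = sorted(to_replace, key=len, reverse=True)
pattern = re.compile("|".join(re.escape(k) for k in keys))

# Hypothetical sample data standing in for df1['CompanyA'].
s = pd.Series(["a.b, ltd", "x & y"])

# The callable receives each match object and returns its replacement text.
cleaned = s.str.replace(pattern, lambda m: to_replace[m.group(0)], regex=True)
print(cleaned.tolist())
```

A single pass also avoids one replacement accidentally rewriting the output of an earlier one.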
Answer 5:
Most likely you are using a pd.DataFrame, so I suggest writing a general-purpose remover:
def remover(value, replacers):
    # Leave non-string cells (e.g. NaN or numbers) untouched.
    if not isinstance(value, str):
        return value
    for k, v in replacers.items():
        value = value.replace(k, v)
    return value

replacers = {',': '',
             '.': '',
             '-': '',
             'ltd': 'limited'
             }
for column in df.columns:
    df[column] = df[column].apply(lambda value: remover(value, replacers))
Alternatively, loop over a list of specific column names instead of df.columns.
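Restricting the function to chosen columns might look like the sketch below. The frame, the `Untouched` column and the `columns_to_clean` list are hypothetical; the function is written to skip non-string cells so missing values do not raise.

```python
import pandas as pd

def remover(value, replacers):
    """Apply every old -> new pair to one cell; skip non-string cells."""
    if not isinstance(value, str):
        return value
    for old, new in replacers.items():
        value = value.replace(old, new)  # plain str.replace, no regex involved
    return value

replacers = {",": "", ".": "", "-": "", "ltd": "limited"}

# Hypothetical sample frame; 'Untouched' is deliberately left out of the list below.
df = pd.DataFrame({
    "CompanyA": ["a.b, ltd", None],
    "Address1A": ["1-2 high st.", "x"],
    "Untouched": ["a.b", "c-d"],
})

columns_to_clean = ["CompanyA", "Address1A"]
for column in columns_to_clean:
    df[column] = df[column].apply(remover, args=(replacers,))

print(df)
```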
Source: https://stackoverflow.com/questions/62429677/how-to-use-str-replace-to-replace-multiple-pairs-at-once