python/pandas: using regular expressions to remove anything in square brackets in a string

生来就可爱ヽ(ⅴ<●) submitted on 2021-02-05 06:41:48

Question


I'm working with a pandas DataFrame and trying to sanitize a column, turning values like $12,342 into 12342 so that the column can be converted to int or float. One row, however, contains 736[4], so I also have to remove everything within the square brackets, brackets included.

Code so far

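# regex=False treats '$' as a literal character rather than the end-of-string anchor
df2['Average Monthly Wage $'] = df2['Average Monthly Wage $'].str.replace('$', '', regex=False)
df2['Average Monthly Wage $'] = df2['Average Monthly Wage $'].str.replace(',', '', regex=False)
df2['Average Monthly Wage $'] = df2['Average Monthly Wage $'].str.replace(' ', '', regex=False)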

The line below is supposed to remove the square brackets along with their contents.

df2['Average Monthly Wage $'] = df2['Average Monthly Wage $'].str.replace(r'[[^]]*\)','')

To some devs this is trivial, but I haven't used regular expressions often enough to know how to do this. I looked around and put the line above together from a Stack Overflow example.


Answer 1:


I think you need:

import pandas as pd

df2 = pd.DataFrame({'Average Monthly Wage $': ['736[4]','7336[445]', '[4]345[5]']})
print(df2)
  Average Monthly Wage $
0                 736[4]
1              7336[445]
2              [4]345[5]

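# regex=True is required on pandas 2.0+, where str.replace defaults to literal matching
df2['Average Monthly Wage $'] = df2['Average Monthly Wage $'].str.replace(r'\[.*?\]', '', regex=True)
print(df2)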
  Average Monthly Wage $
0                    736
1                   7336
2                    345
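
To see why the lazy quantifier `.*?` matters here, the sketch below runs three variants of the pattern over the same sample strings using the standard `re` module instead of pandas. The variable names (`samples`, `lazy`, `greedy`, `negated`) are only illustrative, and the negated character class is an alternative spelling, not something the answer above uses.

```python
import re

samples = ['736[4]', '7336[445]', '[4]345[5]']

# Lazy: stops at the first closing bracket, so each [..] group is removed separately.
lazy = [re.sub(r'\[.*?\]', '', s) for s in samples]

# Greedy: runs to the LAST closing bracket, swallowing the digits between two groups.
greedy = [re.sub(r'\[.*\]', '', s) for s in samples]

# Negated class: "anything except ]" between the brackets, equivalent to lazy here.
negated = [re.sub(r'\[[^\]]*\]', '', s) for s in samples]

print(lazy)     # ['736', '7336', '345']
print(greedy)   # ['736', '7336', '']
print(negated)  # ['736', '7336', '345']
```

The greedy version loses the value in '[4]345[5]' entirely, which is why the pattern needs either `.*?` or a negated class.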

The pattern can be tested on regex101.
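
Putting the pieces together, here is a minimal sketch of the whole cleanup the question describes: strip the bracketed parts, strip `$`, commas and whitespace, then convert to a numeric dtype. The sample values are made up for illustration (the third one is hypothetical), and collapsing the three separate replacements into one character class is a stylistic choice rather than part of the original answer.

```python
import pandas as pd

col = 'Average Monthly Wage $'

# Illustrative data only; the real column comes from the asker's DataFrame.
df2 = pd.DataFrame({col: ['$12,342', '736[4]', ' 7,336 [445]']})

cleaned = (
    df2[col]
    .str.replace(r'\[.*?\]', '', regex=True)  # drop bracketed parts, brackets included
    .str.replace(r'[$,\s]', '', regex=True)   # drop dollar signs, commas and whitespace
)

# Convert the remaining digit strings to a numeric dtype.
df2[col] = pd.to_numeric(cleaned)

print(df2[col].tolist())  # [12342, 736, 7336]
```

Inside a character class `$` is a literal dollar sign, so it needs no escaping there. If some rows might still contain non-numeric text, `pd.to_numeric(cleaned, errors='coerce')` would turn those into NaN instead of raising.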



Source: https://stackoverflow.com/questions/51340686/python-pandas-using-regular-expressions-remove-anything-in-square-brackets-in-s
