Removing a substring from a list of strings

长发绾君心 2021-01-29 07:02

Several countries in my list have numbers and/or parentheses in their names. How do I remove these?

e.g.

'Bolivia (Plurinational State of)' should be 'Bolivia'

4 Answers
  •  半阙折子戏
    2021-01-29 07:47

    Use Series.str.replace with a regex. \s* matches any whitespace before the opening parenthesis, \(.*\) matches the parentheses and everything between them, | is the regex OR, and \d+ matches a run of one or more digits:

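    import pandas as pd

    df = pd.DataFrame({'a':['Bolivia (Plurinational State of)','Switzerland17']})

    # regex=True is needed in pandas >= 2.0, where str.replace defaults to literal matching
    df['a'] = df['a'].str.replace(r'\s*\(.*\)|\d+', '', regex=True)
    print(df)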
                 a
    0      Bolivia
    1  Switzerland
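
Since the question mentions a list rather than a DataFrame, the same pattern can probably be applied without pandas. Below is a minimal sketch using the standard re module; the names `countries`, `PATTERN` and `clean_name` are hypothetical and chosen only for illustration.

```python
import re

# Same pattern as in the answer: optional whitespace followed by
# parenthesized text, or a run of one or more digits.
PATTERN = re.compile(r'\s*\(.*\)|\d+')

def clean_name(name):
    """Remove parenthesized text and digits, then trim leftover whitespace."""
    return PATTERN.sub('', name).strip()

# Hypothetical input list, reusing the values from the answer.
countries = ['Bolivia (Plurinational State of)', 'Switzerland17']
cleaned = [clean_name(c) for c in countries]
print(cleaned)  # ['Bolivia', 'Switzerland']
```

Note that .* is greedy, so a name with two parenthesized groups would lose everything from the first ( to the last ). If that matters for your data, \([^)]*\) removes each group separately.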
    
