Removing a substring from a list of strings

长发绾君心 2021-01-29 07:02

Several country names in my list contain numbers and/or parentheses. How do I remove these?

e.g.

'Bolivia (Plurinational State of)' should be 'Bolivi

4 Answers
  •  独厮守ぢ
    2021-01-29 07:40

    Using a regex and a simple list operation

    Iterate over the list, match each item against a regex, and replace the item in place with the matched text. The pattern "[a-zA-Z]{2,}" matches a run of two or more ASCII letters, and because re.match only matches at the start of the string, it picks up the leading letters and stops at the first digit or parenthesis. This relies on your input domain: a country name should not contain digits or parentheses, so whatever follows the leading letters can be dropped. Note that the pattern also stops at a space, so a multi-word name would be cut after its first word. For input like yours, you can use the following.

    import re

    list_of_country_strings = ["Switzerland17", "America290", "Korea(S)"]
    for index, name in enumerate(list_of_country_strings):
        # re.match anchors at the start, so this captures the leading run of letters
        match = re.match(r"[a-zA-Z]{2,}", name)
        if match:
            # Keep only the matched letters; items with no match are left unchanged
            list_of_country_strings[index] = match.group()

    print(list_of_country_strings)

    Output: ['Switzerland', 'America', 'Korea']
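
    If some names contain spaces, as in the 'Bolivia (Plurinational State of)' example from the question, matching only the leading letters may cut too much. A possible alternative, shown here as a rough sketch rather than a definitive solution, is to delete the unwanted parts with re.sub instead. The helper name clean_country and the sample list are made up for illustration, and the sketch assumes that parentheses are not nested and that every digit should be removed.

```python
import re


def clean_country(name):
    """Remove parenthesised text and digits from a country name (hypothetical helper)."""
    # Drop any parenthesised part together with the whitespace before it,
    # e.g. " (Plurinational State of)". Assumes parentheses are not nested.
    name = re.sub(r"\s*\([^)]*\)", "", name)
    # Drop every digit, e.g. the "17" in "Switzerland17".
    name = re.sub(r"\d+", "", name)
    # Trim whitespace that may be left at either end.
    return name.strip()


# Sample data assumed for illustration only.
countries = [
    "Switzerland17",
    "Bolivia (Plurinational State of)",
    "United States of America20",
]
cleaned = [clean_country(c) for c in countries]
print(cleaned)
```

    Because this removes only the digits and the parenthesised text, spaces inside multi-word names are kept.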
