How can I customize what characters are filtered out using string.punctuation?

Submitted by ☆樱花仙子☆ on 2021-01-27 18:52:03

Question


I have a string from which I would like to remove all punctuation. I currently use:

import string
translator = str.maketrans('','', string.punctuation)
name = name.translate(translator)

However, for strings that are names, this also removes the hyphen, which I would like to keep. For instance, "\Fred-Daniels!" should become "Fred-Daniels".

How can I modify the above code to achieve this?


Answer 1:


If you'd like to exclude some punctuation characters from string.punctuation, you can simply remove the ones you don't want filtered out:

>>> from string import punctuation
>>> from re import sub
>>> 
>>> string = "\\Fred-Daniels!"
>>> translator = str.maketrans('','', sub(r'\-', '', punctuation))
>>> string
'\\Fred-Daniels!'
>>> string = string.translate(translator)
>>> string
'Fred-Daniels'

Note that if you only want to exclude one or two characters, str.replace is enough. For more than that, it's best to stick with re.sub.
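
To make the two options concrete, here is a minimal sketch. The variable names and the extra characters kept in the second option (apostrophe and period) are illustrative assumptions, not part of the original answer:

```python
import re
import string

# Option 1: exclude a single character from the removal set with str.replace
keep_hyphen = string.punctuation.replace("-", "")

# Option 2: exclude several characters at once with a regex character class.
# Keeping "-", "'" and "." is only an example; pick whatever your data needs.
keep_several = re.sub(r"[-'.]", "", string.punctuation)

name = "\\Fred-Daniels!"
cleaned = name.translate(str.maketrans("", "", keep_hyphen))
print(cleaned)  # Fred-Daniels
```

A leading "-" inside the square brackets is treated as a literal hyphen, so it does not need escaping there.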




Answer 2:


import string

PUNCT_TO_REMOVE = string.punctuation
print(PUNCT_TO_REMOVE) # Output : !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~

# Now suppose you don't want _ in your PUNCT_TO_REMOVE

PUNCT_TO_REMOVE = PUNCT_TO_REMOVE.replace("_","")
print(PUNCT_TO_REMOVE) # Output : !"#$%&'()*+,-./:;<=>?@[\]^`{|}~
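
The snippet above only builds the reduced punctuation string. One possible way to apply it is to wrap it in a small helper, sketched here. The function name `strip_punctuation` and its `keep` parameter are hypothetical, introduced only for illustration:

```python
import string

def strip_punctuation(text, keep=""):
    """Remove ASCII punctuation from text, except characters listed in keep."""
    # Build the removal set by dropping every character the caller wants kept
    to_remove = "".join(ch for ch in string.punctuation if ch not in keep)
    return text.translate(str.maketrans("", "", to_remove))

print(strip_punctuation("snake_case, please!", keep="_"))  # snake_case please
print(strip_punctuation("\\Fred-Daniels!", keep="-"))      # Fred-Daniels
```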



Answer 3:


Depending on the use case, it could be safer and clearer to explicitly list the valid characters:

>>> name = '\\test-1.'
>>> valid_characters = 'abcdefghijklmnopqrstuvwxyz1234567890- '
>>> filtered_name = ''.join([ x for x in name if x.lower() in valid_characters ])
>>> print(filtered_name)
test-1

Note that many people have names that include punctuation though, like "Mary St. Cloud-Stevens", "Jim Chauncey, Jr.", etc.
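
If names like these need to survive, one option is to widen the allow-list. The following is only a sketch: the set of permitted punctuation is an assumption, and str.isalnum is used in place of a hard-coded alphabet so that accented letters are kept too:

```python
# Hypothetical set of punctuation (plus space) allowed in names; adjust for your data.
ALLOWED_IN_NAMES = set("-.,' ")

def clean_name(name):
    """Keep letters, digits, and the allowed punctuation; drop everything else."""
    return "".join(ch for ch in name if ch.isalnum() or ch in ALLOWED_IN_NAMES)

print(clean_name("\\Mary St. Cloud-Stevens!"))  # Mary St. Cloud-Stevens
print(clean_name("Jim Chauncey, Jr."))          # Jim Chauncey, Jr.
```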



Source: https://stackoverflow.com/questions/45427439/how-can-i-customize-what-characters-are-filtered-out-using-string-punctuation
