问题
I'm using Python v2.6 and I have a string which contains a number of punctuation characters I'd like to strip out. Now I've looked at using the string.punctuation()
function but unfortunately, I want to strip out all punctuation characters except fullstops and dashes. In total, there are only a total of 5 punctuation marks I'd like to strip out - ()\"'
Any suggestions? I'd like this to be the most efficient way.
Thanks
回答1:
You can use str.translate(table[, deletechars]) with table
set to None
, which will result in all characters from deletechars
being removed from the string:
s.translate(None, r"()\"'")
Some examples:
>>> "\"hello\" '(world)'".translate(None, r"()\"'")
'hello world'
>>> "a'b c\"d e(f g)h i\\j".translate(None, r"()\"'")
'ab cd ef gh ij'
回答2:
You could make a list of all the characters you don't want:
unwanted = ['(', ')', '\\', '"', '\'']
Then you could make a function strip_punctuation(s)
like so:
def strip_punctuation(s):
for u in unwanted:
s = s.replace(u, '')
return s
回答3:
>>> import re
>>> r = re.compile("[\(\)\\\\'\"]")
>>> r.sub("", "\"hello\" '(world)'\\\\\\")
'hello world'
回答4:
Using string.translate:
s = ''' abc(de)f\gh"i' '''
print(s.translate(None, r"()\"'"))
# abcdefghi
or re.sub:
import re
re.sub(r"[\\()'\"]",'',s)
but string.translate
appears to be an order of magnitude faster:
In [148]: %timeit (s*1000).translate(None, r"()\"'")
10000 loops, best of 3: 112 us per loop
In [146]: %timeit re.sub(r"[\\()'\"]",'',s*1000)
100 loops, best of 3: 2.11 ms per loop
回答5:
You can create a dict of all the characters you want to be replaced and replace them with char of your choice.
char_replace = {"'":"" , "(":"" , ")":"" , "\":"" , """:""}
for i,j in char_replace.iteritems():
string = string.replace(i,j)
回答6:
my_string = r'''\(""Hello ''W\orld)'''
strip_chars = r'''()\'"'''
using comprehension:
''.join(x for x in my_string if x not in strip_chars)
using filter:
''.join(filter(lambda x: x not in strip_chars, my_string))
output:
Hello World
来源:https://stackoverflow.com/questions/8857885/strip-specific-punctuation-in-python-2-x