问题
How to remove those "\x00\x00" in a string ? I have many of those strings (example shown below). I can use re.sub to replace those "\x00". But I am wondering whether there is a better way to do that ? Converting between unicode, bytes and string is always confusing.
'Hello\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'.
回答1:
Use rstrip
>>> text = 'Hello\x00\x00\x00\x00'
>>> text.rstrip('\x00')
'Hello'
It removes all \x00
characters at the end of the string.
回答2:
>>> a = 'Hello\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> a.replace('\x00','')
'Hello'
回答3:
I think the more general solution is to use:
cleanstring = nullterminatedstring.split('\x00',1)[0]
Which will split
the string using \x00
as the delimeter 1
time. The split(...)
returns a 2 element list: everything before the null in addition to everything after the null (it removes the delimeter). Appending [0]
only returns the portion of the string before the first null (\x00) character, which I believe is what you're looking for.
The convention in some languages, specifically C-like, is that a single null character marks the end of the string. For example, you should also expect to see strings that look like:
'Hello\x00dpiecesofsomeoldstring\x00\x00\x00'
The answer supplied here will handle that situation as well as the other examples.
回答4:
Building on the answers supplied, I suggest that strip() is more generic than rstrip() for cleaning up data, as strip() removes chars from the beginning and the end of the supplied string, whereas rstrip() simply removes chars from the end of the string.
However, NUL chars are not treated as whitespace by default by strip(), and as such you need to specify explicitly. This can catch you out, as print() will of course not show the NUL chars. My solution that I used was to clean the string using ".strip().strip('\x00')
":
>>> arbBytesFromSocket = b'\x00\x00\x00\x00hello\x00\x00\x00\x00'
>>> arbBytesAsString = arbBytesFromSocket.decode('ascii')
>>> print(arbBytesAsString)
hello
>>> str(arbBytesAsString)
'\x00\x00\x00\x00hello\x00\x00\x00\x00'
>>> arbBytesAsString = arbBytesFromSocket.decode('ascii').strip().strip('\x00')
>>> str(arbBytesAsString)
'hello'
>>>
This gives you the string required, without the NUL chars.
来源:https://stackoverflow.com/questions/38883476/how-to-remove-those-x00-x00