Converting bytes to string with str() returns string with speech marks

Posted by 喜你入骨 on 2019-12-02 06:55:30

Don't think of a bytes value as a string in some default 8-bit encoding. It's just binary data. As such, str(a) returns an encoding-agnostic string to represent the value of the byte string. If you want 'Hello World', be specific and decode the value.

>>> a = b'Hello World'
>>> b = a.decode()
>>> type(b)
<class 'str'>
>>> print(b)
Hello World

In Python 2, the distinction between bytes and text was blurred. Python 3 went to great lengths to separate the two: bytes for binary data, and str for readable text.

For another perspective, compare

>>> list("Hello")
['H', 'e', 'l', 'l', 'o']

with

>>> list(b"Hello")
[72, 101, 108, 108, 111]
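
To make the same point from the other direction, here is a small sketch (variable names are only illustrative) showing that indexing a bytes object yields an integer, while slicing yields another bytes object:

```python
data = b"Hello"

first = data[0]          # an int, not a one-character string
first_slice = data[0:1]  # slicing keeps the bytes type
rebuilt = bytes([72, 101, 108, 108, 111])  # bytes built back from the integers

print(first)        # 72
print(first_slice)  # b'H'
print(rebuilt)      # b'Hello'
print(chr(first))   # H
```

Note that chr() maps the integer to a character only by code point; for real text you would still decode with an explicit encoding.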

Calling str() on a bytes object without an encoding does not decode anything. The string form of a bytes object is the same as its __repr__, which returns the literal you would type in the REPL to create the object.
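
A quick way to check this, as a minimal sketch with illustrative variable names:

```python
data = b"Hello World"

as_str = str(data)    # no decoding happens here
as_repr = repr(data)  # the literal that would recreate the object

print(as_str)                  # b'Hello World'
print(as_str == as_repr)       # True
print(len(data), len(as_str))  # 11 14: the b and both quotes are now characters
```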

If you think about it, just converting bytes to a str makes little sense, as you need to know the encoding. You can use bytes.decode(encoding) to convert bytes to str properly.

b.decode("utf-8")

The encoding argument can also be omitted, in which case it defaults to 'utf-8'.
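
Why the encoding matters can be seen with any non-ASCII text. The sketch below uses a sample word chosen only for illustration: decoding with the wrong codec may raise no error at all and simply produce the wrong text.

```python
raw = "caf\u00e9".encode("utf-8")  # b'caf\xc3\xa9', the word "café" as UTF-8 bytes

utf8_text = raw.decode("utf-8")      # correct: 'café'
default_text = raw.decode()          # same result, since the default is utf-8
latin1_text = raw.decode("latin-1")  # wrong codec: each byte becomes its own character

print(utf8_text == default_text)  # True
print(len(utf8_text))             # 4
print(len(latin1_text))           # 5

# A stricter codec refuses the same bytes outright.
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    print("ascii decode failed:", exc.reason)
```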

str() usually turns an object into a string that represents it, and for a bytes object there is no better representation than its literal form, b'...'. You probably want decode() instead, passing the encoding that should be used to turn the bytes into text.

In Python 3.x, calling str(s) on a byte string creates a new string that reads b'Hello World', keeping the "b" prefix and the quotes. This is because the string form of a bytes object is the same as its __repr__, which returns the literal used to write the object (that is, the content preceded by "b"). For example:

>>> a = b'Hello World'
>>> str(a)
"b'Hello World'"

There are two ways to convert a bytes-like object to a string:

  1. Decode the byte string with its decode() method:

    >>> a.decode()
    'Hello World'
    
  2. Pass the encoding to str() as a second argument:

    >>> str(a, 'utf-8')
    'Hello World'
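
The two approaches above give the same result. As a small sketch (the malformed input is made up for illustration), both also accept an errors argument that controls what happens when the bytes are not valid in the chosen encoding:

```python
a = b"Hello World"

via_decode = a.decode("utf-8")
via_str = str(a, "utf-8")
print(via_decode == via_str)  # True

broken = b"Hello \xff World"  # 0xff can never appear in valid UTF-8

# With the default errors='strict', decoding raises.
try:
    broken.decode("utf-8")
except UnicodeDecodeError as exc:
    print("decode failed:", exc.reason)

# errors='replace' substitutes U+FFFD for each undecodable byte.
lenient = str(broken, "utf-8", errors="replace")
print(lenient == "Hello \ufffd World")  # True
```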
    