What encoding do normal python strings use?

太阳男子 2020-12-03 08:08

I know that Django uses unicode strings all over the framework instead of normal Python strings. What encoding do normal Python strings use, and why don't they use Unicode?

6 answers
  •  [愿得一人]
    2020-12-03 08:37

    In Python 2, normal strings (str) don't have an encoding; they are raw data.

    In Python 3, these are called "bytes", which is an accurate description: they are simply sequences of bytes, which can be text encoded in any encoding (several are common!) or non-textual data altogether.
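To illustrate that bytes carry no encoding of their own, here is a minimal sketch (assuming Python 3; the sample bytes are made up for the example). The same byte sequence turns into different text depending on which encoding the reader chooses to assume:

```python
# The same five bytes, with nothing inside them saying how to read them.
raw = b"caf\xc3\xa9"

# Interpreted as UTF-8, the last two bytes form one character (U+00E9).
as_utf8 = raw.decode("utf-8")

# Interpreted as Latin-1, every byte is its own character.
as_latin1 = raw.decode("latin-1")

print(repr(as_utf8), len(as_utf8))
print(repr(as_latin1), len(as_latin1))
```

Both decodes succeed, which is why a wrong guess tends to show up as garbled text rather than as an error.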

    For representing text, you want unicode strings, not byte strings. By "unicode strings", I mean unicode instances in Python 2 and str instances in Python 3. Unicode strings are sequences of Unicode code points represented abstractly, without an encoding; this is well suited to representing text.
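A small sketch of what "sequences of code points without an encoding" means in practice, assuming Python 3 (where str is the unicode type). The text stays the same while its byte representation depends on the encoding picked at encode time:

```python
# A str is a sequence of code points; no byte layout is implied.
text = "caf\u00e9"

# Each element is one code point, whatever its eventual byte length.
code_points = [ord(ch) for ch in text]

# Bytes only appear once an encoding is chosen.
utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")

print(code_points)
print(len(text), len(utf8), len(utf16))
```

Decoding either byte string with the matching encoding gives back the original text.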

    Bytestrings are important because to represent data for transmission over a network, writing to a file, or whatever, you cannot use an abstract representation of unicode; you need a concrete representation as bytes. Bytestrings are also often used to store and represent text, but this is at least a little naughty.

    This whole situation is complicated by the fact that, while you should turn unicode into bytes by calling encode and turn bytes into unicode by calling decode, Python 2 will try to do this automagically for you using a global default encoding. That default is ASCII, which is the safest choice. Never depend on this in your code, and never ever change it to a more flexible encoding. Explicitly decode when you get a bytestring, and encode when you need to send a string somewhere external. (Python 3 dropped the implicit conversion: mixing str and bytes raises a TypeError.)
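The "decode on the way in, encode on the way out" advice could look roughly like this. It is only a sketch, assuming Python 3 and UTF-8 at the boundary; the helpers to_text and to_bytes are hypothetical names made up for the example, not part of Python or Django:

```python
# Hypothetical helpers; the names are illustrative, not from any library.
def to_text(data, encoding="utf-8"):
    """Decode incoming bytes to str; pass str through unchanged."""
    if isinstance(data, bytes):
        return data.decode(encoding)
    return data


def to_bytes(text, encoding="utf-8"):
    """Encode outgoing str to bytes; pass bytes through unchanged."""
    if isinstance(text, str):
        return text.encode(encoding)
    return text


incoming = b"caf\xc3\xa9"       # bytes as they might arrive from a socket or file
name = to_text(incoming)        # decode once, at the edge
greeting = "Hello, " + name     # internal work happens on str only
outgoing = to_bytes(greeting)   # encode once, on the way out

# Python 3 refuses to combine str and bytes implicitly.
try:
    "Hello, " + incoming
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(repr(outgoing), mixed_ok)
```

Naming the encoding in one place means a wrong assumption fails at the boundary instead of deep inside the program.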
