What endianness does Python use to write into files?

孤街浪徒 提交于 2019-12-03 20:27:11

For multibyte data, It follows the architecture of the machine by default. If you need it to work cross-platform, then you'll want to force it.

ASCII and UTF-8 are encoded as a single byte per character, so is it affected by the byte ordering? No.

Here is how to pack little < or big > endian:

import struct

struct.pack('<L', 1234)
'\xd2\x04\x00\x00'

struct.pack('>L', 1234)
'\x00\x00\x04\xd2'

You can also encode strings as big or little endian this way if you are using UTF-16, as an example:

s.encode('utf-16LE')
s.encode('utf-16BE')

UTF-8, ASCII do not have endianness since it is 1 byte per character.

It uses sys.byteorder. So just:

import sys

if 'little' == sys.byteorder:
     # little
 else:
     # big

Note: I assume Python 3.

Endianness is not a concern when writing ASCII or byte strings. The order of the bytes is already set by the order in which those bytes occur in the ASCII/byte string. Endianness is a property of encodings that maps some value (e.g. a 16 bit integer or a Unicode code point) to several bytes. By the time you have a byte string, the endianness has already been decided and applied (by the source of the byte string).

If you were to write unicode strings to file not opened with b mode, the question depends on how those strings are encoded (they are necessarily encoded, because the file system only accept bytes). The encoding in turn depends on the file, and possibly on the locale or environment variables (e.g. for the default sys.stdout). When this causes problems, the problems extend beyond just endianness. However, your file is binary, so you can't write unicode directly anyway, you have to explicitly encode and decode. Do this with any fixed encoding and there won't be endianness issues, as an encoding's endianness is fixed and part of the definition of the encoding.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!