What endianness does Python use to write into files?

社会主义新天地 submitted on 2019-12-05 04:30:11

Question


When using file.write() with the 'wb' flag, does Python use big or little endian, or the sys.byteorder value? How can I be sure that the endianness is not random? I am asking because I am mixing ASCII and binary data in the same file. For the binary data I use struct.pack() and force it to little endian, but I am not sure what happens to the ASCII data.

Edit 1: since the downvote, I'll explain my question further.

I am writing a file with ASCII and binary data on an x86 PC. The file will be sent over the network to another computer which is not x86: a PowerPC, which is big-endian. How can I be sure that the data will be the same when parsed on the PowerPC?

Edit 2: still using Python 2.7


Answer 1:


For multibyte data, struct.pack() follows the machine's native byte order by default. If the file must work across platforms, force the byte order explicitly.

ASCII and UTF-8 are byte-oriented encodings: ASCII uses one byte per character, and UTF-8 uses a fixed sequence of one to four bytes per character. Are they affected by byte ordering? No.

Here is how to pack little < or big > endian:

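import struct

# '<' forces little-endian: least significant byte first
struct.pack('<L', 1234)
# -> '\xd2\x04\x00\x00'   (Python 3 shows a b'' prefix)

# '>' forces big-endian: most significant byte first
struct.pack('>L', 1234)
# -> '\x00\x00\x04\xd2'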
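For the mixed ASCII and binary file described in the question, a minimal sketch might look like the following. The helper names `write_record` and `read_record`, the file name `record.bin`, and the 4-character label `HDR1` are hypothetical, chosen only for illustration; the code is written for Python 3.

```python
import os
import struct
import tempfile

def write_record(path, label, value):
    """Write an ASCII label followed by a little-endian unsigned 32-bit value."""
    with open(path, 'wb') as f:
        f.write(label.encode('ascii'))     # ASCII: one byte per character, no byte order
        f.write(struct.pack('<L', value))  # '<' forces little-endian on every machine

def read_record(path, label_len):
    """Read the label and the value back using the same fixed layout."""
    with open(path, 'rb') as f:
        label = f.read(label_len).decode('ascii')
        (value,) = struct.unpack('<L', f.read(4))
    return label, value

# Hypothetical file location and contents, for illustration only.
path = os.path.join(tempfile.mkdtemp(), 'record.bin')
write_record(path, 'HDR1', 1234)

with open(path, 'rb') as f:
    raw = f.read()  # the exact bytes on disk
```

Because both sides name the byte order explicitly in the format string, the reader should recover the same label and value whether it runs on x86 or on PowerPC.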

You can also encode strings as big- or little-endian this way if you are using UTF-16. For example:

s.encode('utf-16LE')
s.encode('utf-16BE')

UTF-8 and ASCII have no endianness, since both are defined as sequences of single bytes rather than multibyte units.
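To make the difference concrete, here is a small sketch that encodes the same two-character string three ways. The sample string is an arbitrary choice for illustration.

```python
s = u'A\u20ac'  # 'A' followed by the euro sign (U+20AC)

utf16_le = s.encode('utf-16-le')  # each 16-bit unit stored low byte first
utf16_be = s.encode('utf-16-be')  # each 16-bit unit stored high byte first
utf8 = s.encode('utf-8')          # byte sequence fixed by the encoding itself
```

The two UTF-16 results contain the same bytes in swapped pairs, while the UTF-8 result is the same on every machine.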




Answer 2:


The machine's native byte order is reported by sys.byteorder, so you can check it:

import sys

if sys.byteorder == 'little':
    pass  # running on a little-endian machine
else:
    pass  # running on a big-endian machine
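As a rough illustration of where sys.byteorder actually matters, the sketch below compares struct's native-order format prefix `=` with the two explicit prefixes. Note that sys.byteorder only affects code that asks for native order; bytes passed to file.write() are written unchanged.

```python
import struct
import sys

native = struct.pack('=L', 1234)  # '=' means native byte order, standard size
little = struct.pack('<L', 1234)  # always little-endian
big = struct.pack('>L', 1234)     # always big-endian

# The native result equals whichever explicit form matches this machine.
if sys.byteorder == 'little':
    matches_native = little
else:
    matches_native = big
```

For a file that moves between machines, using `<` or `>` avoids any dependence on the host.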



Answer 3:


Note: I assume Python 3.

Endianness is not a concern when writing ASCII or byte strings. The order of the bytes is already set by the order in which those bytes occur in the ASCII/byte string. Endianness is a property of encodings that map some value (e.g. a 16-bit integer or a Unicode code point) to several bytes. By the time you have a byte string, the endianness has already been decided and applied (by the source of the byte string).

If you were to write unicode strings to a file not opened in b mode, the answer depends on how those strings are encoded (they are necessarily encoded, because the file system only accepts bytes). The encoding in turn depends on the file, and possibly on the locale or environment variables (e.g. for the default sys.stdout). When this causes problems, the problems extend beyond just endianness. However, your file is binary, so you can't write unicode directly anyway; you have to explicitly encode and decode. Do this with any fixed encoding and there won't be endianness issues, as an encoding's endianness is fixed and part of the definition of the encoding.
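A short Python 3 sketch of that point: the byte order is chosen when the integer is turned into bytes, and writing afterwards just copies those bytes in sequence. The `ID=` prefix is a made-up example, and `io.BytesIO` stands in for a file opened with 'wb'.

```python
import io

value = 1234

# Endianness is decided here, when the integer becomes a byte string.
little = value.to_bytes(4, byteorder='little')
big = value.to_bytes(4, byteorder='big')

payload = b'ID=' + little  # hypothetical ASCII prefix plus binary field

buf = io.BytesIO()         # stands in for open(path, 'wb')
buf.write(payload)         # bytes are written in the order they appear
written = buf.getvalue()
```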
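The explicit-encode step can be sketched as follows, assuming Python 3 as this answer does. Again `io.BytesIO` stands in for a binary file, and the sample text and the choice of UTF-8 are arbitrary.

```python
import io

buf = io.BytesIO()   # stands in for a file opened with 'wb'
text = u'caf\u00e9'  # arbitrary sample text with one non-ASCII character

try:
    buf.write(text)  # Python 3 binary streams reject str
    rejected = False
except TypeError:
    rejected = True

buf.write(text.encode('utf-8'))          # explicit, fixed encoding
decoded = buf.getvalue().decode('utf-8')  # reader uses the same encoding
```

As long as writer and reader agree on the named encoding, the result does not depend on either machine's byte order.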



Source: https://stackoverflow.com/questions/23831422/what-endianness-does-python-use-to-write-into-files
