How to output list of floats to a binary file in Python

Anonymous (unverified), submitted 2019-12-03 01:58:03

Question:

I have a list of floating-point values in Python:

floats = [3.14, 2.7, 0.0, -1.0, 1.1] 

I would like to write these values out to a binary file using IEEE 32-bit encoding. What is the best way to do this in Python? My list actually contains about 200 MB of data, so something "not too slow" would be best.

Since there are 5 values, I just want a 20-byte file as output.

Answer 1:

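Alex is absolutely right, it's more efficient to do it this way. Note the 'f' typecode, which stores 32-bit floats; 'd' would store 64-bit doubles and produce a 40-byte file for these five values:

from array import array
output_file = open('file', 'wb')
float_array = array('f', [3.14, 2.7, 0.0, -1.0, 1.1])
float_array.tofile(output_file)
output_file.close()

And then read the array back like this (the file must be opened in binary mode; on Python 2, use fromstring instead of frombytes):

input_file = open('file', 'rb')
float_array = array('f')
float_array.frombytes(input_file.read())
input_file.close()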

array.array objects also have a .fromfile method, which can be used for reading the file if you know the count of items in advance (e.g. from the file size, or some other mechanism).
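A minimal sketch of the .fromfile route, assuming the file holds nothing but 32-bit floats so that the count can be derived from the file size. The scratch path is made up for the illustration:

```python
import os
import tempfile
from array import array

floats = [3.14, 2.7, 0.0, -1.0, 1.1]

# Hypothetical scratch path, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'floats.bin')

with open(path, 'wb') as output_file:
    array('f', floats).tofile(output_file)

# Derive the item count from the file size: every item takes itemsize bytes.
float_array = array('f')
count = os.path.getsize(path) // float_array.itemsize

with open(path, 'rb') as input_file:
    float_array.fromfile(input_file, count)

print(count, float_array.tolist())
```

The values read back are rounded to 32-bit precision, so 3.14 will not compare exactly equal to the original Python float.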



Answer 2:

See: Python's struct module

import struct
s = struct.pack('f'*len(floats), *floats)
f = open('file','wb')
f.write(s)
f.close()
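The snippet above packs in the machine's native byte order. If the file has to be read on other machines, a byte-order prefix may help. Here is a small sketch, assuming little-endian output is wanted (the question does not say), that also reads the values back:

```python
import struct

floats = [3.14, 2.7, 0.0, -1.0, 1.1]

# '<' selects little-endian with standard sizes, so 'f' is always 4 bytes.
fmt = '<%df' % len(floats)
packed = struct.pack(fmt, *floats)
print(len(packed))  # 20

# unpack returns a tuple; values come back rounded to 32-bit precision.
restored = struct.unpack(fmt, packed)
print(restored)
```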


Answer 3:

The array module in the standard library may be more suitable for this task than the struct module which everybody is suggesting. Performance with 200 MB of data should be substantially better with array.

If you'd like to take a look at a variety of options, try profiling them on your own system.
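One possible way to do such profiling, sketched with timeit. The data size and the function names are assumptions made for the illustration, and the numbers will vary by machine and Python version:

```python
import struct
import tempfile
import timeit
from array import array

# Hypothetical test data: 200,000 floats (800,000 bytes at 32 bits each).
floats = [float(i) for i in range(200000)]

def array_approach():
    # Convert the list to a typed array and dump it in one call.
    with tempfile.TemporaryFile() as f:
        array('f', floats).tofile(f)
        return f.tell()

def struct_approach():
    # Pack every value into one bytes object, then write it out.
    with tempfile.TemporaryFile() as f:
        f.write(struct.pack('%df' % len(floats), *floats))
        return f.tell()

for func in (array_approach, struct_approach):
    seconds = timeit.timeit(func, number=5)
    print('%.3f seconds  %s' % (seconds, func.__name__))
```

Each function returns the number of bytes written, which makes it easy to check that both approaches produce the same file size before comparing timings.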



Answer 4:

I'm not sure how NumPy will compare performance-wise for your application, but it may be worth investigating.

Using NumPy:

from numpy import array
a = array(floats,'float32')
output_file = open('file', 'wb')
a.tofile(output_file)
output_file.close()

results in a 20 byte file as well.



Answer 5:

Have a look at struct.pack_into.
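A rough sketch of how struct.pack_into might be used here, assuming a preallocated bytearray as the target buffer. It fills the buffer in place rather than building a new bytes object for each value:

```python
import struct

floats = [3.14, 2.7, 0.0, -1.0, 1.1]

item = struct.Struct('f')                 # one native 32-bit float
buf = bytearray(item.size * len(floats))  # preallocated, zero-filled

# Write each value at its own offset inside the existing buffer.
for i, value in enumerate(floats):
    item.pack_into(buf, i * item.size, value)

print(len(buf))  # 20
```

The buffer can then be passed straight to a binary file's write method.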



Answer 6:

struct.pack() looks like what you need.

http://docs.python.org/library/struct.html



Answer 7:

I ran into a similar issue while inadvertently writing a 100+ GB CSV file. The answers here were extremely helpful but, to get to the bottom of it, I profiled all of the solutions mentioned and then some. All profiling runs were done on a 2014 MacBook Pro with an SSD using Python 2.7. From what I'm seeing, the struct approach is the fastest:

6.465 seconds  print_approach   print list of floats
4.621 seconds  csv_approach     write csv file
4.819 seconds  csvgz_approach   compress csv output using gzip
0.374 seconds  array_approach   array.array.tofile
0.238 seconds  numpy_approach   numpy.array.tofile
0.178 seconds  struct_approach  struct.pack method

