Question
I'm trying to compare the speed of two tools for saving a 2 GB NumPy array to a file on disk: numpy.save and h5py.create_dataset.
(Note: this is only a first test. The real case I have to deal with is several thousand NumPy arrays of between 1 and 2 MB each, i.e. several GB in total.)
Here is the code I use for the benchmark. The problem is that the results are really inconsistent:
import numpy as np
import h5py
import time

def writemem():
    myarray = np.random.randint(100000, size=512*1024*1024)  # 2 GB
    t0 = time.time()
    h5f = h5py.File('test.h5', 'w')
    h5f.create_dataset('array2', data=myarray)
    h5f.close()
    print(time.time() - t0)

def writemem2():
    myarray = np.random.randint(100000, size=512*1024*1024)  # 2 GB
    t0 = time.time()
    f = open('test.bin', 'wb')  # binary mode, since np.save writes bytes
    np.save(f, myarray)
    f.close()
    print(time.time() - t0)

writemem()   # 55s 38s 42s 38s
raw_input()  # pause between runs (Python 2; use input() on Python 3)
writemem2()  # 46s 17s 22s 15s 22s
raw_input()
How can I benchmark these tools properly for 2 GB of data?
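For reference, here is a minimal sketch of a more repeatable harness. It rests on an assumption the measurements above do not confirm: that much of the run-to-run variance comes from the OS write cache, so each timed run ends with an `os.fsync`. The helper names `write_npy`, `write_h5` and `bench` are hypothetical, and the array is kept small here so the sketch runs quickly; the size would need to be raised for a 2 GB test.

```python
import os
import tempfile
import time

import numpy as np

try:
    import h5py  # optional: the HDF5 writer is skipped if h5py is missing
except ImportError:
    h5py = None


def write_npy(path, arr):
    # np.save accepts an open binary file object.
    with open(path, "wb") as f:
        np.save(f, arr)


def write_h5(path, arr):
    # Same calls as in the question, using a context manager to close the file.
    with h5py.File(path, "w") as h5f:
        h5f.create_dataset("array2", data=arr)


def bench(writer, arr, repeats=5, directory=None):
    """Time writer(path, arr) several times and return the list of durations."""
    times = []
    for _ in range(repeats):
        fd, path = tempfile.mkstemp(dir=directory)
        os.close(fd)
        try:
            t0 = time.perf_counter()
            writer(path, arr)
            # Assumption: ask the OS to push the data to disk inside the timed
            # region, so a run is not "fast" just because it stayed in cache.
            with open(path, "rb+") as f:
                os.fsync(f.fileno())
            times.append(time.perf_counter() - t0)
        finally:
            os.remove(path)
    return times


# Build the array once, outside the timed region, and reuse it for both tools.
data = np.random.randint(100000, size=1024 * 1024)  # small demo size

writers = [("numpy.save", write_npy)]
if h5py is not None:
    writers.append(("h5py.create_dataset", write_h5))

for name, writer in writers:
    t = bench(writer, data, repeats=5)
    print(name, "min: %.4fs" % min(t), "median: %.4fs" % float(np.median(t)))
```

Reporting the minimum and median of several runs, rather than a single timing, makes it easier to see whether the spread comes from the tool or from the system. Whether `fsync` reflects the cost that matters depends on the use case.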
Source: https://stackoverflow.com/questions/22363061/how-to-get-consistent-results-when-compare-speed-of-numpy-save-and-h5py