How to get consistent results when comparing the speed of numpy.save and h5py?

Submitted by 让人想犯罪 __ on 2019-12-11 01:03:58

Question


I'm trying to compare the speed of two tools for saving a 2 GB numpy array to a file on disk: numpy.save and h5py.create_dataset.

(Note: this is just a first test. The real case I have to deal with involves several thousand numpy arrays of between 1 and 2 MB each, i.e. several GB in total.)

Here is the code I use for the benchmark. The problem is that the results are really inconsistent:

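import time

import h5py
import numpy as np

def writemem():
    # 512 Mi int32 values = 2 GB; dtype is explicit because the default integer size varies by platform
    myarray = np.random.randint(100000, size=512*1024*1024, dtype=np.int32)
    t0 = time.time()
    with h5py.File('test.h5', 'w') as h5f:
        h5f.create_dataset('array2', data=myarray)
    print(time.time() - t0)   # elapsed time includes closing the file

def writemem2():
    myarray = np.random.randint(100000, size=512*1024*1024, dtype=np.int32)  # 2 GB
    t0 = time.time()
    with open('test.bin', 'wb') as f:   # np.save writes bytes, so open in binary mode
        np.save(f, myarray)
    print(time.time() - t0)   # elapsed time includes closing the file

writemem()                # 55s 38s 42s 38s
input()                   # pause until Enter is pressed
writemem2()               # 46s 17s 22s 15s 22s
input()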

How can I properly benchmark these tools for 2 GB of data?
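One possible way to make such timings more comparable is to repeat each write several times from the same starting state, ask the operating system to flush the file to disk before stopping the clock, and report the minimum and median rather than a single run. The sketch below is not from the original post. The names `time_write`, `summarize`, and `write_raw` are hypothetical helpers, and `write_raw` is only a small stand-in writer so the harness can be tried without numpy or h5py installed.

```python
import os
import statistics
import time


def time_write(write_fn, path, repeats=5):
    """Call write_fn(path) `repeats` times and return each duration in seconds."""
    durations = []
    for _ in range(repeats):
        # Start every run from the same state: no pre-existing output file.
        if os.path.exists(path):
            os.remove(path)
        t0 = time.perf_counter()
        write_fn(path)
        # Best-effort flush of OS-cached data to disk, so the timing is less
        # dependent on how much of the write is still sitting in the page cache.
        fd = os.open(path, os.O_RDWR)
        try:
            os.fsync(fd)
        finally:
            os.close(fd)
        durations.append(time.perf_counter() - t0)
    return durations


def summarize(durations):
    """Reduce a list of durations to min / median / max."""
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


def write_raw(path, payload=b"\0" * (4 * 1024 * 1024)):
    """Hypothetical stand-in writer: dumps 4 MiB of zero bytes to `path`."""
    with open(path, "wb") as f:
        f.write(payload)
```

To apply this to the two tools in question, `write_fn` could be something like `lambda p: np.save(p, myarray)` with a path ending in `.npy` (otherwise `np.save` appends that extension and the flush step would not find the file), or a small function that opens `h5py.File(p, 'w')` and calls `create_dataset`. Creating `myarray` once, outside the timed function, keeps array generation out of the measurement.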

Source: https://stackoverflow.com/questions/22363061/how-to-get-consistent-results-when-compare-speed-of-numpy-save-and-h5py
