Numpy loading csv TOO slow compared to Matlab

無奈伤痛 2020-12-01 05:08

I posted this question because I was wondering whether I did something terribly wrong to get this result.

I have a medium-size csv file and I tried to use numpy to l

5 answers
  •  再見小時候
    2020-12-01 05:59

    If you just want to save and read a numpy array, it's much better to save it as a binary or compressed binary, depending on size:

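    import timeit
    import numpy as np

    # Write the same 1,500,000 x 3 array as CSV text, a .npy file, and a .npz archive
    my_data = np.random.rand(1500000, 3)*10
    np.savetxt('./test.csv', my_data, delimiter=',', fmt='%.2f')
    np.save('./testy', my_data)    # np.save appends the .npy extension
    np.savez('./testz', my_data)   # uncompressed archive; np.savez_compressed compresses
    del my_data
    
    setup_stmt = 'import numpy as np'
    stmt1 = """\
    my_data = np.genfromtxt('./test.csv', delimiter=',')
    """
    stmt2 = """\
    my_data = np.load('./testy.npy')
    """
    stmt3 = """\
    my_data = np.load('./testz.npz')['arr_0']
    """
    
    # timeit returns the total time in seconds for all 3 runs of each statement
    t1 = timeit.timeit(stmt=stmt1, setup=setup_stmt, number=3)
    t2 = timeit.timeit(stmt=stmt2, setup=setup_stmt, number=3)
    t3 = timeit.timeit(stmt=stmt3, setup=setup_stmt, number=3)
    
    print('genfromtxt', t1)
    print('save', t2)
    print('savez', t3)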
    
    genfromtxt 39.717250824
    save 0.0667860507965
    savez 0.268463134766
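
If the data has to arrive as CSV in the first place, one way to use the timings above is to parse the text once and keep a binary copy next to it. The following is only a minimal sketch under that assumption: `load_csv_cached` and the `.npy` cache naming are hypothetical and not part of numpy, and it assumes a purely numeric CSV without a header row.

```python
import os

import numpy as np


def load_csv_cached(csv_path, delimiter=','):
    """Hypothetical helper: parse the CSV once, then reuse a .npy copy."""
    cache_path = csv_path + '.npy'  # assumed naming scheme for the cache file

    # Reuse the binary cache only if it is at least as new as the CSV.
    if (os.path.exists(cache_path)
            and os.path.getmtime(cache_path) >= os.path.getmtime(csv_path)):
        return np.load(cache_path)

    # Slow path: parse the text file, then write the binary copy for next time.
    data = np.loadtxt(csv_path, delimiter=delimiter)
    np.save(cache_path, data)
    return data
```

The first call to `load_csv_cached('./test.csv')` still pays the full text-parsing cost; later calls should be closer to the `np.load` timing, as long as the CSV has not been modified since the cache was written.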
    
