The fastest way to read input in Python

孤街浪徒 2020-11-27 07:56

I want to read a huge text file that contains a list of lists of integers. Right now I'm doing the following:

G = []
with open("test.txt", 'r') as f:
    for li         


        
6 Answers
  •  陌清茗 (original poster)
    2020-11-27 08:25

    pandas, which is built on NumPy, has a C-based file parser that is very fast:

    In [23]: import numpy as np; import pandas as pd

    # generate some integer data (5 M rows, two cols) and write it to file
    In [24]: data = np.random.randint(1000, size=(5 * 10**6, 2))

    In [25]: np.savetxt('testfile.txt', data, delimiter=' ', fmt='%d')

    # your way (the separator is passed in so it matches the file's delimiter)
    In [26]: def your_way(filename, sep):
       ...:     G = []
       ...:     with open(filename, 'r') as f:
       ...:         for line in f:
       ...:             G.append(list(map(int, line.split(sep))))
       ...:     return G
       ...:

    In [27]: %timeit your_way('testfile.txt', ' ')
    1 loops, best of 3: 16.2 s per loop

    In [28]: %timeit pd.read_csv('testfile.txt', delimiter=' ', dtype=int)
    1 loops, best of 3: 1.57 s per loop
    

    So pandas.read_csv takes about one and a half seconds to read your data, which is about 10 times faster than your method.
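
If you need the result as a plain list of lists rather than a DataFrame, something along these lines might work. This is only a sketch: the helper name `read_int_rows` is hypothetical, and it assumes the file has no header row and that every row has the same number of columns (ragged rows would need a different approach). Passing `header=None` keeps pandas from treating the first data row as column names.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def read_int_rows(filename, sep=" "):
    """Read a delimited text file of integers into a list of lists.

    Hypothetical helper: assumes no header row and equal-length rows.
    """
    # header=None: the first line is data, not column names.
    frame = pd.read_csv(filename, sep=sep, header=None, dtype=int)
    # Convert the underlying NumPy array to nested Python lists.
    return frame.values.tolist()


# Small round-trip demo on a temporary file with made-up data.
data = np.random.randint(1000, size=(1000, 2))
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "testfile.txt")
    np.savetxt(path, data, delimiter=" ", fmt="%d")
    G = read_int_rows(path)

print(len(G), G[0])
```

The conversion to Python lists costs extra time on top of the parse itself, so if the rest of your code can work on the DataFrame or the NumPy array directly, skipping `tolist()` is likely to be faster.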
