The fastest way to read input in Python


孤街浪徒 2020-11-27 07:56

I want to read a huge text file that contains a list of lists of integers. Right now I'm doing the following:

G = []
with open("test.txt", 'r') as f:
    for li


      
      
        
6 answers
                   陌清茗
                                             
                
                
                2020-11-27 08:25
              

            
            
                        
pandas, which is built on NumPy, has a C-based file parser that is very fast:

In [22]: import numpy as np

In [23]: import pandas as pd

# generate some integer data (5 M rows, two cols) and write it to file
In [24]: data = np.random.randint(1000, size=(5 * 10**6, 2))

In [25]: np.savetxt('testfile.txt', data, delimiter=' ', fmt='%d')

# your way, with the separator passed in so it matches the file's delimiter
In [26]: def your_way(filename, sep):
   ...:     G = []
   ...:     with open(filename, 'r') as f:
   ...:         for line in f:
   ...:             G.append(list(map(int, line.split(sep))))
   ...:     return G
   ...:

In [27]: %timeit your_way('testfile.txt', ' ')
1 loops, best of 3: 16.2 s per loop

In [28]: %timeit pd.read_csv('testfile.txt', delimiter=' ', dtype=int)
1 loops, best of 3: 1.57 s per loop


So pandas.read_csv takes about one and a half seconds to read your data, which makes it about 10 times faster than your method.
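If you need the result as a plain list of lists rather than a DataFrame, something like the sketch below might work. It assumes every row has the same number of integers, and the helper name read_int_rows and the small demo file are hypothetical, not part of the benchmark above. Passing header=None keeps pandas from treating the first line of the file as column names.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def read_int_rows(filename, sep=" "):
    """Read a delimited text file of integers into a list of lists.

    Hypothetical helper: assumes a rectangular file (same column count
    on every row) with no header line.
    """
    # header=None keeps the first line as data instead of column names.
    frame = pd.read_csv(filename, sep=sep, header=None, dtype=int)
    # to_numpy().tolist() converts the parsed block to nested Python lists.
    return frame.to_numpy().tolist()


# Small demonstration file (made-up data, not the 5 M row benchmark).
data = np.arange(12).reshape(6, 2)
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
np.savetxt(path, data, delimiter=" ", fmt="%d")

rows = read_int_rows(path)
print(rows[:2])
```

Converting back to Python lists costs extra time and memory on a large file, so if the downstream code can work with the DataFrame or the NumPy array directly, it is probably better to skip the final tolist() step.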
    
             
                                                        
            
            
              
                


            
                         
                    


               
            