How do I efficiently parse a CSV file in Perl?

独厮守ぢ 2020-11-27 19:19

I'm working on a project that involves parsing a large CSV-formatted file in Perl, and am looking to make things more efficient.

My approach has been to split(…

6 Answers
  •  遥遥无期
    2020-11-27 19:53

    You can do it in one pass if you read the file line by line. There is no need to read the whole thing into memory at once.

    # (no error handling on the parse itself!)
    open my $fh, '<', $filename or die "Can't open $filename: $!";
    while (my $line = <$fh>) {
        chomp $line;
        my @csv = split /,/, $line;

        # now parse the csv however you want.
    }
    close $fh;
    

    Not really sure if this is significantly more efficient, though; Perl is pretty fast at string processing.
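    One caveat with a bare split /,/ is that it breaks on quoted fields that contain commas. For real-world CSV, the usual approach is a dedicated parser such as the CPAN module Text::CSV (which uses the fast Text::CSV_XS backend automatically when installed). A minimal sketch, writing its own sample file so it is self-contained:

```perl
use strict;
use warnings;
use Text::CSV;   # CPAN module, not core Perl; Text::CSV_XS used if installed

# Write a tiny sample file so the example is self-contained.
open my $out, '>', 'sample.csv' or die "sample.csv: $!";
print $out qq{id,name\n1,"Doe, Jane"\n};
close $out;

my $csv = Text::CSV->new({ binary => 1, auto_diag => 1 });

open my $fh, '<', 'sample.csv' or die "sample.csv: $!";
my @rows;
while (my $row = $csv->getline($fh)) {
    push @rows, $row;    # one arrayref per record
}
close $fh;

# The quoted "Doe, Jane" stays a single field, unlike a bare split /,/.
print scalar(@{$rows[1]}), " fields in row 2\n";   # 2 fields in row 2
```

    Like the loop above, getline() still reads one record at a time, so memory use stays flat even for large files.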

    YOU NEED TO BENCHMARK YOUR IMPORT to see what is causing the slowdown. If, for example, you are doing a database insertion that takes 85% of the time, this optimization won't help.
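    The core Benchmark module makes this kind of measurement easy. A sketch comparing the split against a no-op baseline (the sub names and iteration count are arbitrary choices for illustration):

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $line = join ',', 1 .. 20;

# Run each sub a fixed number of times and print a rate-comparison
# table; swap in your own parse and insert steps to see which dominates.
cmpthese(200_000, {
    split_line => sub { my @f = split /,/, $line },
    baseline   => sub { 1 },
});
```

    If the parsing sub is not the slowest entry in the table, optimizing the CSV handling won't move the needle.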

    Edit

    Although this feels like code golf, the general algorithm is to read the whole file, or part of the file, into a buffer.

    Iterate byte by byte through the buffer until you find a CSV delimiter or a newline.

    • When you find a delimiter, increment your column count.
    • When you find a newline, increment your row count.
    • If you hit the end of your buffer, read more data from the file and repeat.
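    The steps above can be sketched as follows (this naive scan ignores quoted fields, and the buffer size and file name are arbitrary; the example writes its own sample data so it is self-contained):

```perl
use strict;
use warnings;

# Sample data for the scan below (no quoted fields).
open my $out, '>', 'sample.csv' or die "sample.csv: $!";
print $out "a,b,c\nd,e,f\n";
close $out;

my ($rows, $delims) = (0, 0);
open my $fh, '<', 'sample.csv' or die "sample.csv: $!";
while ((my $n = read $fh, my $buf, 64 * 1024)) {
    # Iterate byte by byte through the buffer; when the buffer is
    # exhausted, the outer read() fetches the next chunk.
    for my $i (0 .. $n - 1) {
        my $byte = substr $buf, $i, 1;
        if    ($byte eq ',')  { $delims++ }
        elsif ($byte eq "\n") { $rows++ }
    }
}
close $fh;
print "$rows rows, $delims delimiters\n";   # 2 rows, 4 delimiters
```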

    That's it. But reading a large file into memory is really not the best way; see my original answer for the normal way this is done.
