Dynamically skip top blank rows of an Excel sheet in Python pandas


I am reading multiple sheets of an Excel file using pandas in Python. I have three cases:

Some sheets have data starting from row 1


                      
1 answer

逝去的感伤
2020-12-18 18:26

I would propose the following algorithm:


1. Read the whole table
2. Consider the first row that contains no missing values as a header
3. Drop all the rows above the header


This code works okay for me:

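import pandas as pd

for sheet in range(3):
    # header=None keeps every row as data so the header row can be located manually
    raw_data = pd.read_excel('blank_rows.xlsx', sheet_name=sheet, header=None)
    print(raw_data)
    # looking for the header row: the first row with no missing values
    for i, row in raw_data.iterrows():
        if row.notnull().all():
            data = raw_data.iloc[(i + 1):].reset_index(drop=True)
            data.columns = list(raw_data.iloc[i])
            break
    else:
        # no complete row in this sheet, so there is nothing to clean
        continue
    # transforming columns to numeric where possible, leaving the others unchanged
    for c in data.columns:
        try:
            data[c] = pd.to_numeric(data[c])
        except (ValueError, TypeError):
            pass
    print(data)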
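
The header search in the loop above can also be done without iterating row by row. The following is only a sketch of that idea: the helper name `find_header_row` and the small in-memory frame are made up for illustration and are not part of the original answer. It relies on the same assumption as the loop, namely that the header is the first row with no missing values.

```python
import numpy as np
import pandas as pd

def find_header_row(raw):
    """Return the position of the first row with no missing values, or None."""
    # Boolean array: True where every cell in the row is populated
    complete = raw.notna().all(axis=1).to_numpy()
    if not complete.any():
        return None
    # argmax on a boolean array gives the position of the first True
    return int(complete.argmax())

# Hypothetical sample: one blank row above the header
raw = pd.DataFrame([
    [np.nan, np.nan, np.nan],
    ["Country", "Company", "Product"],
    ["US", "ABC", "XYZ"],
])
print(find_header_row(raw))  # prints 1
```

Returning `None` when no complete row exists lets the caller decide whether to skip the sheet or raise an error.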


It uses this toy data sample, based on your examples. From the raw dataframes

         0        1        2
0  Country  Company  Product
1       US      ABC      XYZ
2       US      ABD      XYY

         0        1        2
0      NaN      NaN      NaN
1      NaN      NaN      NaN
2      NaN      NaN      NaN
3  Country  Company  Product
4       US      ABC      XYZ
5       US      ABD      XYY

                                       0        1        2
0  Product summary table for East region      NaN      NaN
1                    Date: 1st Sep, 2016      NaN      NaN
2                                    NaN      NaN      NaN
3                                Country  Company  Product
4                                     US      ABC      XYZ
5                                     US      ABD      XYY


the script produces the same table for each of them

  Country Company Product
0      US     ABC     XYZ
1      US     ABD     XYY
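
To try this without an Excel file, the three raw dataframes shown above can be rebuilt in memory. This is a minimal sketch under that assumption: the function name `promote_header` is hypothetical, and it simply wraps the same row-by-row search used in the answer.

```python
import numpy as np
import pandas as pd

def promote_header(raw):
    """Use the first fully populated row as the header and drop everything above it."""
    for i, row in raw.iterrows():
        if row.notnull().all():
            data = raw.iloc[i + 1:].reset_index(drop=True)
            data.columns = list(row)
            return data
    raise ValueError("no row without missing values was found")

# The actual table, plus the blank and title rows that may precede it
table = [["Country", "Company", "Product"],
         ["US", "ABC", "XYZ"],
         ["US", "ABD", "XYY"]]
blank = [[np.nan] * 3]
title = [["Product summary table for East region", np.nan, np.nan],
         ["Date: 1st Sep, 2016", np.nan, np.nan]]

sheets = [
    pd.DataFrame(table),                  # data from row 1
    pd.DataFrame(blank * 3 + table),      # three blank rows first
    pd.DataFrame(title + blank + table),  # title, date and a blank row first
]
cleaned = [promote_header(raw) for raw in sheets]
for frame in cleaned:
    print(frame)
```

Note that the title rows are skipped only because they leave some cells empty; a title that fills every column would be mistaken for the header.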

                                                                        
                                                        
            
            
              
                