Read an Excel sheet with multiple header rows using pandas

萌比男神i 2020-12-23 15:22

I have an Excel sheet with multiple header rows, like:

_________________________________________________________________________
____|_____|        Header1    |            


        
1 answer
  • 2020-12-23 15:40

    Pandas already has a function that reads an entire Excel spreadsheet for you, so you don't need to manually parse and merge each sheet. Take a look at pandas.read_excel(). It not only lets you read an Excel file in a single line, it also provides options that help solve the problem you're having.

    Since you have subcolumns, what you're looking for is a MultiIndex. By default, pandas reads the top row as the sole header row. You can pass a header argument to pandas.read_excel() that lists which rows to use as headers. In your particular case, you'd want header=[0, 1], meaning the first two rows. You might also have multiple sheets, so you can pass sheet_name=None as well (this tells it to read all sheets; older pandas releases spelled this parameter sheetname). The command would be:

    import pandas
    df_dict = pandas.read_excel('ExcelFile.xlsx', header=[0, 1], sheet_name=None)
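
    Once a sheet is loaded with header=[0, 1], its columns form a two-level MultiIndex. Here is a minimal sketch of how that can be worked with. It uses a hand-built frame in place of the real Excel file, and the names Header1, Header2, a and b are hypothetical stand-ins for your actual headers:

```python
import pandas as pd

# Hypothetical stand-in for one sheet read with header=[0, 1]:
# the two header rows become a two-level column MultiIndex.
columns = pd.MultiIndex.from_tuples(
    [("Header1", "a"), ("Header1", "b"), ("Header2", "a")]
)
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=columns)

# Selecting a top-level header returns all of its subcolumns.
header1 = df["Header1"]

# A tuple selects one specific subcolumn as a Series.
header1_b = df[("Header1", "b")]

# Optionally flatten to single-level names such as "Header1_a".
flat = df.copy()
flat.columns = ["_".join(col) for col in flat.columns]
```

    Flattening is optional. It tends to be convenient if later steps expect plain string column names.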
    

    This returns a dictionary where the keys are the sheet names, and the values are the DataFrames for each sheet. If you want to collapse it all into one DataFrame, you can simply use pandas.concat:

    df = pandas.concat(df_dict.values(), axis=0)
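
    Concatenating only the values discards the sheet names. If you want to keep track of which sheet each row came from, one option is to pass the dictionary itself to pandas.concat, which uses its keys as an outer index level. This is a sketch with a hand-built dictionary standing in for the one returned by read_excel. The sheet names, column names and the level names "sheet" and "row" are all hypothetical:

```python
import pandas as pd

# Hypothetical stand-in for the dict of DataFrames returned by read_excel.
columns = pd.MultiIndex.from_tuples([("Header1", "a"), ("Header1", "b")])
df_dict = {
    "Sheet1": pd.DataFrame([[1, 2]], columns=columns),
    "Sheet2": pd.DataFrame([[3, 4], [5, 6]], columns=columns),
}

# Passing the dict (not .values()) makes the sheet names the outer index level.
combined = pd.concat(df_dict, axis=0, names=["sheet", "row"])

# Rows belonging to a single sheet can be recovered by its name.
sheet2_rows = combined.loc["Sheet2"]
```

    This assumes every sheet shares the same header layout. Sheets with different columns would still concatenate, but with missing values filled in where columns do not line up.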
    