Question
I want my dataframe to use the values in the first row of my spreadsheet as its column names, instead of the default numbering 0, 1, 2, and so on. How do I do this?
I tried using the pandas and openpyxl modules to turn my Excel spreadsheet into a dataframe.
import pandas as pd
from openpyxl import load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows
wb = load_workbook(filename='Budget1.xlsx')
print(wb.sheetnames)
sheet_ranges=wb['May 2019']
print(sheet_ranges['A3'].value)
ws=wb['May 2019']
df=pd.DataFrame(ws.values)
print(df) # This displays my dataframe.
I expect the column titles of my dataframe to be Date, Description, and Amount instead of 0, 1, 2.
Answer 1:
After reading the data into a dataframe with pandas, you can separate out the first row and use it as the column names:
columnNames = df.iloc[0]
df = df[1:]
df.columns = columnNames
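A minimal, self-contained sketch of the same idea, run on a small in-memory dataframe instead of the workbook. The sample rows are made up for illustration; the index reset and the numeric conversion are optional extras, not part of the answer above.

```python
import pandas as pd

# Hypothetical sample data standing in for the worksheet contents:
# the first row holds the titles, the rest are data rows.
raw = pd.DataFrame([
    ["Date", "Description", "Amount"],
    ["2019-05-01", "Rent", 1200],
    ["2019-05-02", "Groceries", 85.5],
])

header = raw.iloc[0]                       # first row, to be used as column names
df = raw.iloc[1:].reset_index(drop=True)   # remaining rows, index restarted at 0
df.columns = list(header)                  # plain list, so the columns carry no leftover name

# Columns built this way keep the object dtype, so convert numbers explicitly.
df["Amount"] = pd.to_numeric(df["Amount"])

print(df)
```

Passing `list(header)` rather than the row itself avoids the columns index inheriting the name `0` from the original row label.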
Or you can read the file directly with pandas, which uses the first row as the column names by default:
excelDF = pd.ExcelFile('Budget1.xlsx')
df1 = pd.read_excel(excelDF, 'SheetNameThatYouWantTORead')
print(df1.columns)
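As a rough illustration, the sheet can also be named in the `pd.read_excel` call itself, with `header=0` spelling out the default of taking the column names from the first row. The demo file written here is an assumption made so the sketch runs on its own; with the real workbook you would pass 'Budget1.xlsx' instead.

```python
import os
import tempfile

import pandas as pd
from openpyxl import Workbook

# Build a small demo workbook (hypothetical data) so the example is runnable.
wb = Workbook()
ws = wb.active
ws.title = "May 2019"
ws.append(["Date", "Description", "Amount"])
ws.append(["2019-05-01", "Rent", 1200])
ws.append(["2019-05-02", "Groceries", 85.5])

path = os.path.join(tempfile.mkdtemp(), "Budget1_demo.xlsx")
wb.save(path)

# header=0 (the default) tells pandas to use the first row as column names.
df_excel = pd.read_excel(path, sheet_name="May 2019", header=0)
print(df_excel.columns.tolist())
```

If the titles sit on a later row, `header` can be set to that row's zero-based position.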
Answer 2:
You can set the columns to the first row of your dataframe and then drop that row:
df.columns = df.iloc[0, :]
df.drop(df.index[0], inplace=True)
df
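If the worksheet is already loaded with openpyxl, as in the question, another option is to split off the header row before the dataframe is built, so nothing has to be dropped afterwards. This is only a sketch: the in-memory workbook and its rows are made up for illustration and stand in for `load_workbook(filename='Budget1.xlsx')`.

```python
import pandas as pd
from openpyxl import Workbook

# Hypothetical in-memory workbook standing in for Budget1.xlsx.
wb = Workbook()
ws = wb.active
ws.title = "May 2019"
ws.append(["Date", "Description", "Amount"])
ws.append(["2019-05-01", "Rent", 1200])
ws.append(["2019-05-02", "Groceries", 85.5])

rows = ws.values                 # generator yielding one tuple of cell values per row
header = next(rows)              # consume the first row as the column names
df_ws = pd.DataFrame(list(rows), columns=list(header))

print(df_ws)
```

Because the header never enters the data, the index starts at 0 and numeric columns are not mixed with title strings.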
Source: https://stackoverflow.com/questions/56981186/how-do-i-use-my-first-row-in-my-spreadsheet-for-my-dataframe-column-names-instea