Question
Suppose I have a pandas DataFrame in the following format:
Date        Product  cost|us|2019  cost|us|2020  cost|us|2021  cost|de|2019  cost|de|2020  cost|de|2021
01/01/2020  prodA    10            12            14            12            13            15
How can I convert it into the following format?
Date        Product  Year  cost|us  cost|de
01/01/2020  prodA    2019  10       12
01/01/2020  prodA    2020  12       13
01/01/2020  prodA    2021  14       15
Answer 1:
Move the non-year columns (Date and Product) into a MultiIndex with DataFrame.set_index, split the remaining column names on the last | with str.rsplit, name the new column level with DataFrame.rename_axis, and reshape with DataFrame.stack:
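# Keep the identifier columns out of the reshape
df = df.set_index(['Date', 'Product'])
# Split each label on its last '|' into a (name, year) MultiIndex
df.columns = df.columns.str.rsplit('|', n=1, expand=True)
# Name the year level, move it into the rows, and flatten the index
df = df.rename_axis([None, 'Year'], axis=1).stack().reset_index()
print(df)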
         Date Product  Year  cost|de  cost|us
0  01/01/2020   prodA  2019       12       10
1  01/01/2020   prodA  2020       13       12
2  01/01/2020   prodA  2021       15       14
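For anyone who wants to try this end to end, here is a minimal self-contained sketch. The sample frame is rebuilt by hand from the question, and the names wide and long_df are only illustrative. The last step reorders the columns to match the layout asked for, since the order of the stacked columns may differ between pandas versions.

```python
import pandas as pd

# Sample frame rebuilt by hand from the question (one row, six cost columns).
df = pd.DataFrame({
    "Date": ["01/01/2020"],
    "Product": ["prodA"],
    "cost|us|2019": [10],
    "cost|us|2020": [12],
    "cost|us|2021": [14],
    "cost|de|2019": [12],
    "cost|de|2020": [13],
    "cost|de|2021": [15],
})

# Keep the identifier columns out of the reshape.
wide = df.set_index(["Date", "Product"])

# Split each label on its last "|" into a two-level (name, year) MultiIndex.
wide.columns = wide.columns.str.rsplit("|", n=1, expand=True)
wide = wide.rename_axis([None, "Year"], axis=1)

# Move the Year level from the columns into the rows, then flatten the index.
long_df = wide.stack().reset_index()

# Put the columns in the order requested in the question.
long_df = long_df[["Date", "Product", "Year", "cost|us", "cost|de"]]
print(long_df)
```

Note that Year holds strings here because it comes from the column labels; long_df["Year"].astype(int) should convert it if integers are needed.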
Source: https://stackoverflow.com/questions/65563274/pandas-dataframe-covert-wide-to-long-multiple-columns-with-name-from-column-name