Question
I have the following DataFrame, dfe:
id categ level cols value comment
1 A PG Apple 428 comment1
1 A CD Apple 175 comment1
1 C PG Apple 226 comment1
1 C AB Apple 884 comment1
1 C CD Apple 288 comment1
1 B PG Apple 712 comment1
1 B AB Apple 849 comment1
2 B CD Apple 376 comment1
2 C None Orange 591 comment1
2 B CD Orange 135 comment1
2 D None Orange 423 comment1
2 A AB Orange 1e13 comment1
2 D PG Orange 1e15 comment2
import pandas as pd

df2 = pd.DataFrame({'s2': {0: 1, 1: 2, 2: 3}, 'level': {0: 'PG', 1: 'AB', 2: 'CD'}})
df1 = pd.DataFrame({'sl': {0: 1, 1: 2, 2: 3, 3: 4}, 'set': {0: 'A', 1: 'C', 2: 'B', 3: 'D'}})
dfe = (dfe[['categ','level','cols','id','comment','value']]
.merge(df1.rename({'set' : 'categ'}, axis=1),how='left',on='categ')
.merge(df2, how='left', on='level'))
na = dfe['level'].isna()
dfs = {'no_null': dfe[~na], 'null': dfe[na]}
with pd.ExcelWriter('XYZ.xlsx') as writer:
    for p, r in dfs.items():
        # Non-null levels pivot on three column levels; null levels only on 'cols'.
        if p == 'no_null':
            c = ['cols', 's2', 'level']
        else:
            c = 'cols'
        df = r.pivot_table(index=['id','sl','comment','categ'], columns=c, values=['value'])
        df.columns = df.columns.droplevel([0,2])
        df = df.reset_index().drop(('sl',''), axis=1).set_index('categ')
        for (id, comment), sdf in df.groupby(['id','comment']):
            df = sdf.reset_index(level=[1], drop=True).dropna(how='all', axis=1)
            # Assumption: `name` is not defined in the post; this sheet name is a placeholder.
            name = f'{p}_{id}_{comment}'
            df.to_excel(writer, sheet_name=name)
Running this, I get the results displayed in Excel this way: [image not included]
I want to order the headings in a certain way. This is what I tried:
df = r.pivot_table(index=['id','sl','comment','categ'], columns=c, values='value')
df.columns = df.columns.droplevel([1])
df = df.reset_index().drop(('sl',''), axis=1).set_index('categ')
This gives me the error "Too many levels: Index has only 2 levels, not 3". I don't know what I'm missing or doing wrong here.
My expected output for the arrangement of headings is: [image not included]
I would also like to know whether the headings can be written to Excel in capitals, as shown in the expected output.
EDIT 1: I tried the answer and I'm getting this view: [image not included]
I want to display ID and COMMENT only once (as the data is already grouped by ID in the code logic), drop the sl column and the first column (0, 1, 2), and also delete the blank row above 0.
Answer 1:
Given dfe as:
categ level cols id comment value sl s2
0 A PG Apple 1 comment1 4.280000e+02 1 1.0
1 A CD Apple 1 comment1 1.750000e+02 1 3.0
2 C PG Apple 1 comment1 2.260000e+02 2 1.0
3 C AB Apple 1 comment1 8.840000e+02 2 2.0
4 C CD Apple 1 comment1 2.880000e+02 2 3.0
5 B PG Apple 1 comment1 7.120000e+02 3 1.0
6 B AB Apple 1 comment1 8.490000e+02 3 2.0
7 B CD Apple 2 comment1 3.760000e+02 3 3.0
8 C None Orange 2 comment1 5.910000e+02 2 NaN
9 B CD Orange 2 comment1 1.350000e+02 3 3.0
10 D None Orange 2 comment1 4.230000e+02 4 NaN
11 A AB Orange 2 comment1 1.000000e+13 1 2.0
12 D PG Orange 2 comment2 1.000000e+15 4 1.0
Then try:
df = dfe.pivot_table(index=['id','comment','categ'], columns=c, values='value')
df.columns = df.columns.droplevel([1])
df = (df.rename_axis(columns=[None, None])
.reset_index(col_level=1)
.rename(columns = lambda x: x.upper()))
df.to_excel('testa1.xlsx')
Output:
Notes:
- Removed the [] around 'value' in pivot_table so that 'value' is not included as a column index level.
- Aligned 'id', 'comment' and 'categ' with column index level 1 using the col_level parameter.
- See this post about the blank line: https://stackoverflow.com/a/52498899/6361531.
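To make the first two notes concrete, here is a minimal sketch on a small stand-in for dfe (the four rows and the names with_list, with_scalar, two_levels, out and header are illustrative assumptions, not the question's full data). It suggests why passing a list to values adds a column level, how asking droplevel for a position that does not exist produces the error quoted in the question, and where col_level=1 places the former index names.

```python
import pandas as pd

# Small stand-in for dfe (illustrative rows only, not the full data).
dfe = pd.DataFrame({
    'id': [1, 1, 2, 2],
    'comment': ['comment1'] * 4,
    'categ': ['A', 'B', 'A', 'B'],
    'cols': ['Apple', 'Apple', 'Orange', 'Orange'],
    's2': [1, 3, 1, 2],
    'level': ['PG', 'CD', 'PG', 'AB'],
    'value': [428.0, 175.0, 226.0, 884.0],
})
c = ['cols', 's2', 'level']
idx = ['id', 'comment', 'categ']

# A list for `values` adds an extra top column level holding 'value'.
with_list = dfe.pivot_table(index=idx, columns=c, values=['value'])
# A scalar for `values` leaves only the levels named in `columns`.
with_scalar = dfe.pivot_table(index=idx, columns=c, values='value')
print(with_list.columns.nlevels, with_scalar.columns.nlevels)  # 4 3

# After dropping 's2' two levels remain, so position 2 no longer exists.
two_levels = with_scalar.columns.droplevel([1])
try:
    two_levels.droplevel([0, 2])
except IndexError as exc:
    error_message = str(exc)

# col_level=1 puts the former index names on the lower header row.
out = (with_scalar.droplevel(1, axis=1)
       .rename_axis(columns=[None, None])
       .reset_index(col_level=1)
       .rename(columns=lambda x: x.upper()))
header = list(out.columns)
```

In this sketch header holds two-element tuples such as ('', 'ID') and ('APPLE', 'PG'), so the ID, COMMENT and CATEG labels share a header row with the level names.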
Answer 2:
I think it would be easier to drop the column names and then replace them with custom ones:
df.columns = df.columns.droplevel()
df.columns = pd.MultiIndex.from_tuples([("", "ID"), ("", "CATEG"), ("apple", "PG"), ("apple", "AB"), ("apple", "CD"), ("orange", "PG"), ("orange", "AB"), ("orange", "CD")])
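As a hedged sketch of the same idea on a tiny frame (the frame flat and its column names are made up for illustration): the replacement MultiIndex needs exactly one tuple per existing column, in the same order, so checking the length before assigning is a cheap guard.

```python
import pandas as pd

# Hypothetical flattened frame standing in for the pivoted result.
flat = pd.DataFrame(
    [[1, 'A', 428.0, 175.0], [1, 'B', 712.0, None]],
    columns=['id', 'categ', 'Apple_PG', 'Apple_CD'],
)

# One (top, bottom) tuple per existing column, in the same order.
new_columns = pd.MultiIndex.from_tuples(
    [('', 'ID'), ('', 'CATEG'), ('APPLE', 'PG'), ('APPLE', 'CD')]
)

# Guard: a length mismatch would otherwise raise on assignment.
assert len(new_columns) == flat.shape[1]
flat.columns = new_columns
```

Hard-coding the tuples is simple but brittle: if a group has no data for some level, the column count changes and the list has to be updated by hand.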
Source: https://stackoverflow.com/questions/64395699/how-to-drop-row-index-and-flatten-index-in-this-way