Question
I am looking for an efficient way to drop duplicate columns in a MultiIndex DataFrame with pandas.
My data:
TypePoint     TIME     Test  ...      T1      T1
-                S    Unit1  ...    unit    unit
(POINT, -)                   ...
24001        90.00  100.000  ...  303.15  303.15
24002       390.00  101.000  ...  303.15  303.15
...            ...      ...  ...     ...     ...
24801        10000  102.000  ...  303.15  303.15
24802        10500  103.000  ...  303.15  303.15
The header contains two pieces of information: the variable's name and its unit. I would like to drop one of the two "T1" columns (a duplicate variable).
.drop_duplicates() doesn't work: I get the error "Buffer has wrong number of dimensions (expected 1, got 2)".
.drop(Data('T1','unit'),axis=1) doesn't work either: it drops both columns instead of just one of them.
Thanks for your help.
Answer 1:
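I think you can use a double transpose (.T): transpose the frame so the columns become rows, drop the duplicate rows, then transpose back: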
print(df)
   TypePoint   TIME   Test      T1
           -      S  Unit1    unit    unit
0      24001     90    100  303.15  303.15
1      24002    390    101  303.15  303.15
2      24801  10000    102  303.15  303.15
3      24802  10500    103  303.15  303.15
print(df.T.drop_duplicates().T)
   TypePoint   TIME   Test      T1
           -      S  Unit1    unit
0      24001     90    100  303.15
1      24002    390    101  303.15
2      24801  10000    102  303.15
3      24802  10500    103  303.15
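For anyone who wants to try this, here is a minimal, self-contained sketch. The frame is a hypothetical reconstruction of the sample data (the names `columns`, `by_values`, and `by_label` are mine, not from the question). It runs the double transpose from the answer and, for comparison, a label-based alternative using `Index.duplicated()`, which is not part of the original answer.

```python
import pandas as pd

# Hypothetical reconstruction of the sample data: a two-level column header
# (variable name, unit) in which the label ("T1", "unit") appears twice.
columns = pd.MultiIndex.from_tuples(
    [("TypePoint", "-"), ("TIME", "S"), ("Test", "Unit1"),
     ("T1", "unit"), ("T1", "unit")]
)
df = pd.DataFrame(
    [[24001, 90, 100, 303.15, 303.15],
     [24002, 390, 101, 303.15, 303.15],
     [24801, 10000, 102, 303.15, 303.15],
     [24802, 10500, 103, 303.15, 303.15]],
    columns=columns,
)

# Approach from the answer: transpose so that columns become rows,
# drop rows whose values are identical, then transpose back.
by_values = df.T.drop_duplicates().T

# Alternative (an assumption about what is wanted, not from the answer):
# keep only the first occurrence of each column label, ignoring the values.
by_label = df.loc[:, ~df.columns.duplicated()]

print(by_values)
print(by_label)
```

The two approaches are not equivalent. The double transpose compares column values, so it would also remove a differently named column that happens to hold the same data, and transposing can change the column dtypes. The label-based version compares only the header and leaves the dtypes alone.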
Source: https://stackoverflow.com/questions/35888189/drop-duplicate-in-multiindex-dataframe-in-pandas