Question
Below is my dataframe:
In [2804]: df = pd.DataFrame({'A':[1,2,3,4,5,6], 'D':[{"value": '126', "perc": None, "unit": None}, {"value": 324, "perc": None, "unit": None}, {"value": 'N/A', "perc": None, "unit": None}, {}, {"value": '100', "perc": None, "unit": None}, np.nan]})
In [2794]: df.columns = pd.MultiIndex.from_product([df.columns, ['E']])
In [2807]: df
Out[2807]:
   A                                             D
   E                                             E
0  1  {'value': '126', 'perc': None, 'unit': None}
1  2    {'value': 324, 'perc': None, 'unit': None}
2  3  {'value': 'N/A', 'perc': None, 'unit': None}
3  4                                            {}
4  5  {'value': '100', 'perc': None, 'unit': None}
5  6                                           NaN
I need to sort the frame by the multi-level column (D, E) in descending order, based on the 'value' key of each dict. As you can see, the 'value' key can hold mixed datatypes such as int or string, and the cell itself can be an empty dict ({}) or NaN. 'N/A' and NaN values should always appear last after sorting (both ascending and descending).
Expected output:
In [2814]: df1 = pd.DataFrame({'A':[2,1,5,3,4,6], 'D':[{"value": 324, "perc": None, "unit": None}, {"value": '126', "perc": None, "unit": None}, {"value": '100', "perc": None, "unit": None}, {"value": 'N/A', "perc": None, "unit": None}, {},np.nan]})
In [2799]: df1.columns = pd.MultiIndex.from_product([df1.columns, ['E']])
In [2811]: df1
Out[2811]:
   A                                             D
   E                                             E
0  2    {'value': 324, 'perc': None, 'unit': None}
1  1  {'value': '126', 'perc': None, 'unit': None}
2  5  {'value': '100', 'perc': None, 'unit': None}
3  3  {'value': 'N/A', 'perc': None, 'unit': None}
4  4                                            {}
5  6                                           NaN
Answer 1:
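Create a helper column of numeric values and sort by it. pd.to_numeric with errors='coerce' turns 'N/A', empty dicts, and NaN into NaN, and sort_values places NaN last by default in both directions: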
df['tmp'] = pd.to_numeric(df[('D','E')].str.get('value'), errors='coerce')
df1 = df.sort_values('tmp', ascending=False).drop('tmp', axis=1)
print (df1)
   A                                             D
   E                                             E
1  2    {'value': 324, 'perc': None, 'unit': None}
0  1  {'value': '126', 'perc': None, 'unit': None}
4  5  {'value': '100', 'perc': None, 'unit': None}
2  3  {'value': 'N/A', 'perc': None, 'unit': None}
3  4                                            {}
5  6                                           NaN
df1 = df.sort_values('tmp').drop('tmp', axis=1)
print (df1)
   A                                             D
   E                                             E
4  5  {'value': '100', 'perc': None, 'unit': None}
0  1  {'value': '126', 'perc': None, 'unit': None}
1  2    {'value': 324, 'perc': None, 'unit': None}
2  3  {'value': 'N/A', 'perc': None, 'unit': None}
3  4                                            {}
5  6                                           NaN
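As a variation on the helper-column approach, the numeric sort key can be kept outside the frame, so no temporary column is ever added to the MultiIndex columns. The sketch below is not part of the original answer: sort_by_dict_value is a hypothetical helper name, and it assumes every non-missing cell in the column is a dict.

```python
import numpy as np
import pandas as pd


def sort_by_dict_value(df, col, key="value", ascending=False):
    """Hypothetical helper: sort rows by a numeric key stored inside dict cells."""
    # Pull `key` out of each dict; non-dict cells (such as NaN) yield None.
    raw = df[col].map(lambda d: d.get(key) if isinstance(d, dict) else None)
    # Coerce to numbers; 'N/A' and missing entries become NaN.
    numeric = pd.to_numeric(raw, errors="coerce")
    # na_position="last" keeps NaN rows at the bottom for either direction.
    order = numeric.sort_values(ascending=ascending, na_position="last").index
    return df.loc[order]


df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5, 6],
    "D": [
        {"value": "126", "perc": None, "unit": None},
        {"value": 324, "perc": None, "unit": None},
        {"value": "N/A", "perc": None, "unit": None},
        {},
        {"value": "100", "perc": None, "unit": None},
        np.nan,
    ],
})
df.columns = pd.MultiIndex.from_product([df.columns, ["E"]])

desc = sort_by_dict_value(df, ("D", "E"), ascending=False)
asc = sort_by_dict_value(df, ("D", "E"), ascending=True)
print(desc)
print(asc)
```

Because the key is a separate Series that shares the frame's index, df itself is left unmodified and there is nothing to drop afterwards.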
Source: https://stackoverflow.com/questions/64571500/pandas-sort-a-multiindex-dataframes-multi-level-column-with-mixed-datatypes