multi-index

easy multidimensional numpy ndarray to pandas dataframe method?

浪子不回头ぞ submitted on 2021-02-07 09:14:03
Question: I have a 4-D numpy.ndarray, e.g. myarr = np.random.rand(10,4,3,2), with dims={'time':1:10,'sub':1:4,'cond':['A','B','C'],'measure':['meas1','meas2']}, but possibly with higher dimensions. How can I create a pandas.DataFrame with a MultiIndex, just passing the dimensions as indexes, without further manual adjustments (reshaping the ndarray into a 2-D shape)? I can't wrap my head around the reshaping, not even really in 3 dimensions quite yet, so I'm searching for an 'automatic' method if possible. What
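A minimal sketch of one way to do this, assuming the dims in the question mean "time runs 1 to 10" and "sub runs 1 to 4" (the `1:10` notation is not valid Python, so `range` is substituted here). It flattens the array in C order and builds the index with `pd.MultiIndex.from_product`, which also varies the last level fastest. The column name `value` is a hypothetical choice.

```python
import numpy as np
import pandas as pd

myarr = np.random.rand(10, 4, 3, 2)

# Assumed reading of the question's dims: one label sequence per axis, in axis order.
dims = {
    "time": range(1, 11),
    "sub": range(1, 5),
    "cond": ["A", "B", "C"],
    "measure": ["meas1", "meas2"],
}

# from_product enumerates label combinations with the last level varying fastest,
# which matches the element order of a C-ordered ravel().
index = pd.MultiIndex.from_product(list(dims.values()), names=list(dims.keys()))
df = pd.DataFrame({"value": myarr.ravel()}, index=index)
```

Because the index is built from `dims` alone, the same two lines should work for any number of dimensions, as long as each label sequence has the same length as the corresponding array axis.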

Show first 10 rows of multi-index pandas dataframe

笑着哭i submitted on 2021-02-07 07:11:42
Question: I have a multilevel-index pandas DataFrame where the first level is year and the second level is username. I only have one column, which is already sorted in descending order. I want to show the first 2 rows of each index level 0.

What I have:

               count
year username
2010 b           677
     a           505
     c           400
     d           300
...
2014 a           100
     b            80

What I want:

               count
year username
2010 b           677
     a           505
2011 c           677
     d           505
2012 e           677
     f           505
2013 g           677
     i           505
2014 h           677
     j           505

Answer 1: Here is an answer. Maybe there is a better way to do
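One possible approach, which is not necessarily the one in the truncated answer above: group on the first index level and take `head(2)` from each group. The sample frame below is a small stand-in built from the rows visible in the question.

```python
import pandas as pd

# Stand-in data taken from the "What I have" rows shown in the question.
df = pd.DataFrame(
    {"count": [677, 505, 400, 300, 100, 80]},
    index=pd.MultiIndex.from_tuples(
        [(2010, "b"), (2010, "a"), (2010, "c"), (2010, "d"), (2014, "a"), (2014, "b")],
        names=["year", "username"],
    ),
)

# head(n) on a groupby keeps the first n rows of each group in their original order,
# so it relies on the column already being sorted within each year.
top2 = df.groupby(level="year").head(2)
```

If the rows were not already sorted, sorting by `count` within each year first would be needed before taking the head.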

Python Multiindex Dataframe remove maximum

牧云@^-^@ submitted on 2021-02-07 04:08:23
Question: I am struggling with a MultiIndex DataFrame in Python pandas. Suppose I have a df like this:

                count        day
group name
A     Anna         10     Monday
      Beatrice     15    Tuesday
B     Beatrice     15  Wednesday
      Cecilia      20   Thursday

What I need is to find the maximum in name for each group and remove it from the dataframe. The final df would look like:

                count        day
group name
A     Anna         10     Monday
B     Beatrice     15  Wednesday

Does any of you have any idea how to do this? I am running out of ideas... Thanks in advance! EDIT: What if the original
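A hedged sketch of one way to get the final frame shown above, assuming "the maximum" means the row with the largest `count` within each group (the question's wording is ambiguous on this point). It compares each row with its group maximum and keeps the rows that differ.

```python
import pandas as pd

# The example frame from the question.
df = pd.DataFrame(
    {
        "count": [10, 15, 15, 20],
        "day": ["Monday", "Tuesday", "Wednesday", "Thursday"],
    },
    index=pd.MultiIndex.from_tuples(
        [("A", "Anna"), ("A", "Beatrice"), ("B", "Beatrice"), ("B", "Cecilia")],
        names=["group", "name"],
    ),
)

# transform("max") broadcasts each group's maximum back onto its rows,
# so the comparison yields a row-aligned boolean mask.
group_max = df.groupby(level="group")["count"].transform("max")
result = df[df["count"] != group_max]
```

Note that this removes every row tied for the maximum in a group; if only one row per group should go, a different selection rule would be needed.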

Blank line below headers created when using MultiIndex and to_excel in Python

家住魔仙堡 submitted on 2021-02-07 03:27:53
Question: I am trying to save a pandas DataFrame to an Excel file using the to_excel function with XlsxWriter. When I print the DataFrame to the terminal it reads as it should, but when I save it to Excel and open the file, there is an extra blank line below the headers which shouldn't be there. This only happens when using a MultiIndex for the headers, but I need the layered headers that it offers and I can't find a solution. Below is code from an online MultiIndex example which produces the same

Python pandas idxmax for multiple indexes in a dataframe

对着背影说爱祢 submitted on 2021-02-06 12:45:45
Question: I have a series that looks like this:

            delivery
2007-04-26  706           23
2007-04-27  705           10
            706         1089
            708           83
            710           13
            712           51
            802            4
            806            1
            812            3
2007-04-29  706           39
            708            4
            712            1
2007-04-30  705            3
            706         1016
            707            2
...
2014-11-04  1412          53
            1501           1
            1502           1
            1512           1
2014-11-05  1411          47
            1412        1334
            1501          40
            1502         433
            1504         126
            1506         100
            1508           7
            1510           6
            1512          51
            1604           1
            1612           5
Length: 26255, dtype: int64

where the query is: df.groupby([df.index.date, 'delivery']).size() For each day, I need to pull out the delivery number which has
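The question is cut off, so the following is only a sketch under an assumption suggested by the title: that the goal is, for each day, the delivery number with the largest count. The names `counts`, `date`, `n`, and `winners` are hypothetical, and the dates are plain strings standing in for the `df.index.date` values.

```python
import pandas as pd

# A few rows copied from the series shown in the question.
counts = pd.Series(
    [23, 10, 1089, 83, 47, 1334, 40],
    index=pd.MultiIndex.from_tuples(
        [
            ("2007-04-26", 706),
            ("2007-04-27", 705),
            ("2007-04-27", 706),
            ("2007-04-27", 708),
            ("2014-11-05", 1411),
            ("2014-11-05", 1412),
            ("2014-11-05", 1501),
        ],
        names=["date", "delivery"],
    ),
)

# Move both index levels into columns so idxmax returns simple row positions.
flat = counts.reset_index(name="n")

# For each date, idxmax gives the row label of the largest count;
# .loc then picks those rows and keeps the date/delivery pair.
winners = flat.loc[flat.groupby("date")["n"].idxmax(), ["date", "delivery"]]
```

Flattening first sidesteps working with tuple labels; calling `idxmax` directly on the grouped MultiIndex series would return (date, delivery) tuples instead.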
