multi-index | 易学教程

Giving a column multiple indexes/headers

Question: I am working with pandas dataframes that are essentially time series like this:

                level
    Date
    1976-01-01  409.67
    1976-02-01  409.58
    1976-03-01  409.66
    …

What I want is multiple indexes/headers for the level column, like so:

    Station1                   #Name of the datasource
    43.1977317,-4.6473648,5    #Lat/Lon of the source
    Precip                     #Type of data
    Date
    1976-01-01  409.67
    1976-02-01  409.58
    1976-03-01  409.66
    …

So essentially I am searching for something like Mydata.columns.level1 = ['Station1'] , Mydata.columns
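One possible way to get stacked headers like the ones described is to replace the column index with a `pd.MultiIndex`. The sketch below is only an illustration: the frame name `mydata` and the level names `station`, `location` and `kind` are assumptions, not part of the original question.

```python
import pandas as pd

# Rebuild the small time series from the question.
mydata = pd.DataFrame(
    {"level": [409.67, 409.58, 409.66]},
    index=pd.to_datetime(["1976-01-01", "1976-02-01", "1976-03-01"]).rename("Date"),
)

# Replace the single 'level' header with a three-level header.
# The level names are hypothetical labels chosen for this sketch.
mydata.columns = pd.MultiIndex.from_tuples(
    [("Station1", "43.1977317,-4.6473648,5", "Precip")],
    names=["station", "location", "kind"],
)

print(mydata)
```

Each tuple describes one column, so a frame with several stations would pass one tuple per column.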

Pandas : Proper way to set values based on condition for subset of multiindex dataframe

Question: I'm not sure how to do this without chained assignment (which probably wouldn't work anyway, because I'd be setting values on a copy). I want to take a subset of a MultiIndex pandas dataframe, test for values less than zero, and set them to zero. For example:

    df = pd.DataFrame({('A','a'): [-1,-1,0,10,12],
                       ('A','b'): [0,1,2,3,-1],
                       ('B','a'): [-20,-10,0,10,20],
                       ('B','b'): [-200,-100,0,100,200]})
    df[df['A']<0] = 0.0

gives

    In [37]: df
    Out[37]:
        A       B
        a  b    a    b
    0  -1  0  -20 -200
    1  -1  1  -10 -100
    2   0  2    0    0
    3  10
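A minimal sketch of one way to zero out the negatives under the top-level column 'A' without chained assignment: build a boolean mask with the same shape as the whole frame, switch it off outside 'A', and apply it with `DataFrame.mask`. This is an illustrative approach, not necessarily the answer given in the original thread; the names `mask` and `outside_a` are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({
    ("A", "a"): [-1, -1, 0, 10, 12],
    ("A", "b"): [0, 1, 2, 3, -1],
    ("B", "a"): [-20, -10, 0, 10, 20],
    ("B", "b"): [-200, -100, 0, 100, 200],
})

# True wherever a value is negative, across the whole frame.
mask = df < 0

# Switch the mask off for every column whose top level is not 'A'.
outside_a = df.columns.get_level_values(0) != "A"
mask.loc[:, outside_a] = False

# Replace the masked cells with 0; everything else is left untouched.
df = df.mask(mask, 0)

print(df)
```

Because the mask is aligned to the full frame, no intermediate copy of the 'A' sub-frame is ever assigned to.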

Nested dictionary to multiindex dataframe where dictionary keys are column labels

Say I have a dictionary that looks like this:

    dictionary = {'A' : {'a': [1,2,3,4,5], 'b': [6,7,8,9,1]},
                  'B' : {'a': [2,3,4,5,6], 'b': [7,8,9,1,2]}}

and I want a dataframe that looks something like this:

       A     B
       a  b  a  b
    0  1  6  2  7
    1  2  7  3  8
    2  3  8  4  9
    3  4  9  5  1
    4  5  1  6  2

Is there a convenient way to do this? If I try:

    In [99]: DataFrame(dictionary)
    Out[99]:
                     A                B
    a  [1, 2, 3, 4, 5]  [2, 3, 4, 5, 6]
    b  [6, 7, 8, 9, 1]  [7, 8, 9, 1, 2]

I get a dataframe where each element is a list. What I need is a multiindex where each level corresponds to the keys in the nested dict and the rows corresponding to each
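One common way to get there, sketched here under the assumption that the nesting is exactly two levels deep, is to flatten the nested dictionary into a dictionary keyed by `(outer, inner)` tuples. pandas turns tuple keys into MultiIndex column labels.

```python
import pandas as pd

dictionary = {
    "A": {"a": [1, 2, 3, 4, 5], "b": [6, 7, 8, 9, 1]},
    "B": {"a": [2, 3, 4, 5, 6], "b": [7, 8, 9, 1, 2]},
}

# Flatten {outer: {inner: values}} into {(outer, inner): values}.
flat = {
    (outer, inner): values
    for outer, inner_dict in dictionary.items()
    for inner, values in inner_dict.items()
}

df = pd.DataFrame(flat)
print(df)
```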

Selecting rows from a Pandas dataframe with a compound (hierarchical) index

Question: I suspect this is trivial, but I have yet to discover the incantation that will let me select rows from a Pandas dataframe based on the values of a hierarchical key. So, for example, imagine we have the following dataframe:

    import pandas
    df = pandas.DataFrame({'group1': ['a','a','a','b','b','b'],
                           'group2': ['c','c','d','d','d','e'],
                           'value1': [1.1,2,3,4,5,6],
                           'value2': [7.1,8,9,10,11,12] })
    df = df.set_index(['group1', 'group2'])

df looks as we would expect:

If df were not indexed on
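For the dataframe in this question, a few standard selection idioms can be sketched as follows. The variable names `group_a`, `a_and_c` and `only_d` are illustrative; `.loc` selects by leading index levels, while `.xs` can target an inner level by name.

```python
import pandas as pd

df = pd.DataFrame({
    "group1": ["a", "a", "a", "b", "b", "b"],
    "group2": ["c", "c", "d", "d", "d", "e"],
    "value1": [1.1, 2, 3, 4, 5, 6],
    "value2": [7.1, 8, 9, 10, 11, 12],
}).set_index(["group1", "group2"])

# All rows whose first level (group1) is 'a'.
group_a = df.loc["a"]

# Rows matching both levels; the trailing ':' makes it explicit
# that the tuple is a row key, not a (row, column) pair.
a_and_c = df.loc[("a", "c"), :]

# Rows whose second level (group2) is 'd', regardless of group1.
only_d = df.xs("d", level="group2")

print(group_a, a_and_c, only_d, sep="\n\n")
```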

Pandas pivot table for multiple columns at once

Question: Let's say I have a DataFrame:

       nj  ptype  wd  wpt
    0   2      1   2    1
    1   3      2   1    2
    2   1      1   3    1
    3   2      2   3    3
    4   3      1   2    2

I would like to aggregate this data using ptype as the index like so:

           nj            wd            wpt
          1.0 2.0 3.0   1.0 2.0 3.0   1.0 2.0 3.0
    ptype
    1       1   1   1     0   2   1     2   1   0
    2       0   1   1     1   0   1     0   1   1

You could build each one of the top-level columns for the final value by creating a pivot table with aggfunc='count' and then concatenating them all, like so:

    nj = df.pivot_table(index='ptype', columns='nj', aggfunc='count').loc[:, 'wd']
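The "build each block, then concatenate" idea can be sketched in one step with `pd.crosstab` and `pd.concat`. This is an illustrative variant rather than the approach from the original thread: passing a dictionary to `pd.concat` with `axis=1` uses the keys as the top column level. Unlike the desired output above, the inner labels come out as integers here.

```python
import pandas as pd

df = pd.DataFrame({
    "nj":    [2, 3, 1, 2, 3],
    "ptype": [1, 2, 1, 2, 1],
    "wd":    [2, 1, 3, 3, 2],
    "wpt":   [1, 2, 1, 3, 2],
})

# One count table per value column, keyed by the column name.
blocks = {
    col: pd.crosstab(df["ptype"], df[col])
    for col in ["nj", "wd", "wpt"]
}

# The dictionary keys become the top level of the column MultiIndex.
out = pd.concat(blocks, axis=1)
print(out)
```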

reading excel sheet as multiindex dataframe through pd.read_excel()

Question: I'm struggling to read an Excel sheet with pd.read_excel(). My Excel table looks like this in its raw form: I expected the dataframe to look like this: bar baz foo one two one two one two A B C D E F baz one 0.085930 -0.848468 0.911572 -0.705026 -1.284458 -0.602760 two 0.385054 2.539314 0.589164 0.765126 0.210199 -0.481789 three -0.352475 -0.975200 -0.403591 0.975707 0.533924 -0.195430 Is this even possible? My failed attempt: xls_file = pd.read_excel(data_file, header=[0,1,2], index_col=None)

Pandas: Modify a particular level of Multiindex

Question: I have a dataframe with a MultiIndex and would like to modify one particular level of it. For instance, one level might hold strings and I may want to remove the whitespace from that index level:

    df.index.levels[1] = [x.replace(' ', '') for x in df.index.levels[1]]

However, the code above results in an error: TypeError: 'FrozenList' does not support mutable operations. I know I can reset_index and modify the column and then re-create the MultiIndex, but I wonder whether there
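Since the levels cannot be mutated in place, one option is to build a new index with `MultiIndex.set_levels` and assign it back. The sketch below uses a small made-up frame (the labels and values are assumptions for illustration) and strips spaces from level 1.

```python
import pandas as pd

# Hypothetical data: level 1 holds strings containing spaces.
idx = pd.MultiIndex.from_tuples(
    [(1, "a b"), (1, "c d"), (2, "a b")],
    names=["num", "label"],
)
df = pd.DataFrame({"value": [10, 20, 30]}, index=idx)

# Clean the unique values of level 1, then swap them in as a new level.
cleaned = df.index.levels[1].str.replace(" ", "", regex=False)
df.index = df.index.set_levels(cleaned, level=1)

print(df)
```

Note that `set_levels` expects the new level values to stay unique, so this pattern fits cases where cleaning does not collapse two labels into one.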

How to remove levels from a multi-indexed dataframe?

Question: For example, I have:

    In [1]: df = pd.DataFrame([8, 9], index=pd.MultiIndex.from_tuples([(1, 1, 1), (1, 3, 2)]), columns=['A'])
    In [2]: df
    Out[2]:
           A
    1 1 1  8
      3 2  9

Is there a better way to remove the last level from the index than this:

    In [3]: pd.DataFrame(df.values, index=df.index.droplevel(2), columns=df.columns)
    Out[3]:
         A
    1 1  8
      3  9

Answer 1:

    df.reset_index(level=2, drop=True)
    Out[29]:
         A
    1 1  8
      3  9

Answer 2: You don't need to create a new DataFrame instance! You can modify the index: df.index = df.index
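The two approaches mentioned in the answers can be compared side by side in a short runnable sketch. The names `via_reset` and `via_droplevel` are illustrative, and the second variant works on a copy so the original frame stays intact.

```python
import pandas as pd

df = pd.DataFrame(
    [8, 9],
    index=pd.MultiIndex.from_tuples([(1, 1, 1), (1, 3, 2)]),
    columns=["A"],
)

# Variant 1: drop the last index level and discard its values.
via_reset = df.reset_index(level=2, drop=True)

# Variant 2: replace the index with one that lacks the last level.
via_droplevel = df.copy()
via_droplevel.index = via_droplevel.index.droplevel(2)

print(via_reset)
print(via_droplevel)
```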

How to move pandas data from index to column after multiple groupby

I have the following pandas dataframe:

    dfalph.head()
           token  year  uses  books
    386  xanthos  1830     3      3
    387  xanthos  1840     1      1
    388  xanthos  1840     2      2
    389  xanthos  1868     2      2
    390  xanthos  1875     1      1

I aggregate the rows with duplicate token and years like so:

    dfalph = dfalph[['token','year','uses','books']].groupby(['token', 'year']).agg([np.sum])
    dfalph.columns = dfalph.columns.droplevel(1)
    dfalph.head()
                  uses  books
    token   year
    xanthos 1830     3      3
            1840     3      3
            1867     2      2
            1868     2      2
            1875     1      1

Instead of having the 'token' and 'year' fields in the index, I would like to return them to columns and have an integer index. Method
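A minimal sketch of one way to end up with 'token' and 'year' as ordinary columns: aggregate with `sum()` and then call `reset_index()`. The data below reproduces only the five rows shown by `dfalph.head()`, so the result differs from the fuller output in the question. The name `flat` is illustrative.

```python
import pandas as pd

dfalph = pd.DataFrame(
    {
        "token": ["xanthos"] * 5,
        "year": [1830, 1840, 1840, 1868, 1875],
        "uses": [3, 1, 2, 2, 1],
        "books": [3, 1, 2, 2, 1],
    },
    index=[386, 387, 388, 389, 390],
)

# Sum duplicates per (token, year), then move both keys back to columns.
flat = (
    dfalph.groupby(["token", "year"])[["uses", "books"]]
    .sum()
    .reset_index()
)

print(flat)
```

Passing `as_index=False` to `groupby` is another option that keeps the grouping keys as columns from the start.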

Turn Pandas Multi-Index into column

I have a dataframe with 2 index levels:

                       value
    Trial measurement
    1     0               13
          1                3
          2                4
    2     0              NaN
          1               12
    3     0               34

Which I want to turn into this:

    Trial  measurement  value
    1      0            13
    1      1            3
    1      2            4
    2      0            NaN
    2      1            12
    3      0            34

How can I best do this? I need this because I want to aggregate the data as instructed here, but I can't select my columns like that if they are in use as indices.

CraigSF: reset_index() is a pandas DataFrame method that transfers index values into the DataFrame as columns. The default setting for the drop parameter is drop=False (which keeps the index values as columns). All you have to do is add
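The behaviour described in the answer can be sketched on the frame from the question. The second call, which moves only one level out of the index, is an extra illustration and not part of the original answer; `flat` and `partial` are illustrative names.

```python
import numpy as np
import pandas as pd

idx = pd.MultiIndex.from_tuples(
    [(1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (3, 0)],
    names=["Trial", "measurement"],
)
df = pd.DataFrame({"value": [13, 3, 4, np.nan, 12, 34]}, index=idx)

# Move every index level into regular columns (drop=False is the default).
flat = df.reset_index()

# Move only 'measurement' out; 'Trial' stays as the index.
partial = df.reset_index(level="measurement")

print(flat)
print(partial)
```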