multi-index

Giving a column multiple indexes/headers

本秂侑毒 posted on 2019-11-27 01:29:52
Question: I am working with pandas DataFrames that are essentially time series, like this: level Date 1976-01-01 409.67 1976-02-01 409.58 1976-03-01 409.66 … What I want is multiple indexes/headers for the level column, like so: Station1 #Name of the datasource 43.1977317,-4.6473648,5 #Lat/Lon of the source Precip #Type of data Date 1976-01-01 409.67 1976-02-01 409.58 1976-03-01 409.66 … So essentially I am searching for something like Mydata.columns.level1 = ['Station1'] , Mydata.columns
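One possible way to get stacked headers like these is to replace the flat column labels with a `pd.MultiIndex`. The following is a minimal sketch, assuming the one-column frame from the question; the variable name `mydata` and the level names `station`, `coords`, and `kind` are made up for illustration and do not come from the original post.

```python
import pandas as pd

# Rebuild the single-column time series from the question.
mydata = pd.DataFrame(
    {"level": [409.67, 409.58, 409.66]},
    index=pd.to_datetime(["1976-01-01", "1976-02-01", "1976-03-01"]),
)
mydata.index.name = "Date"

# Replace the flat column label with a three-level header.
# The level names below are illustrative assumptions.
mydata.columns = pd.MultiIndex.from_tuples(
    [("Station1", "43.1977317,-4.6473648,5", "Precip")],
    names=["station", "coords", "kind"],
)

print(mydata)
```

With more than one station, each column would get its own tuple in the list passed to `from_tuples`.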

Pandas : Proper way to set values based on condition for subset of multiindex dataframe

爱⌒轻易说出口 posted on 2019-11-27 01:08:41
Question: I'm not sure how to do this without chained assignments (which probably wouldn't work anyway, because I'd be setting a copy). I want to take a subset of a MultiIndex pandas DataFrame, test for values less than zero, and set them to zero. For example: df = pd.DataFrame({('A','a'): [-1,-1,0,10,12], ('A','b'): [0,1,2,3,-1], ('B','a'): [-20,-10,0,10,20], ('B','b'): [-200,-100,0,100,200]}) df[df['A']<0] = 0.0 gives In [37]: df Out[37]: A B a b a b 0 -1 0 -20 -200 1 -1 1 -10 -100 2 0 2 0 0 3 10
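One approach that avoids chained assignment is to compute the cleaned sub-frame and assign it back under the same top-level key. This is a sketch using the frame from the question; using `clip` instead of a boolean mask is a substitution made here for illustration, not necessarily what the original answers proposed.

```python
import pandas as pd

df = pd.DataFrame({
    ("A", "a"): [-1, -1, 0, 10, 12],
    ("A", "b"): [0, 1, 2, 3, -1],
    ("B", "a"): [-20, -10, 0, 10, 20],
    ("B", "b"): [-200, -100, 0, 100, 200],
})

# Raise negative values to zero, but only in the columns under the
# top-level label "A"; the columns under "B" are left untouched.
df["A"] = df["A"].clip(lower=0)

print(df)
```

The same pattern should work with `df["A"].where(df["A"] >= 0, 0)` if a condition other than a simple lower bound is needed.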

Nested dictionary to multiindex dataframe where dictionary keys are column labels

混江龙づ霸主 posted on 2019-11-27 00:26:45
Say I have a dictionary that looks like this: dictionary = {'A' : {'a': [1,2,3,4,5], 'b': [6,7,8,9,1]}, 'B' : {'a': [2,3,4,5,6], 'b': [7,8,9,1,2]}} and I want a dataframe that looks something like this: A B a b a b 0 1 6 2 7 1 2 7 3 8 2 3 8 4 9 3 4 9 5 1 4 5 1 6 2 Is there a convenient way to do this? If I try: In [99]: DataFrame(dictionary) Out[99]: A B a [1, 2, 3, 4, 5] [2, 3, 4, 5, 6] b [6, 7, 8, 9, 1] [7, 8, 9, 1, 2] I get a dataframe where each element is a list. What I need is a multiindex where each level corresponds to the keys in the nested dict and the rows corresponding to each
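A common way to handle this is to flatten the nested dictionary into tuple keys before building the frame, since pandas turns tuple keys into a column MultiIndex. A minimal sketch using the dictionary from the question; the names `flat`, `outer`, and `inner` are illustrative.

```python
import pandas as pd

dictionary = {
    "A": {"a": [1, 2, 3, 4, 5], "b": [6, 7, 8, 9, 1]},
    "B": {"a": [2, 3, 4, 5, 6], "b": [7, 8, 9, 1, 2]},
}

# Flatten to {(outer_key, inner_key): values}. Tuple keys become
# a two-level column MultiIndex when passed to DataFrame.
flat = {
    (outer, inner): values
    for outer, inner_dict in dictionary.items()
    for inner, values in inner_dict.items()
}
df = pd.DataFrame(flat)

print(df)
```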

Selecting rows from a Pandas dataframe with a compound (hierarchical) index

柔情痞子 posted on 2019-11-26 23:57:45
Question: I suspect this is trivial, but I have yet to discover the incantation that will let me select rows from a pandas DataFrame based on the values of a hierarchical key. So, for example, imagine we have the following DataFrame: import pandas df = pandas.DataFrame({'group1': ['a','a','a','b','b','b'], 'group2': ['c','c','d','d','d','e'], 'value1': [1.1,2,3,4,5,6], 'value2': [7.1,8,9,10,11,12] }) df = df.set_index(['group1', 'group2']) df looks as we would expect: If df were not indexed on
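For a frame indexed like the one in the question, `loc` and `xs` cover the usual selection cases. The sketch below uses the question's data; the result names (`first_level`, `pair`, `second_level`) are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "group1": ["a", "a", "a", "b", "b", "b"],
    "group2": ["c", "c", "d", "d", "d", "e"],
    "value1": [1.1, 2, 3, 4, 5, 6],
    "value2": [7.1, 8, 9, 10, 11, 12],
}).set_index(["group1", "group2"])

# All rows whose first-level key is "a".
first_level = df.loc["a"]

# Rows matching one exact (group1, group2) pair.
pair = df.loc[("a", "c"), :]

# All rows whose second-level key is "d", whatever group1 is.
second_level = df.xs("d", level="group2")
```

Selecting on an inner level with `xs` drops that level from the result by default; passing `drop_level=False` keeps it.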

Pandas pivot table for multiple columns at once

泪湿孤枕 posted on 2019-11-26 20:47:06
Question: Let's say I have a DataFrame: nj ptype wd wpt 0 2 1 2 1 1 3 2 1 2 2 1 1 3 1 3 2 2 3 3 4 3 1 2 2 I would like to aggregate this data using ptype as the index like so: nj wd wpt 1.0 2.0 3.0 1.0 2.0 3.0 1.0 2.0 3.0 ptype 1 1 1 1 0 2 1 2 1 0 2 0 1 1 1 0 1 0 1 1 You could build each one of the top level columns for the final value by creating a pivot table with aggfunc='count' and then concatenating them all, like so: nj = df.pivot_table(index='ptype', columns='nj', aggfunc='count').ix[:, 'wd']
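The `.ix` accessor used in the question was removed in pandas 1.0, so that snippet no longer runs on current versions. One possible alternative, sketched here under the assumption that plain value counts per `ptype` are wanted, builds one `pd.crosstab` per column and concatenates them; the name `wide` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "nj":    [2, 3, 1, 2, 3],
    "ptype": [1, 2, 1, 2, 1],
    "wd":    [2, 1, 3, 3, 2],
    "wpt":   [1, 2, 1, 3, 2],
})

# One count table per value column; the dict keys become the
# top level of the resulting column MultiIndex.
wide = pd.concat(
    {col: pd.crosstab(df["ptype"], df[col]) for col in ["nj", "wd", "wpt"]},
    axis=1,
)

print(wide)
```

Unlike the target table in the question, the second-level labels here are integers rather than floats, because `crosstab` keeps the original dtype of the values.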

reading excel sheet as multiindex dataframe through pd.read_excel()

对着背影说爱祢 posted on 2019-11-26 20:12:38
Question: I'm struggling to read an Excel sheet with pd.read_excel() . My Excel table looks like this in its raw form: I expected the DataFrame to look like this: bar baz foo one two one two one two A B C D E F baz one 0.085930 -0.848468 0.911572 -0.705026 -1.284458 -0.602760 two 0.385054 2.539314 0.589164 0.765126 0.210199 -0.481789 three -0.352475 -0.975200 -0.403591 0.975707 0.533924 -0.195430 Is this even possible? My failed attempt: xls_file = pd.read_excel(data_file, header=[0,1,2], index_col=None)

Pandas: Modify a particular level of Multiindex

你说的曾经没有我的故事 posted on 2019-11-26 19:58:21
Question: I have a DataFrame with a MultiIndex and would like to modify one particular level of the MultiIndex. For instance, the first level might be strings and I may want to remove the white spaces from that index level: df.index.levels[1] = [x.replace(' ', '') for x in df.index.levels[1]] However, the code above results in an error: TypeError: 'FrozenList' does not support mutable operations. I know I can reset_index and modify the column and then re-create the MultiIndex, but I wonder whether there
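Because index levels are immutable, the usual route is to build a new index with `MultiIndex.set_levels` and assign it back. The sketch below assumes a small made-up frame (the index names `num` and `label` and the data are hypothetical) and strips spaces from level 1, as in the question.

```python
import pandas as pd

# Hypothetical example data; not from the original question.
index = pd.MultiIndex.from_tuples(
    [(1, "a b"), (1, "c d"), (2, "a b")],
    names=["num", "label"],
)
df = pd.DataFrame({"value": [10, 20, 30]}, index=index)

# Compute the cleaned labels for level 1, then swap them in.
# set_levels returns a new MultiIndex rather than mutating in place.
cleaned = df.index.levels[1].str.replace(" ", "", regex=False)
df.index = df.index.set_levels(cleaned, level=1)

print(df)
```

`df.rename(index=lambda s: s.replace(" ", ""), level=1)` is another option that returns a renamed copy without touching the levels directly.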

How to remove levels from a multi-indexed dataframe?

拟墨画扇 posted on 2019-11-26 19:05:39
Question: For example, I have: In [1]: df = pd.DataFrame([8, 9], index=pd.MultiIndex.from_tuples([(1, 1, 1), (1, 3, 2)]), columns=['A']) In [2]: df Out[2]: A 1 1 1 8 3 2 9 Is there a better way to remove the last level from the index than this: In [3]: pd.DataFrame(df.values, index=df.index.droplevel(2), columns=df.columns) Out[3]: A 1 1 8 3 9 Answer 1: df.reset_index(level=2, drop=True) Out[29]: A 1 1 8 3 9 Answer 2: You don't need to create a new DataFrame instance! You can modify the index: df.index = df.index
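For comparison, here is a runnable sketch of two ways to drop the last index level from the question's frame. The `DataFrame.droplevel` call is an addition made here (it exists in pandas 0.24 and later) and is not taken from the answers quoted above; the result names are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [8, 9],
    index=pd.MultiIndex.from_tuples([(1, 1, 1), (1, 3, 2)]),
    columns=["A"],
)

# Option 1: discard level 2 instead of moving it into a column.
via_reset = df.reset_index(level=2, drop=True)

# Option 2: drop the level directly on the frame.
via_droplevel = df.droplevel(2)

print(via_droplevel)
```

Both calls return a new frame and leave `df` unchanged.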

How to move pandas data from index to column after multiple groupby

被刻印的时光 ゝ posted on 2019-11-26 18:53:39
I have the following pandas dataframe: dfalph.head() token year uses books 386 xanthos 1830 3 3 387 xanthos 1840 1 1 388 xanthos 1840 2 2 389 xanthos 1868 2 2 390 xanthos 1875 1 1 I aggregate the rows with duplicate token and years like so: dfalph = dfalph[['token','year','uses','books']].groupby(['token', 'year']).agg([np.sum]) dfalph.columns = dfalph.columns.droplevel(1) dfalph.head() uses books token year xanthos 1830 3 3 1840 3 3 1867 2 2 1868 2 2 1875 1 1 Instead of having the 'token' and 'year' fields in the index, I would like to return them to columns and have an integer index. Method
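One way to keep the grouping keys as ordinary columns is to pass `as_index=False` to `groupby`. The sketch below rebuilds only the five rows shown by `dfalph.head()` in the question, so its output covers just those rows; the name `flat` is illustrative.

```python
import pandas as pd

# Only the rows visible in the question's head() output.
dfalph = pd.DataFrame(
    {
        "token": ["xanthos"] * 5,
        "year": [1830, 1840, 1840, 1868, 1875],
        "uses": [3, 1, 2, 2, 1],
        "books": [3, 1, 2, 2, 1],
    },
    index=[386, 387, 388, 389, 390],
)

# as_index=False keeps "token" and "year" as columns and gives
# the result a fresh integer index.
flat = dfalph.groupby(["token", "year"], as_index=False)[["uses", "books"]].sum()

print(flat)
```

Calling `.reset_index()` on the already-aggregated frame should give an equivalent layout.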

Turn Pandas Multi-Index into column

僤鯓⒐⒋嵵緔 posted on 2019-11-26 18:24:17
I have a DataFrame with 2 index levels: value Trial measurement 1 0 13 1 3 2 4 2 0 NaN 1 12 3 0 34 Which I want to turn into this: Trial measurement value 1 0 13 1 1 3 1 2 4 2 0 NaN 2 1 12 3 0 34 How can I best do this? I need this because I want to aggregate the data as instructed here, but I can't select my columns like that if they are in use as indices. CraigSF: reset_index() is a pandas DataFrame method that transfers index values into the DataFrame as columns. The default setting for its drop parameter is drop=False (which keeps the index values as columns). All you have to do add
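The `reset_index()` behaviour described in the answer can be sketched as follows, using the frame from the question. The names `flat` and `only_trial` are illustrative.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [(1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (3, 0)],
    names=["Trial", "measurement"],
)
df = pd.DataFrame({"value": [13, 3, 4, np.nan, 12, 34]}, index=index)

# Move every index level into regular columns (drop=False is the default).
flat = df.reset_index()

# Move only one level; the other stays in the index.
only_trial = df.reset_index(level="Trial")

print(flat)
```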