multi-index

When to use multiindexing vs. xarray in pandas

前提是你 posted on 2019-12-02 19:40:46
The pandas pivot tables documentation seems to recommend dealing with more than two dimensions of data by using multiindexing:

    In [1]: import pandas as pd
    In [2]: import numpy as np
    In [3]: import pandas.util.testing as tm; tm.N = 3
    In [4]: def unpivot(frame):
       ...:     N, K = frame.shape
       ...:     data = {'value' : frame.values.ravel('F'),
       ...:             'variable' : np.asarray(frame.columns).repeat(N),
       ...:             'date' : np.tile(np.asarray(frame.index), K)}
       ...:     return pd.DataFrame(data, columns=['date', 'variable', 'value'])
       ...:
    In [5]: df = unpivot(tm.makeTimeDataFrame())
    In [6]: df
    Out[6]:
            date variable value
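For reference, a minimal sketch of the long-format-to-MultiIndex round trip this question is about. The small `wide` frame is a hypothetical stand-in for `tm.makeTimeDataFrame()` (which may not be available in recent pandas releases), and the names `long_df`, `indexed`, and `roundtrip` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for tm.makeTimeDataFrame(): 3 dates x 4 columns.
wide = pd.DataFrame(
    np.arange(12.0).reshape(3, 4),
    index=pd.date_range("2000-01-03", periods=3, freq="D"),
    columns=list("ABCD"),
)


def unpivot(frame):
    """Stack a wide frame into long (date, variable, value) records."""
    n_rows, n_cols = frame.shape
    data = {
        "value": frame.to_numpy().ravel("F"),  # column-major flattening
        "variable": np.asarray(frame.columns).repeat(n_rows),
        "date": np.tile(np.asarray(frame.index), n_cols),
    }
    return pd.DataFrame(data, columns=["date", "variable", "value"])


long_df = unpivot(wide)

# Two dimensions (date, variable) carried by a two-level MultiIndex.
indexed = long_df.set_index(["date", "variable"])["value"]

# Unstacking one level recovers the wide layout.
roundtrip = indexed.unstack("variable")
```

Each extra dimension becomes one more index level, which is the approach the pivot-table documentation leans on.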

Can Pandas Read Excel's Group Structure into a MultiIndex?

耗尽温柔 posted on 2019-12-02 10:46:19
Question: I have an Excel file with some (mostly) nicely grouped rows. I built a fake example below. Is there a way to get read_excel in Pandas to produce a MultiIndex preserving this structure? For this example the MultiIndex would have four levels (Family, Individual, Child (optional), Investment). If the subtotal values were lost, that would be fine, as they can easily be recreated in Pandas.

Answer 1: No, pandas can't read such a structure. An alternative solution is to use pandas to read your data, but

How to Update Value in First N Rows by Group in a Multi-Index Pandas Dataframe?

拜拜、爱过 posted on 2019-12-02 06:47:44
Question: I am attempting to update the first N rows in a multi-index dataframe but was having a bit of trouble finding a solution, so I thought I'd create a post for it. The example code is as follows:

    # Imports
    import numpy as np
    import pandas as pd

    # Set Up Data Frame
    dates = pd.date_range('1/1/2000', periods=8)
    df = pd.DataFrame(np.random.randn(8, 4), columns=['A', 'B', 'C', 'D'])
    df['DATE'] = dates
    df['CATEGORY'] = ['A','B','A','B','A','B','A','B']

    # Set Index
    df.set_index(['CATEGORY','DATE'],inplace

Slice pandas multiindex dataframe using list of index values [duplicate]

你说的曾经没有我的故事 posted on 2019-12-02 05:01:43
This question already has an answer here: Select rows in pandas MultiIndex DataFrame (1 answer)

I have a multi-index dataframe that looks like

    uid  tid  text
    abc  x    t1
    bcd  y    t2

uid and tid are the index levels. I have a list of uids and want to get the rows corresponding to the uids in that list, while keeping the second-level index values (tid). I want to do it without running any explicit loop. Is that possible?

Data:

    L = ['abc', 'bcd']
    print (df)
             text
    uid  tid
    abc  x     t1
    abc1 x     t1
    bcd  y     t2

1. slicers

    idx = pd.IndexSlice
    df1 = df.loc[idx[L,:],:]

2. boolean indexing + mask with get_level_values + isin:

    df1
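Both options can be tried on the sample data as follows. This is a sketch only: the excerpt above is cut off before the second option's code, so the `mask` line is an assumption about how `get_level_values` and `isin` would typically be combined, and `by_slicer` / `by_mask` are illustrative names.

```python
import pandas as pd

# Rebuild the sample frame with a (uid, tid) MultiIndex.
df = pd.DataFrame(
    {"text": ["t1", "t1", "t2"]},
    index=pd.MultiIndex.from_tuples(
        [("abc", "x"), ("abc1", "x"), ("bcd", "y")], names=["uid", "tid"]
    ),
)
L = ["abc", "bcd"]

# Option 1: IndexSlice selects the listed uids and every tid.
idx = pd.IndexSlice
by_slicer = df.loc[idx[L, :], :]

# Option 2 (assumed form): boolean mask built from the uid level.
mask = df.index.get_level_values("uid").isin(L)
by_mask = df[mask]

print(by_slicer)
```

Both results keep the `tid` level, and neither needs an explicit loop.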

Can Pandas Read Excel's Group Structure into a MultiIndex?

馋奶兔 posted on 2019-12-02 04:28:06
I have an Excel file with some (mostly) nicely grouped rows. I built a fake example below. Is there a way to get read_excel in Pandas to produce a MultiIndex preserving this structure? For this example the MultiIndex would have four levels (Family, Individual, Child (optional), Investment). If the subtotal values were lost, that would be fine, as they can easily be recreated in Pandas.

No, pandas can't read such a structure. An alternative solution is to use pandas to read your data, but transform this into an easily accessible dictionary, rather than keeping your data in a dataframe with

Concatenate dataframes with multi-index in pandas dataframe

核能气质少年 posted on 2019-12-02 03:37:26
Question: I have two dataframes, df1 and df2:

    In [56]: df1.head()
    Out[56]:
                         col7                col8                col9
                       alpha0        D0    alpha0        D0    alpha0        D0
    F35_HC_531d.dat  1.103999  1.103999  1.364399  1.358938  3.171808  1.946894
    F35_HC_532d.dat  0.000000  0.000000  1.636934  1.635594  4.359431  2.362530
    F35_HC_533d.dat  0.826599  0.826599  1.463956  1.390134  3.860629  2.199387
    F35_HC_534d.dat  1.055350  1.020555  3.112200  2.498257  3.394307  2.090668
    F52_HC_472d.dat  3.808008  2.912733  3.594062  2.336720  3.027449  2.216112

    In [62]: df2.head()
    Out[62]:
    col7

Column selection with iloc, with both individual indices and ranges

半腔热情 posted on 2019-12-02 02:12:47
Question: I wonder why this line returns "invalid syntax", and what the correct syntax is for selecting both isolated columns and ranges in one go:

    X = f1.iloc[:, [2,5,[10:19]]].values

By the way, the same happens with:

    X = f1.iloc[:, [2,5,10:19]].values

Thanks.

Answer 1: The second form is close to correct; you only need numpy.r_ to concatenate the indices:

    np.random.seed(2019)
    f1 = pd.DataFrame(np.random.randint(10, size=(5, 25))).add_prefix('a')
    print(f1)
       a0  a1  a2  a3  a4  a5  ...  a19  a20  a21  a22  a23  a24
    0   8   2   5   8   6   8  ...    0    1    6
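A minimal sketch of the `numpy.r_` approach the answer points to, reusing the answer's sample frame. The variable name `cols` is introduced here for illustration only.

```python
import numpy as np
import pandas as pd

np.random.seed(2019)
f1 = pd.DataFrame(np.random.randint(10, size=(5, 25))).add_prefix("a")

# Slice syntax such as 10:19 is only legal inside square brackets that
# index an object; np.r_ accepts it and expands it into integer positions.
cols = np.r_[2, 5, 10:19]  # positions 2, 5, and 10 through 18

X = f1.iloc[:, cols].values
print(X.shape)
```

The original lines fail because `10:19` inside a plain list literal is not valid Python, whereas `np.r_[...]` is an indexing expression.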

Pandas Get All Values from Multiindex levels

十年热恋 posted on 2019-12-02 01:59:38
Question: Given the following pivot table:

    df=pd.DataFrame({'A':['a','a','a','a','a','b','b','b','b'],
                     'B':['x','y','z','x','y','z','x','y','z'],
                     'C':['a','b','a','b','a','b','a','b','a'],
                     'D':[7,5,3,4,1,6,5,3,1]})
    table = pd.pivot_table(df, index=['A', 'B','C'],aggfunc='sum')
    table

           D
    A B C
    a x a  7
        b  4
      y a  1
        b  5
      z a  3
    b x a  5
      y b  3
      z a  1
        b  6

I'd like to access each value of 'C' (or level 2) as a list to use for plotting. I'd like to do the same for 'A' and 'B' (levels 0 and 1) in such a way that it

How to Update Value in First N Rows by Group in a Multi-Index Pandas Dataframe?

我与影子孤独终老i posted on 2019-12-02 00:46:39
I am attempting to update the first N rows in a multi-index dataframe but was having a bit of trouble finding a solution, so I thought I'd create a post for it. The example code is as follows:

    # Imports
    import numpy as np
    import pandas as pd

    # Set Up Data Frame
    dates = pd.date_range('1/1/2000', periods=8)
    df = pd.DataFrame(np.random.randn(8, 4), columns=['A', 'B', 'C', 'D'])
    df['DATE'] = dates
    df['CATEGORY'] = ['A','B','A','B','A','B','A','B']

    # Set Index
    df.set_index(['CATEGORY','DATE'],inplace=True)
    df.sort_index(inplace=True)

    # Get First Two Rows of Each Category
    df.groupby(level=0).apply(lambda x:
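One possible way to do the update, sketched under assumptions: this uses `groupby(...).cumcount()` to number the rows inside each category, which is a swapped-in technique and not necessarily what the truncated `apply(lambda x: ...)` line went on to do. The target column `'A'` and the replacement value `0.0` are hypothetical.

```python
import numpy as np
import pandas as pd

# Same setup as in the question.
dates = pd.date_range("1/1/2000", periods=8)
df = pd.DataFrame(np.random.randn(8, 4), columns=["A", "B", "C", "D"])
df["DATE"] = dates
df["CATEGORY"] = ["A", "B", "A", "B", "A", "B", "A", "B"]
df = df.set_index(["CATEGORY", "DATE"]).sort_index()

N = 2  # number of leading rows to update in each category

# cumcount() numbers rows 0, 1, 2, ... within each CATEGORY group,
# so "< N" is True exactly for the first N rows of every group.
first_n = (df.groupby(level="CATEGORY").cumcount() < N).to_numpy()

# Hypothetical update: overwrite column 'A' in those rows.
df.loc[first_n, "A"] = 0.0

print(df)
```

Because the mask is positional, the assignment writes into `df` directly instead of into a copy returned by `apply`.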

Pandas Get All Values from Multiindex levels

懵懂的女人 posted on 2019-12-02 00:12:49
Given the following pivot table:

    df=pd.DataFrame({'A':['a','a','a','a','a','b','b','b','b'],
                     'B':['x','y','z','x','y','z','x','y','z'],
                     'C':['a','b','a','b','a','b','a','b','a'],
                     'D':[7,5,3,4,1,6,5,3,1]})
    table = pd.pivot_table(df, index=['A', 'B','C'],aggfunc='sum')
    table

           D
    A B C
    a x a  7
        b  4
      y a  1
        b  5
      z a  3
    b x a  5
      y b  3
      z a  1
        b  6

I'd like to access each value of 'C' (or level 2) as a list to use for plotting. I'd like to do the same for 'A' and 'B' (levels 0 and 1) in such a way that it preserves spacing so that I can use those lists as well. I'm ultimately trying to use them to create
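A short sketch of one way to pull the level values out as lists, assuming `Index.get_level_values` is acceptable here. It returns one label per row, so the lists line up with the table's rows; if "preserves spacing" means blank entries for repeated labels, that would need extra handling not shown. The names `level_a`, `level_b`, and `level_c` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["a", "a", "a", "a", "a", "b", "b", "b", "b"],
    "B": ["x", "y", "z", "x", "y", "z", "x", "y", "z"],
    "C": ["a", "b", "a", "b", "a", "b", "a", "b", "a"],
    "D": [7, 5, 3, 4, 1, 6, 5, 3, 1],
})
table = pd.pivot_table(df, index=["A", "B", "C"], aggfunc="sum")

# One label per row of the pivot table, for each index level.
level_a = table.index.get_level_values("A").tolist()
level_b = table.index.get_level_values("B").tolist()
level_c = table.index.get_level_values("C").tolist()

print(level_c)
# ['a', 'b', 'a', 'b', 'a', 'a', 'b', 'a', 'b']
```

Levels can also be addressed by position, for example `get_level_values(2)` for 'C'.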