multi-index

Converting a pandas MultiIndex DataFrame from rows-wise to column-wise

穿精又带淫゛_ submitted on 2019-12-03 12:49:59
I'm working in zipline and pandas and have converted a pandas.Panel to a pandas.DataFrame using the to_frame() method. This is the resulting pandas.DataFrame, which, as you can see, is multi-indexed:

                                  price
major                     minor
2008-01-03 00:00:00+00:00 SPY    129.93
                          KO      26.38
                          PEP     64.78
2008-01-04 00:00:00+00:00 SPY    126.74
                          KO      26.43
                          PEP     64.59
2008-01-07 00:00:00+00:00 SPY    126.63
                          KO      27.05
                          PEP     66.10
2008-01-08 00:00:00+00:00 SPY    124.59
                          KO      27.16
                          PEP     66.63

I need to convert this frame to look like this:

                              SPY     KO    PEP
2008-01-03 00:00:00+00:00  129.93  26.38  64.78
2008-01-04 00:00:00+00:00  126.74  26.43  64.59
2008-01-07
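One way this reshaping is commonly done is with unstack(), which moves an index level into the columns. The sketch below rebuilds the first two dates from the excerpt by hand (the construction of long_df is an assumption, since the original frame came from Panel.to_frame()) and then pivots the 'minor' level.

```python
import pandas as pd

# Rebuild a small piece of the frame from the question (first two dates only).
index = pd.MultiIndex.from_product(
    [pd.to_datetime(["2008-01-03", "2008-01-04"], utc=True), ["SPY", "KO", "PEP"]],
    names=["major", "minor"],
)
long_df = pd.DataFrame(
    {"price": [129.93, 26.38, 64.78, 126.74, 26.43, 64.59]}, index=index
)

# Move the inner index level ("minor") into the columns.
wide_df = long_df["price"].unstack("minor")

# unstack may reorder the new column labels, so select the tickers
# explicitly to get the order shown in the question.
wide_df = wide_df[["SPY", "KO", "PEP"]]
print(wide_df)
```

Selecting the "price" column first keeps the result's columns flat; unstacking the whole frame would instead give two-level columns of the form ("price", ticker).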

In Pandas How to sort one level of a multi-index based on the values of a column, while maintaining the grouping of the other level

我们两清 submitted on 2019-12-03 12:43:05
I'm taking a Data Mining course at university right now, but I'm a wee bit stuck on a multi-index sorting problem. The actual data involves about 1 million movie reviews, which I'm trying to analyze by American zip code. To test how to do what I want, I've been using a much smaller data set of 250 randomly generated ratings for 10 movies, with age groups in place of zip codes. This is what I have right now: a multi-indexed DataFrame in pandas with two levels, 'group' and 'title':

                  rating
group  title
Adults Alien    4.000000
       Argo     2.166667
       Ben-Hur  3.666667
       Gandhi
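A minimal sketch of one way to sort titles by rating inside each group while keeping the groups together. The ratings and the second group ('Teens') below are made up for illustration, and the descending order is an assumption about what the asker wants.

```python
import pandas as pd

# Made-up ratings; only the index layout (group, title) follows the question.
index = pd.MultiIndex.from_product(
    [["Adults", "Teens"], ["Alien", "Argo", "Ben-Hur"]],
    names=["group", "title"],
)
df = pd.DataFrame({"rating": [4.0, 2.0, 3.5, 3.0, 4.5, 1.0]}, index=index)

# Sort by group first (keeps each group contiguous), then by rating
# within the group, highest rating first.
sorted_df = (
    df.reset_index()
      .sort_values(["group", "rating"], ascending=[True, False])
      .set_index(["group", "title"])
)
print(sorted_df)
```

Going through reset_index() makes the sort keys ordinary columns, so the group level and the rating column can be ordered in a single sort_values() call.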

pandas: convert index type in multiindex dataframe

久未见 submitted on 2019-12-03 11:40:10
Question: Hi, I have a multiindex DataFrame:

tuples = [('YTA_Q3', 1), ('YTA_Q3', 2), ('YTA_Q3', 3), ('YTA_Q3', 4), ('YTA_Q3', 99), ('YTA_Q3', 96)]
# Index
index = pd.MultiIndex.from_tuples(tuples, names=['Questions', 'Values'])
# Columns
columns = pd.MultiIndex.from_tuples([('YTA_Q3', '@')], names=['Questions', 'Values'])
# Data
data = [29.014949, 5.0260590000000001, 6.6269119999999999, 1.3565260000000001, 41.632221999999999, 21.279499999999999]
df1 = pd.DataFrame(data=data, index=index, columns=columns)

Is there an equivalent of boost::multi_index for Java someplace?

微笑、不失礼 submitted on 2019-12-03 11:39:43
I stumbled upon multi_index on a lark last night while pounding my head against a collection that I need to access by 3 different key values, and that should also have rebalancing array semantics. Well, I got one of my two wishes (3 different key values) in boost::multi_index. Does anything similar exist in the Java world?

I have just finished MultiIndexContainer in Java (http://code.google.com/p/multiindexcontainer/wiki/MainPage). I know that it is not a complete equivalent of boost multi_index_container, but maybe it could be sufficient for your requirement.

npgall: Resurrecting an old question, but

Assign new values to slice from MultiIndex DataFrame

元气小坏坏 submitted on 2019-12-03 11:23:57
Question: I would like to modify some values in a column of my DataFrame. At the moment I have a view obtained by selecting via the multi-index of my original df (and modifying it does change df). Here's an example:

In [1]: arrays = [np.array(['bar', 'bar', 'baz', 'qux', 'qux', 'bar']),
                  np.array(['one', 'two', 'one', 'one', 'two', 'one']),
                  np.arange(0, 6, 1)]
In [2]: df = pd.DataFrame(randn(6, 3), index=arrays, columns=['A', 'B', 'C'])
In [3]: df
                  A         B         C
bar one 0 -0.088671  1.902021 -0.540959
    two 1  0.782919 -0
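The excerpt is cut off before the asker's actual attempt, so the following is only a sketch of one common approach: assign through a single .loc call on the original frame rather than through an intermediate view. The zero-filled data and the choice of the ('bar', 'one') slice are assumptions for illustration.

```python
import numpy as np
import pandas as pd

arrays = [
    np.array(["bar", "bar", "baz", "qux", "qux", "bar"]),
    np.array(["one", "two", "one", "one", "two", "one"]),
    np.arange(0, 6, 1),
]
# Zeros instead of random numbers so the effect of the assignment is visible.
df = pd.DataFrame(np.zeros((6, 3)), index=arrays, columns=["A", "B", "C"])

# Slicing a MultiIndex with .loc works most reliably on a sorted index.
df = df.sort_index()

# Assign to column "A" for every row whose first two levels are ("bar", "one").
idx = pd.IndexSlice
df.loc[idx["bar", "one", :], "A"] = 1.0
print(df)
```

Doing the selection and the assignment in one .loc call avoids relying on whether an intermediate selection is a view or a copy.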

Summing over a multiindex level in a pandas series

点点圈 submitted on 2019-12-03 06:36:56
Question: Using the pandas package in Python, I would like to sum (marginalize) over one level of a series with a 3-level multiindex to produce a series with a 2-level multiindex. For example, if I have the following:

ind = [tuple(x) for x in ['ABC', 'ABc', 'AbC', 'Abc', 'aBC', 'aBc', 'abC', 'abc']]
mi = pd.MultiIndex.from_tuples(ind)
data = pd.Series([264, 13, 29, 8, 152, 7, 15, 1], index=mi)

A  B  C    264
      c     13
   b  C     29
      c      8
a  B  C    152
      c      7
   b  C     15
      c      1

I would like to sum over the variable C to produce the
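A short sketch of one way to marginalize over the last level: group by the two levels to keep and sum. The index levels are unnamed in the question, so they are referred to by position here.

```python
import pandas as pd

ind = [tuple(x) for x in ["ABC", "ABc", "AbC", "Abc", "aBC", "aBc", "abC", "abc"]]
mi = pd.MultiIndex.from_tuples(ind)
data = pd.Series([264, 13, 29, 8, 152, 7, 15, 1], index=mi)

# Keep levels 0 and 1; the third level (C/c) is summed out.
summed = data.groupby(level=[0, 1]).sum()
print(summed)
```

The result is a Series with a 2-level MultiIndex, for example 264 + 13 = 277 for the ('A', 'B') entry.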

When to use multiindexing vs. xarray in pandas

北城余情 submitted on 2019-12-03 06:16:39
Question: The pandas pivot tables documentation seems to recommend dealing with more than two dimensions of data by using multiindexing:

In [1]: import pandas as pd
In [2]: import numpy as np
In [3]: import pandas.util.testing as tm; tm.N = 3
In [4]: def unpivot(frame):
   ...:     N, K = frame.shape
   ...:     data = {'value' : frame.values.ravel('F'),
   ...:             'variable' : np.asarray(frame.columns).repeat(N),
   ...:             'date' : np.tile(np.asarray(frame.index), K)}
   ...:     return pd.DataFrame(data, columns=['date', 'variable',

Creating an empty MultiIndex

﹥>﹥吖頭↗ submitted on 2019-12-03 05:42:42
I would like to create an empty DataFrame with a MultiIndex before assigning rows to it. I already found that empty DataFrames don't like to be assigned MultiIndexes on the fly, so I'm setting the MultiIndex names during creation. However, I don't want to assign levels, as this will be done later. This is the best code I got to so far:

def empty_multiindex(names):
    """
    Creates empty MultiIndex from a list of level names.
    """
    return MultiIndex.from_tuples(tuples=[(None,) * len(names)], names=names)

Which gives me

In [2]: empty_multiindex(['one','two', 'three'])
Out[2]: MultiIndex(levels=[[], []
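As an alternative sketch (not the asker's code), an index with named levels and zero rows can be built by passing one empty list per level name to MultiIndex.from_arrays. Unlike the from_tuples version above, this does not rely on a placeholder tuple of None values.

```python
import pandas as pd


def empty_multiindex(names):
    """Create a zero-length MultiIndex with one level per name."""
    # One empty array per level; the names are attached at creation time.
    return pd.MultiIndex.from_arrays([[] for _ in names], names=names)


mi = empty_multiindex(["one", "two", "three"])
df = pd.DataFrame(index=mi, columns=["value"])
print(mi.names, len(mi), df.shape)
```

Rows can then be added later, for example by concatenating frames that share the same level names.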

add a field in pandas dataframe with MultiIndex columns

岁酱吖の submitted on 2019-12-03 02:51:23
I have looked for an answer to this question as it seems pretty simple, but have not been able to find anything yet. Apologies if I missed something. I have pandas version 0.10.0 and I have been experimenting with data of the following form:

import pandas
import numpy as np
import datetime

start_date = datetime.datetime(2009,3,1,6,29,59)
r = pandas.date_range(start_date, periods=12)
cols_1 = ['AAPL', 'AAPL', 'GOOG', 'GOOG', 'GS', 'GS']
cols_2 = ['close', 'rate', 'close', 'rate', 'close', 'rate']
dat = np.random.randn(12, 6)
cols = pandas.MultiIndex.from_arrays([cols_1, cols_2], names=['ticker'
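The excerpt stops before the question itself, so this is only a sketch of the general technique the title points at: adding a new second-level field under an existing top-level column by assigning to a tuple key. The second level name 'field' and the derived 'spread' column are hypothetical, chosen for illustration; the code targets a current pandas release rather than 0.10.0.

```python
import datetime

import numpy as np
import pandas as pd

start_date = datetime.datetime(2009, 3, 1, 6, 29, 59)
r = pd.date_range(start_date, periods=12)
cols_1 = ["AAPL", "AAPL", "GOOG", "GOOG", "GS", "GS"]
cols_2 = ["close", "rate", "close", "rate", "close", "rate"]
# "field" is a hypothetical name for the second level (elided in the excerpt).
cols = pd.MultiIndex.from_arrays([cols_1, cols_2], names=["ticker", "field"])
dat = np.random.default_rng(0).standard_normal((12, 6))
df = pd.DataFrame(dat, index=r, columns=cols)

# Assigning to a (ticker, field) tuple adds a new column under that ticker.
df[("AAPL", "spread")] = df[("AAPL", "close")] - df[("AAPL", "rate")]

# New columns are appended at the end; sorting groups them by ticker again.
df = df.sort_index(axis=1)
print(df["AAPL"].head())
```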

pandas: convert index type in multiindex dataframe

无人久伴 submitted on 2019-12-03 02:14:36
Hi, I have a multiindex DataFrame:

tuples = [('YTA_Q3', 1), ('YTA_Q3', 2), ('YTA_Q3', 3), ('YTA_Q3', 4), ('YTA_Q3', 99), ('YTA_Q3', 96)]
# Index
index = pd.MultiIndex.from_tuples(tuples, names=['Questions', 'Values'])
# Columns
columns = pd.MultiIndex.from_tuples([('YTA_Q3', '@')], names=['Questions', 'Values'])
# Data
data = [29.014949, 5.0260590000000001, 6.6269119999999999, 1.3565260000000001, 41.632221999999999, 21.279499999999999]
df1 = pd.DataFrame(data=data, index=index, columns=columns)

How do I convert the inner values of the df's index to str? My attempt: df1.index.astype(str) returns a
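One possible way to convert only the inner level, sketched here under the assumption of a reasonably recent pandas, is to replace that level's values with MultiIndex.set_levels() and assign the result back to the frame's index.

```python
import pandas as pd

tuples = [("YTA_Q3", 1), ("YTA_Q3", 2), ("YTA_Q3", 3),
          ("YTA_Q3", 4), ("YTA_Q3", 99), ("YTA_Q3", 96)]
index = pd.MultiIndex.from_tuples(tuples, names=["Questions", "Values"])
columns = pd.MultiIndex.from_tuples([("YTA_Q3", "@")], names=["Questions", "Values"])
data = [29.014949, 5.026059, 6.626912, 1.356526, 41.632222, 21.2795]
df1 = pd.DataFrame(data=data, index=index, columns=columns)

# Convert the unique values of the "Values" level to strings; the row
# order is untouched because only the level's labels are replaced.
new_level = df1.index.levels[1].astype(str)
df1.index = df1.index.set_levels(new_level, level="Values")

print(df1.index.get_level_values("Values").tolist())
```

set_levels() returns a new MultiIndex rather than modifying the existing one, which is why the result is assigned back to df1.index.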