multi-index | 易学教程

Converting a pandas MultiIndex DataFrame from rows-wise to column-wise

I'm working in zipline and pandas and have converted a pandas.Panel to a pandas.DataFrame using the to_frame() method. This is the resulting pandas.DataFrame, which as you can see is multi-indexed:

                                      price
    major                     minor
    2008-01-03 00:00:00+00:00 SPY    129.93
                              KO      26.38
                              PEP     64.78
    2008-01-04 00:00:00+00:00 SPY    126.74
                              KO      26.43
                              PEP     64.59
    2008-01-07 00:00:00+00:00 SPY    126.63
                              KO      27.05
                              PEP     66.10
    2008-01-08 00:00:00+00:00 SPY    124.59
                              KO      27.16
                              PEP     66.63

I need to convert this frame to look like this:

                                  SPY     KO    PEP
    2008-01-03 00:00:00+00:00  129.93  26.38  64.78
    2008-01-04 00:00:00+00:00  126.74  26.43  64.59
    2008-01-07
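One way this reshaping is commonly done is with unstack() on the inner index level. The following is only a minimal sketch: it rebuilds the first two dates of the frame above by hand, and the variable names long_df and wide_df are made up for illustration.

```python
import pandas as pd

# Rebuild a small piece of the multi-indexed frame from the question.
dates = pd.to_datetime(["2008-01-03", "2008-01-04"], utc=True)
index = pd.MultiIndex.from_product(
    [dates, ["SPY", "KO", "PEP"]], names=["major", "minor"]
)
long_df = pd.DataFrame(
    {"price": [129.93, 26.38, 64.78, 126.74, 26.43, 64.59]}, index=index
)

# Move the inner level ("minor") from the rows into the columns.
wide_df = long_df["price"].unstack("minor")

# unstack() sorts the new column labels, so restore the original order.
wide_df = wide_df[["SPY", "KO", "PEP"]]
print(wide_df)
```

Selecting the "price" column first keeps the result's columns flat; calling unstack() on the whole frame would instead give two-level columns with "price" on top.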

In Pandas How to sort one level of a multi-index based on the values of a column, while maintaining the grouping of the other level

I'm taking a Data Mining course at university right now, but I'm a wee bit stuck on a multi-index sorting problem. The actual data involves about 1 million reviews of movies, and I'm trying to analyze that based on American zip codes, but to test out how to do what I want, I've been using a much smaller data set of 250 randomly generated ratings for 10 movies, and instead of zip codes, I'm using age groups.

So this is what I have right now: a multi-indexed DataFrame in Pandas with two levels, 'group' and 'title':

                        rating
    group   title
            Alien     4.000000
            Argo      2.166667
    Adults  Ben-Hur   3.666667
            Gandhi
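A possible approach, sketched under assumptions: move the index levels into columns, sort by group first and rating second, then restore the index. The ratings below are invented for illustration and are not the asker's data; the "Kids" group is also hypothetical.

```python
import pandas as pd

# Made-up ratings for two age groups (illustrative values only).
ratings = pd.DataFrame(
    {"rating": [4.0, 2.5, 3.5, 1.0, 5.0, 3.0]},
    index=pd.MultiIndex.from_tuples(
        [
            ("Adults", "Alien"), ("Adults", "Argo"), ("Adults", "Ben-Hur"),
            ("Kids", "Alien"), ("Kids", "Argo"), ("Kids", "Ben-Hur"),
        ],
        names=["group", "title"],
    ),
)

# Sort by group (ascending) and, within each group, by rating (descending).
sorted_ratings = (
    ratings.reset_index()
    .sort_values(["group", "rating"], ascending=[True, False])
    .set_index(["group", "title"])
)
print(sorted_ratings)
```

Because "group" is the first sort key, rows of the same group stay together while the titles inside each group are reordered by rating.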

Is there an equivalent of boost::multi_index for Java someplace?

I stumbled upon multi_index on a lark last night while pounding my head against a collection that I need to access by 3 different key values, and also to have rebalancing array semantics. Well, I got one of my two wishes (3 different key values) in boost::multi_index. Does anything similar exist in the Java world?

I have just finished MultiIndexContainer in Java: http://code.google.com/p/multiindexcontainer/wiki/MainPage . I know that it is not a complete equivalent of boost multi_index_container, but maybe it could be sufficient for your requirement.

npgall Resurrecting an old question, but

Assign new values to slice from MultiIndex DataFrame

I would like to modify some values from a column in my DataFrame. At the moment I have a view selected via the multi-index of my original df (and modifying it does change df). Here's an example:

    In [1]: arrays = [np.array(['bar', 'bar', 'baz', 'qux', 'qux', 'bar']),
                      np.array(['one', 'two', 'one', 'one', 'two', 'one']),
                      np.arange(0, 6, 1)]

    In [2]: df = pd.DataFrame(randn(6, 3), index=arrays, columns=['A', 'B', 'C'])

    In [3]: df
                      A         B         C
    bar one 0 -0.088671  1.902021 -0.540959
        two 1  0.782919 -0
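A hedged sketch of one way to assign into such a slice: select rows and column together in a single .loc call, using pd.IndexSlice for the multi-index part. The frame is rebuilt with a seeded random generator (an assumption, the original used randn), and the value 0.0 is arbitrary.

```python
import numpy as np
import pandas as pd

arrays = [
    np.array(["bar", "bar", "baz", "qux", "qux", "bar"]),
    np.array(["one", "two", "one", "one", "two", "one"]),
    np.arange(0, 6, 1),
]
rng = np.random.default_rng(0)  # seeded stand-in for randn
df = pd.DataFrame(rng.standard_normal((6, 3)), index=arrays, columns=["A", "B", "C"])

# Slicing on a MultiIndex needs a sorted index.
df = df.sort_index()

# Assign to column "A" for every row whose first two levels are ("bar", "one").
idx = pd.IndexSlice
df.loc[idx["bar", "one", :], "A"] = 0.0
print(df)
```

Doing the row selection and the column selection in one .loc call writes to df itself rather than to an intermediate object that may or may not be a view.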

Summing over a multiindex level in a pandas series

Using the Pandas package in Python, I would like to sum (marginalize) over one level in a series with a 3-level multiindex to produce a series with a 2-level multiindex. For example, if I have the following:

    import pandas as pd

    ind = [tuple(x) for x in ['ABC', 'ABc', 'AbC', 'Abc', 'aBC', 'aBc', 'abC', 'abc']]
    mi = pd.MultiIndex.from_tuples(ind)
    data = pd.Series([264, 13, 29, 8, 152, 7, 15, 1], index=mi)

    A  B  C    264
          c     13
       b  C     29
          c      8
    a  B  C    152
          c      7
       b  C     15
          c      1

I would like to sum over the variable C to produce the
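One way to marginalize over the last level, shown here as a sketch on the same data, is to group by the levels that should remain and sum within each group. The name summed is made up for illustration.

```python
import pandas as pd

ind = [tuple(x) for x in ["ABC", "ABc", "AbC", "Abc", "aBC", "aBc", "abC", "abc"]]
mi = pd.MultiIndex.from_tuples(ind)
data = pd.Series([264, 13, 29, 8, 152, 7, 15, 1], index=mi)

# Keep levels 0 and 1; the third level is summed away.
summed = data.groupby(level=[0, 1]).sum()
print(summed)
```

The result is a series with a 2-level multiindex, one entry per combination of the first two levels.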

When to use multiindexing vs. xarray in pandas

The pandas pivot tables documentation seems to recommend dealing with more than two dimensions of data by using multiindexing:

    In [1]: import pandas as pd

    In [2]: import numpy as np

    In [3]: import pandas.util.testing as tm; tm.N = 3

    In [4]: def unpivot(frame):
       ...:     N, K = frame.shape
       ...:     data = {'value' : frame.values.ravel('F'),
       ...:             'variable' : np.asarray(frame.columns).repeat(N),
       ...:             'date' : np.tile(np.asarray(frame.index), K)}
       ...:     return pd.DataFrame(data, columns=['date', 'variable',
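To make the multiindexing idea concrete, here is a small self-contained sketch. It uses a hand-built frame instead of pandas.util.testing, and since the excerpt above is cut off, the final column list ['date', 'variable', 'value'] is an assumption. The long table is then keyed by a two-level index.

```python
import numpy as np
import pandas as pd

# Small wide frame standing in for the test data (values are made up).
frame = pd.DataFrame(
    {"A": [1.0, 2.0, 3.0], "B": [4.0, 5.0, 6.0]},
    index=pd.date_range("2000-01-03", periods=3, name="date"),
)

def unpivot(frame):
    """Stack the columns of a wide frame into long (date, variable, value) rows."""
    n_rows, n_cols = frame.shape
    data = {
        "value": frame.to_numpy().ravel("F"),  # column-major flattening
        "variable": np.asarray(frame.columns).repeat(n_rows),
        "date": np.tile(np.asarray(frame.index), n_cols),
    }
    # Column order is assumed; the quoted snippet is truncated at this point.
    return pd.DataFrame(data, columns=["date", "variable", "value"])

long_df = unpivot(frame)

# Two dimensions (date, variable) carried by a MultiIndex on a 1-D series.
indexed = long_df.set_index(["date", "variable"])["value"].sort_index()
print(indexed)
```

Each additional dimension would become one more index level, which is the pattern the question contrasts with a dedicated N-dimensional container such as xarray.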

Creating an empty MultiIndex

I would like to create an empty DataFrame with a MultiIndex before assigning rows to it. I already found that empty DataFrames don't like to be assigned MultiIndexes on the fly, so I'm setting the MultiIndex names during creation. However, I don't want to assign levels, as this will be done later. This is the best code I got to so far:

    from pandas import MultiIndex

    def empty_multiindex(names):
        """
        Creates empty MultiIndex from a list of level names.
        """
        return MultiIndex.from_tuples(tuples=[(None,) * len(names)], names=names)

Which gives me

    In [2]: empty_multiindex(['one','two', 'three'])
    Out[2]: MultiIndex(levels=[[], []
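An alternative sketch, assuming a pandas version recent enough that the MultiIndex constructor takes a codes argument: pass empty levels and empty codes directly, so the index has the requested names but no entries at all. The "value" column is hypothetical.

```python
import pandas as pd

def empty_multiindex(names):
    """Create a zero-length MultiIndex with the given level names."""
    return pd.MultiIndex(
        levels=[[] for _ in names],  # no level values yet
        codes=[[] for _ in names],   # no rows yet
        names=names,
    )

mi = empty_multiindex(["one", "two", "three"])
df = pd.DataFrame(index=mi, columns=["value"])
print(mi)
print(df)
```

Unlike the from_tuples version, this index contains no placeholder row of None values.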

add a field in pandas dataframe with MultiIndex columns

I have looked for an answer to this question, as it seems pretty simple, but have not been able to find anything yet. Apologies if I missed something. I have pandas version 0.10.0 and I have been experimenting with data of the following form:

    import pandas
    import numpy as np
    import datetime

    start_date = datetime.datetime(2009,3,1,6,29,59)
    r = pandas.date_range(start_date, periods=12)
    cols_1 = ['AAPL', 'AAPL', 'GOOG', 'GOOG', 'GS', 'GS']
    cols_2 = ['close', 'rate', 'close', 'rate', 'close', 'rate']
    dat = np.random.randn(12, 6)
    cols = pandas.MultiIndex.from_arrays([cols_1, cols_2], names=['ticker'
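A sketch of one way to add a field under every ticker, written for a current pandas rather than 0.10.0. Several things here are assumptions: the second level name "field" (the excerpt is cut off before it), the new field name "product", and the idea that the new field is close times rate.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("2009-03-01", periods=4)
cols = pd.MultiIndex.from_product(
    [["AAPL", "GOOG"], ["close", "rate"]],
    names=["ticker", "field"],  # "field" is an assumed level name
)
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((4, 4)), index=dates, columns=cols)

# Add one new column per ticker, addressed by a (ticker, field) tuple.
for ticker in df.columns.get_level_values("ticker").unique():
    df[(ticker, "product")] = df[(ticker, "close")] * df[(ticker, "rate")]

# New columns are appended at the end; sorting regroups them by ticker.
df = df.sort_index(axis=1)
print(df)
```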

pandas: convert index type in multiindex dataframe

Hi, I have a multiindex dataframe:

    import pandas as pd

    tuples = [('YTA_Q3', 1), ('YTA_Q3', 2), ('YTA_Q3', 3), ('YTA_Q3', 4), ('YTA_Q3', 99), ('YTA_Q3', 96)]
    # Index
    index = pd.MultiIndex.from_tuples(tuples, names=['Questions', 'Values'])
    # Columns
    columns = pd.MultiIndex.from_tuples([('YTA_Q3', '@')], names=['Questions', 'Values'])
    # Data
    data = [29.014949, 5.0260590000000001, 6.6269119999999999, 1.3565260000000001, 41.632221999999999, 21.279499999999999]
    df1 = pd.DataFrame(data=data, index=index, columns=columns)

How do I convert the inner values of the df's index to str? My attempt:

    df1.index.astype(str)

returns a
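One possible way to change the type of a single level, sketched on a shortened version of the frame above, is to replace that level's values with set_levels(). The single "value" column is a simplification of the original multi-indexed columns.

```python
import pandas as pd

tuples = [("YTA_Q3", 1), ("YTA_Q3", 2), ("YTA_Q3", 99)]
index = pd.MultiIndex.from_tuples(tuples, names=["Questions", "Values"])
df1 = pd.DataFrame({"value": [29.014949, 5.026059, 41.632222]}, index=index)

# Convert only the inner level's labels to strings and put them back.
inner_as_str = df1.index.levels[1].astype(str)
df1.index = df1.index.set_levels(inner_as_str, level="Values")

print(df1.index.get_level_values("Values"))
```

Only the level's label values change; the mapping from rows to labels is untouched, so the data stays aligned.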