multi-index | 易学教程

Setting values with multiindex in pandas

There are already a couple of questions on SO relating to this, most notably this one; however, none of the answers seem to work for me, and quite a few links to docs (especially on lexsorting) are broken, so I'll ask another one. I'm trying to do something (seemingly) very simple. Consider the following MultiIndexed DataFrame:

    import pandas as pd
    import random
    arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
              ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
    tuples = list(zip(*arrays))
    index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
    df = pd

Reindex sublevel of pandas dataframe multiindex

I have a time series dataframe and I would like to reindex it by Trials and Measurements. Simplified, I have this:

             value
    Trial
    1     0     13
          1      3
          2      4
    2     3    NaN
          4     12
    3     5     34

Which I want to turn into this:

             value
    Trial
    1     0     13
          1      3
          2      4
    2     0    NaN
          1     12
    3     0     34

How can I best do this?

Dan Allan

Just yesterday, the illustrious Andy Hayden added this feature to version 0.13 of pandas, which will be released any day now. See here for the usage example he added to the docs. If you are comfortable installing the development version of pandas from source, you can use it now.

    df['Measurements'] = df.reset_index().groupby(
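The snippet above is cut off, so the following is only a sketch of one way to restart the inner level at 0 within each trial. It uses `GroupBy.cumcount`, which may or may not be the feature the answer refers to, and the inner level name `Measurement` is an assumption made for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame: outer level "Trial", inner level a running id.
# The inner level name "Measurement" is assumed; the excerpt leaves it unnamed.
index = pd.MultiIndex.from_tuples(
    [(1, 0), (1, 1), (1, 2), (2, 3), (2, 4), (3, 5)],
    names=["Trial", "Measurement"],
)
df = pd.DataFrame({"value": [13, 3, 4, np.nan, 12, 34]}, index=index)

# Number the rows within each trial, starting from 0.
within_trial = df.groupby(level="Trial").cumcount()

# Swap the inner level for the per-trial counter, keeping the outer level.
result = df.copy()
result.index = pd.MultiIndex.from_arrays(
    [df.index.get_level_values("Trial"), within_trial.to_numpy()],
    names=["Trial", "Measurement"],
)
print(result)
```

The values themselves are untouched; only the inner index level is rebuilt.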

Merge MultiIndex columns together into 1 level [duplicate]

This question already has an answer here: Pandas - How to flatten a hierarchical index in columns (16 answers)

Here's some data from another question:

    date      type  value
    1/1/2016  a     1
    1/1/2016  b     2
    1/1/2016  a     1
    1/1/2016  b     4
    1/2/2016  a     1
    1/2/2016  b     1

Run this line of code:

    x = df.groupby(['date', 'type']).value.agg(['sum', 'max']).unstack()

x should look like this:

              sum     max
    type        a  b    a  b
    date
    1/1/2016    2  6    1  4
    1/2/2016    1  1    1  1

I want to combine the columns on the upper and lower level to get this:

              sum_a  sum_b  max_a  max_b
    date
    1/1/2016      2      6      1      4
    1/2/2016      1      1      1      1

Is there an easy way to do this?

greg_data
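As a hedged sketch (not necessarily what the linked answers recommend), one common way to merge the two column levels is to join each column tuple with an underscore. The frame below is rebuilt from the data shown in the question.

```python
import pandas as pd

# Data copied from the question above.
df = pd.DataFrame({
    "date": ["1/1/2016", "1/1/2016", "1/1/2016", "1/1/2016", "1/2/2016", "1/2/2016"],
    "type": ["a", "b", "a", "b", "a", "b"],
    "value": [1, 2, 1, 4, 1, 1],
})

# Same aggregation as in the question; the columns become a two-level MultiIndex.
x = df.groupby(["date", "type"])["value"].agg(["sum", "max"]).unstack()

# Each column label is a tuple such as ("sum", "a"); join the parts into one string.
x.columns = ["_".join(col) for col in x.columns]
print(x)
```

The join assumes every level value is a string; numeric level values would need converting with `str` first.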

Using boolean indexing for row and column MultiIndex in Pandas

Questions are at the end, in bold. But first, let's set up some data:

    import numpy as np
    import pandas as pd
    from itertools import product

    np.random.seed(1)
    team_names = ['Yankees', 'Mets', 'Dodgers']
    jersey_numbers = [35, 71, 84]
    game_numbers = [1, 2]
    observer_names = ['Bill', 'John', 'Ralph']
    observation_types = ['Speed', 'Strength']
    row_indices = list(product(team_names, jersey_numbers, game_numbers, observer_names, observation_types))
    observation_values = np.random.randn(len(row_indices))
    tns, jns, gns, ons, ots = zip(*row_indices)
    data = pd.DataFrame({'team': tns, 'jersey': jns, 'game':

Pandas Multiindex Groupby on Columns

Question: Is there any way to use groupby on the columns in a MultiIndex? I know you can on the rows, and there is good documentation in that regard. However, I cannot seem to group by columns. The only solution I have is transposing the dataframe.

    # generate data (copied from pandas example)
    arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
              ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
    tuples = list(zip(*arrays))
    index = pd.MultiIndex.from_tuples(tuples, names=['first',
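The question's data setup is cut off, so the frame below is an assumption: the `first`/`second` MultiIndex is placed on the columns, with made-up integer values and row labels. Under that assumption, the transposing workaround the question mentions can be sketched like this:

```python
import numpy as np
import pandas as pd

arrays = [["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"],
          ["one", "two", "one", "two", "one", "two", "one", "two"]]
columns = pd.MultiIndex.from_tuples(list(zip(*arrays)), names=["first", "second"])

# Hypothetical values and row labels, chosen only to make the result easy to check.
df = pd.DataFrame(np.arange(24).reshape(3, 8), index=["A", "B", "C"], columns=columns)

# Transpose so the column levels become row levels, group on the outer level,
# then transpose back to restore the original orientation.
summed = df.T.groupby(level="first").sum().T
print(summed)
```

Transposing copies the data, so on very wide or mixed-dtype frames this may be slower than grouping on rows.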

In Pandas How to sort one level of a multi-index based on the values of a column, while maintaining the grouping of the other level

Question: I'm taking a Data Mining course at university right now, but I'm a wee bit stuck on a multi-index sorting problem. The actual data involves about 1 million movie reviews, which I'm trying to analyze by American zip code. To test out how to do what I want, I've been using a much smaller data set of 250 randomly generated ratings for 10 movies, with age groups instead of zip codes. This is what I have right now: a MultiIndexed DataFrame in pandas with two

How to properly pivot or reshape a timeseries dataframe in Pandas?

I need to reshape a dataframe that looks like df1 and turn it into df2. There are 2 considerations for this procedure:

1. I need to be able to set the number of rows to be sliced as a parameter (length).
2. I need to split date and time from the index, using date as the column names in the reshape and keeping time as the index.

Current df1:

    2007-08-07 18:00:00     1
    2007-08-08 00:00:00     2
    2007-08-08 06:00:00     3
    2007-08-08 12:00:00     4
    2007-08-08 18:00:00     5
    2007-11-02 18:00:00     6
    2007-11-03 00:00:00     7
    2007-11-03 06:00:00     8
    2007-11-03 12:00:00     9
    2007-11-03 18:00:00    10

Desired Output df2 - With the parameter

Is there an equivalent of boost::multi_index for Java someplace?

Question: I stumbled upon multi_index on a lark last night while pounding my head against a collection that I need to access by 3 different key values, and that should also have rebalancing array semantics. Well, I got one of my two wishes (3 different key values) in boost::multi_index. Does anything similar exist in the Java world?

Answer 1: I have just finished MultiIndexContainer in Java: http://code.google.com/p/multiindexcontainer/wiki/MainPage. I know that it is not a complete equivalent of boost multi_index

Convert MultiIndex DataFrame to Series

I created a MultiIndex DataFrame by:

    df.set_index(['Field1', 'Field2'], inplace=True)

If this is not a MultiIndex DataFrame, please tell me how to make one. I want to:

1. Group by the same columns that are in the index
2. Aggregate a count of each group
3. Then return the whole thing as a Series with Field1 and Field2 as the index

How do I go about doing this?

ADDITIONAL INFO

I have a MultiIndex DataFrame that looks like this:

    Continent      Sector  Count
    Asia           1       4
                   2       1
    Australia      1       1
    Europe         1       1
                   2       3
                   3       2
    North America  1       1
                   5       1
    South America  5       1

How can I return this as a Series with the index of [Continent, Sector
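A minimal sketch of both steps, using the table from the question (the rest of the question is cut off, so the exact desired output is an assumption). Selecting a single column of a MultiIndexed DataFrame already yields a Series that keeps the index, and grouping on the index levels with `size()` returns a Series of group counts.

```python
import pandas as pd

# Rebuild the frame shown in the question.
index = pd.MultiIndex.from_tuples(
    [("Asia", 1), ("Asia", 2), ("Australia", 1),
     ("Europe", 1), ("Europe", 2), ("Europe", 3),
     ("North America", 1), ("North America", 5), ("South America", 5)],
    names=["Continent", "Sector"],
)
df = pd.DataFrame({"Count": [4, 1, 1, 1, 3, 2, 1, 1, 1]}, index=index)

# Selecting the single column gives a Series indexed by (Continent, Sector).
counts = df["Count"]

# Grouping on the index levels and calling size() counts the rows per group,
# again as a Series with the same two-level index.
group_sizes = df.groupby(level=["Continent", "Sector"]).size()

print(counts)
print(group_sizes)
```

Here every (Continent, Sector) pair occurs once, so `group_sizes` is all ones; on data with repeated pairs it would give the per-group row count.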

Multi-Indexed fillna in Pandas

I have a multi-indexed dataframe and I'm looking to backfill missing values within a group. The dataframe I have currently looks like this:

    import numpy as np
    import pandas as pd

    df = pd.DataFrame({
        'group': ['group_a'] * 7 + ['group_b'] * 3 + ['group_c'] * 2,
        'Date': ["2013-06-11", "2013-07-02", "2013-07-09", "2013-07-30", "2013-08-06", "2013-09-03", "2013-10-01",
                 "2013-07-09", "2013-08-06", "2013-09-03",
                 "2013-07-09", "2013-09-03"],
        'Value': [np.nan, np.nan, np.nan, 9, 4, 40, 18, np.nan, np.nan, 5, np.nan, 2]})
    # Convert the date strings to datetime.date objects, then build the MultiIndex
    df.Date = df['Date'].apply(lambda x: pd.to_datetime(x).date())
    df = df.set_index(['group', 'Date'])

I'm trying to get a
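The desired output is cut off above, so this is only a sketch of one plausible reading: backfill `Value` within each `group` so that values never leak from one group into another. It groups on the `group` index level and applies `bfill`.

```python
import numpy as np
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    "group": ["group_a"] * 7 + ["group_b"] * 3 + ["group_c"] * 2,
    "Date": ["2013-06-11", "2013-07-02", "2013-07-09", "2013-07-30", "2013-08-06",
             "2013-09-03", "2013-10-01", "2013-07-09", "2013-08-06", "2013-09-03",
             "2013-07-09", "2013-09-03"],
    "Value": [np.nan, np.nan, np.nan, 9, 4, 40, 18, np.nan, np.nan, 5, np.nan, 2],
})
df["Date"] = pd.to_datetime(df["Date"]).dt.date
df = df.set_index(["group", "Date"])

# Backfill inside each group only; the original row order and index are kept.
filled = df.groupby(level="group").bfill()
print(filled)
```

A trailing missing value at the end of a group would stay missing, because there is nothing later in that group to fill it from.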