multi-index | 易学教程

Merge MultiIndex columns together into 1 level [duplicate]

Question: This question already has answers here: Pandas - How to flatten a hierarchical index in columns (16 answers). Closed 2 years ago.

Here's some data from another question:

    date      type  value
    1/1/2016  a     1
    1/1/2016  b     2
    1/1/2016  a     1
    1/1/2016  b     4
    1/2/2016  a     1
    1/2/2016  b     1

Run this line of code:

    x = df.groupby(['date', 'type']).value.agg(['sum', 'max']).unstack()

x should look like this:

              sum     max
    type      a  b    a  b
    date
    1/1/2016  2  6    1  4
    1/2/2016  1  1    1  1

I want to combine the columns on the upper and lower
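The excerpt is cut off before it states the exact result wanted, so the following is only a sketch of one common way to collapse two column levels into one. It rebuilds the sample data and joins each (upper, lower) pair with an underscore. The separator and the name `flat` are assumptions, not part of the original question.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "date": ["1/1/2016", "1/1/2016", "1/1/2016", "1/1/2016", "1/2/2016", "1/2/2016"],
    "type": ["a", "b", "a", "b", "a", "b"],
    "value": [1, 2, 1, 4, 1, 1],
})

# The line from the question: two-level columns (aggregate, type).
x = df.groupby(["date", "type"])["value"].agg(["sum", "max"]).unstack()

# Each column label is a tuple such as ("sum", "a"); join it into one string.
# The "_" separator is an assumption made for this illustration.
flat = x.copy()
flat.columns = ["_".join(pair) for pair in x.columns]

print(flat)
```

Joining the tuples keeps the column order of `x`, so the flattened frame lines up with the table shown in the question.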

Python (pandas): store a data frame in hdf5 with a multi index

Question: I need to work with a large data frame with a multi index, so I tried to create a data frame to learn how to store it in an HDF5 file. The data frame looks like this (with the multi index in the first 2 columns):

    Symbol  Date        0
    C       2014-07-21  4792
    B       2014-07-21  4492
    A       2014-07-21  5681
    B       2014-07-21  8310
    A       2014-07-21  1197
    C       2014-07-21  4722
            2014-07-21  7695
            2014-07-21  1774

I'm using pandas.to_hdf, but it creates a "Fixed format store". When I try to select the data in a group: store.select(

Get numeric index from Boost multi-index iterator

Question: I'm storing a bunch of the following

    struct Article {
        std::string title;
        unsigned db_id;     // id field in MediaWiki database dump
    };

in a Boost.MultiIndex container, defined as

    typedef boost::multi_index_container<
        Article,
        indexed_by<
            random_access<>,
            hashed_unique<tag<by_db_id>,
                          member<Article, unsigned, &Article::db_id> >,
            hashed_unique<tag<by_title>,
                          member<Article, std::string, &Article::title> >
        >
    > ArticleSet;

Now I've got two iterators, one from index<by_title> and one from index<by_id>

Creating an empty MultiIndex

Question: I would like to create an empty DataFrame with a MultiIndex before assigning rows to it. I already found that empty DataFrames don't like to be assigned MultiIndexes on the fly, so I'm setting the MultiIndex names during creation. However, I don't want to assign levels, as this will be done later. This is the best code I have so far:

    def empty_multiindex(names):
        """
        Creates empty MultiIndex from a list of level names.
        """
        return MultiIndex.from_tuples(tuples=[(None,) * len(names)], names
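As a hedged alternative to the placeholder tuple above, a MultiIndex can be built with empty levels and empty codes, so that it has names but no entries at all. This sketch reuses the function name from the question and assumes pandas 0.24 or later, where the constructor keyword is `codes`. The level names and the `value` column are illustrative.

```python
import pandas as pd


def empty_multiindex(names):
    """Create a zero-length MultiIndex that only carries level names."""
    return pd.MultiIndex(
        levels=[[] for _ in names],  # no level values yet
        codes=[[] for _ in names],   # no rows yet
        names=names,
    )


# Hypothetical level and column names, chosen for illustration only.
idx = empty_multiindex(["one", "two"])
df = pd.DataFrame(index=idx, columns=["value"])

print(idx.names, len(idx))
print(df)
```

Unlike the from_tuples variant, this index has length zero, so there is no dummy (None, None) row to remove later.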

How to get away with a multidimensional index in pandas

Question: In Pandas, what is a good way to select sets of arbitrary rows in a multiindex?

    df = pd.DataFrame(columns=['A', 'B', 'C'])
    df['A'] = ['a', 'a', 'b', 'b']
    df['B'] = [1,2,3,4]
    df['C'] = [1,2,3,4]
    the_indices_we_want = df.ix[[0,3],['A','B']]
    df = df.set_index(['A', 'B']) #Create a multiindex
    df.ix[the_indices_we_want] #ValueError: Cannot index with multidimensional key
    df.ix[[tuple(x) for x in the_indices_we_want.values]]

This last line is an answer, but it feels like a clunky answer; they can't even
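Note that `.ix` was removed in pandas 1.0, so the snippet above no longer runs on current versions. A minimal sketch of the same selection with `.loc` follows, assuming the goal is to pick the rows whose (A, B) pairs appear in the small frame. The names `wanted`, `indexed`, `keys` and `selected` are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["a", "a", "b", "b"],
    "B": [1, 2, 3, 4],
    "C": [1, 2, 3, 4],
})

# The (A, B) pairs to look up, taken from rows 0 and 3.
wanted = df.loc[[0, 3], ["A", "B"]]

# Move A and B into a two-level row index.
indexed = df.set_index(["A", "B"])

# A list of plain tuples is a valid key for .loc on a MultiIndex.
keys = list(wanted.itertuples(index=False, name=None))
selected = indexed.loc[keys]

print(selected)
```

This is still a conversion to tuples, as in the question's last line, but `itertuples` avoids going through the untyped `.values` array.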

Append a level to a pandas MultiIndex

Question: Say I have a pandas dataframe with three indices 'a', 'b' and 'c'. How can I add a fourth index from an array and set its name to 'd' at the same time? This works:

    df.set_index(fourth_index, append=True, inplace=True)
    df.index.set_names(['a','b','c','d'], inplace=True)

But I'm looking for something that doesn't require me to also name the first three indices again, e.g. (this doesn't work):

    df.set_index({'d': fourth_index}, append=True, inplace=True)

Am I missing some function here?

Answer 1: Add
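The answer text is truncated here, so the following is not necessarily what it proposed. One way to name the new level in the same call, sketched under the assumption that `fourth_index` is a plain list, is to wrap it in a named `pd.Index` before passing it to `set_index`. The sample frame and its values are made up for the example.

```python
import pandas as pd

# Hypothetical frame with three index levels 'a', 'b' and 'c'.
df = pd.DataFrame({
    "a": [1, 1, 2],
    "b": [10, 20, 10],
    "c": ["x", "y", "x"],
    "value": [0.1, 0.2, 0.3],
}).set_index(["a", "b", "c"])

fourth_index = ["p", "q", "r"]

# A named Index carries its name into the appended level,
# so the existing level names do not have to be repeated.
result = df.set_index(pd.Index(fourth_index, name="d"), append=True)

print(result.index.names)
```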

Filling missing time values in a multi-indexed dataframe

Question: Problem and what I want

I have a data file that comprises time series read asynchronously from multiple sensors. Basically, for every data element in my file, I have a sensor ID and the time at which it was read, but I do not always have all sensors for every time, and read times may not be evenly spaced. Something like:

    ID,time,data
    0,0,1
    1,0,2
    2,0,3
    0,1,4
    2,1,5 # skip some sensors for some time steps
    0,2,6
    2,2,7
    2,3,8
    1,5,9 # skip some time steps
    2,5,10

Important note: the actual time column is
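The question is cut off before it says how the gaps should be filled, so this sketch only shows one plausible first step: reindexing onto every (ID, time) combination of the observed values, which leaves NaN where a sensor has no reading. Treating `time` as a plain integer and the names `full` and `filled` are assumptions.

```python
import io

import pandas as pd

# Sample data from the question, without the inline comments.
csv_text = """ID,time,data
0,0,1
1,0,2
2,0,3
0,1,4
2,1,5
0,2,6
2,2,7
2,3,8
1,5,9
2,5,10
"""

df = pd.read_csv(io.StringIO(csv_text)).set_index(["ID", "time"])

# Cartesian product of every observed sensor ID and every observed time.
full = pd.MultiIndex.from_product(df.index.levels, names=df.index.names)

# Missing (ID, time) pairs appear as NaN rows.
filled = df.reindex(full)

print(filled)
```

From here the NaN rows could be filled with `ffill`, interpolation, or a constant, depending on what the truncated part of the question requires. Time steps that no sensor reported (such as 4) would need an explicit range in place of the observed level values.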

Dropping a single (sub-) column from a MultiIndex

Question: I have the following dataframe df

         col1    col2    col3
         a   b   a   b   a   b
    1    ...
    2
    3

and just cannot figure out how to drop only a single 'sublevel', e.g. df.col1.a. I can df.col1.drop('a', axis=1), but reassigning it like df.col1=df.col1.drop('a', axis=1) fails. I understand the logical structure of df.columns, but how should I be modifying it?

Answer 1: Drop is a very flexible method, and there are quite a few ways to use it:

    In [11]: mi = pd.MultiIndex.from_product([['col1', 'col2', 'col3'], ['a', 'b']])

    In [12]
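The answer is truncated after its first line, so the variants it went on to list are not reproduced here. As a minimal sketch of one of the ways `drop` can be used, a single sub-column can be removed by passing its full (upper, lower) label as a tuple. The cell values below are placeholders.

```python
import pandas as pd

mi = pd.MultiIndex.from_product([["col1", "col2", "col3"], ["a", "b"]])

# Placeholder values; only the column structure matters here.
df = pd.DataFrame([[1, 2, 3, 4, 5, 6]] * 3, index=[1, 2, 3], columns=mi)

# Drop only the ('col1', 'a') sub-column; ('col1', 'b') is kept.
result = df.drop(columns=[("col1", "a")])

print(result.columns.tolist())
```

`drop` returns a new frame, so the result has to be assigned back (for example `df = df.drop(...)`) rather than assigned to `df.col1`.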

High-dimensional data structure in Python

Question: What is the best way to store and analyze high-dimensional data in Python? I like Pandas DataFrame and Panel, where I can easily manipulate the axes. Now I have a hyper-cube (dim >= 4) of data. I have been thinking of things like a dict of Panels, or tuples as panel entries. I wonder if there is a high-dimensional panel structure in Python.

Update 20/05/16: Thanks very much for all the answers. I have tried MultiIndex and xArray; however, I am not able to comment on any of them. In my problem I will try to use
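For the MultiIndex option mentioned in the update, a rough sketch is to flatten the hyper-cube into a Series whose index has one level per dimension. The 2x3x4x5 shape and the axis names `w`, `x`, `y`, `z` are assumptions made up for this example.

```python
import numpy as np
import pandas as pd

# A hypothetical 4-dimensional cube with shape (2, 3, 4, 5).
cube = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)

# One index level per axis; from_product iterates in the same
# row-major order that ravel() uses, so labels and values line up.
index = pd.MultiIndex.from_product(
    [range(n) for n in cube.shape], names=["w", "x", "y", "z"]
)
s = pd.Series(cube.ravel(), index=index)

# Reduce over all axes except 'w'.
per_w = s.groupby(level="w").sum()

# Take the slice y == 1, which drops that level from the index.
y_slice = s.xs(1, level="y")

print(per_w)
print(y_slice.shape)
```

Each axis stays addressable by name, which covers much of what Panel offered, at the cost of storing the data in long form.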

Setting values with multiindex in pandas

Question: There are already a couple of questions on SO relating to this, most notably this one; however, none of the answers seem to work for me and quite a few links to docs (especially on lexsorting) are broken, so I'll ask another one. I'm trying to do something (seemingly) very simple. Consider the following MultiIndexed DataFrame:

    import pandas as pd
    import random
    arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
              ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
    tuples =
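The excerpt stops before the actual assignment the author attempted, so the following is only a sketch of how values are usually set on a MultiIndexed frame with `.loc`. The level names `first` and `second`, the columns `A` and `B`, and the values are all assumptions for the example.

```python
import pandas as pd

arrays = [
    ["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"],
    ["one", "two", "one", "two", "one", "two", "one", "two"],
]
index = pd.MultiIndex.from_arrays(arrays, names=["first", "second"])

# Hypothetical columns; sort_index() makes the index lexsorted,
# which slicing on inner levels relies on.
df = pd.DataFrame(
    {"A": list(range(8)), "B": list(range(8))}, index=index
).sort_index()

# Set one cell: row ('bar', 'two'), column 'A'.
df.loc[("bar", "two"), "A"] = 100

# Set column 'B' for every row whose second level is 'one'.
df.loc[(slice(None), "one"), "B"] = -1

print(df)
```

Passing the row key and the column label together in a single `.loc` call writes to the frame itself, whereas chained indexing such as `df["A"][key] = ...` may write to a copy.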