multi-index

Selecting rows in a MultiIndex dataframe by index without losing any levels

独自空忆成欢 posted on 2019-11-28 04:41:19
Question: I would like to select the row called 'Mid' without losing its index level 'Site'. The following code shows the dataframe:

    m.commodity

                              price  max  maxperstep
    Site  Commodity Type
    Mid   Biomass   Stock       6.0  inf         inf
          CO2       Env         0.0  inf         inf
          Coal      Stock       7.0  inf         inf
          Elec      Demand      NaN  NaN         NaN
          Gas       Stock      27.0  inf         inf
          Hydro     SupIm       NaN  NaN         NaN
          Lignite   Stock       4.0  inf         inf
          Slack     Stock     999.0  inf         inf
          Solar     SupIm       NaN  NaN         NaN
          Wind      SupIm       NaN  NaN         NaN
    North Biomass   Stock       6.0  inf         inf
          CO2       Env         0.0  inf         inf
          Coal      Stock       7.0  inf         inf
          Elec      Demand
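
One possible approach, sketched here on a cut-down stand-in for the frame above (the variable name `commodity` and the four rows are assumptions for illustration, not the asker's data): passing a list to `.loc` instead of a scalar keeps every index level.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the frame in the question.
index = pd.MultiIndex.from_tuples(
    [
        ("Mid", "Biomass", "Stock"),
        ("Mid", "CO2", "Env"),
        ("North", "Biomass", "Stock"),
        ("North", "CO2", "Env"),
    ],
    names=["Site", "Commodity", "Type"],
)
commodity = pd.DataFrame(
    {"price": [6.0, 0.0, 6.0, 0.0], "max": [np.inf] * 4, "maxperstep": [np.inf] * 4},
    index=index,
)

# A scalar key selects the 'Mid' rows but drops the 'Site' level.
dropped = commodity.loc["Mid"]

# A one-element list selects the same rows and keeps all three levels.
kept = commodity.loc[["Mid"]]

print(list(dropped.index.names))
print(list(kept.index.names))
```

The list form is handy when the result is later concatenated with rows from other sites, since the 'Site' label travels with the data.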

using boost multi_index_container to preserve insertion order

我与影子孤独终老i posted on 2019-11-28 04:12:08
Question: I started out using a std::multimap to store many values with the same key, but then I discovered that it doesn't preserve the insertion order among values with the same key. This answer claims it can be done with boost::multi_index::multi_index_container, but gives no example. Looking through the docs, there are no examples of that usage, and I can't make heads or tails of how you're supposed to use this thing. I have come to expect poor documentation from the lesser-used boost

Selecting rows from a Pandas dataframe with a compound (hierarchical) index

大兔子大兔子 posted on 2019-11-28 03:09:11
I suspect this is trivial, but I have yet to discover the incantation that will let me select rows from a Pandas dataframe based on the values of a hierarchical key. So, for example, imagine we have the following dataframe:

    import pandas

    df = pandas.DataFrame({'group1': ['a','a','a','b','b','b'],
                           'group2': ['c','c','d','d','d','e'],
                           'value1': [1.1,2,3,4,5,6],
                           'value2': [7.1,8,9,10,11,12]})
    df = df.set_index(['group1', 'group2'])

df looks as we would expect. If df were not indexed on group1 I could do the following:

    df[df['group1'] == 'a']

But that fails on this dataframe with an index. So
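
A minimal sketch of a few ways to select on the hierarchical key, reusing the dataframe from the question. The result variable names are illustrative only, and this is not necessarily what the truncated post went on to suggest.

```python
import pandas as pd

df = pd.DataFrame({
    "group1": ["a", "a", "a", "b", "b", "b"],
    "group2": ["c", "c", "d", "d", "d", "e"],
    "value1": [1.1, 2, 3, 4, 5, 6],
    "value2": [7.1, 8, 9, 10, 11, 12],
}).set_index(["group1", "group2"])

# Rows whose outer level (group1) equals 'a'.
outer_a = df.loc["a"]

# Rows matching both levels; the trailing ':' makes it explicit
# that the tuple addresses the row index, not a row/column pair.
pair_ac = df.loc[("a", "c"), :]

# Rows whose inner level (group2) equals 'd'; query() can refer
# to index level names as if they were columns.
inner_d = df.query("group2 == 'd'")

print(outer_a)
print(pair_ac)
print(inner_d)
```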

Pandas - write Multiindex rows with to_csv

强颜欢笑 posted on 2019-11-28 00:56:28
Question: I am using to_csv to write a MultiIndex DataFrame to CSV files. The CSV file has one column that contains the MultiIndex entries as tuples, like:

    ('a', 'x')
    ('a', 'y')
    ('a', 'z')
    ('b', 'x')
    ('b', 'y')
    ('b', 'z')

However, I want to be able to output the MultiIndex to two columns instead of one column of tuples, such as:

    a, x
     , y
     , z
    b, x
     , y
     , z

It looks like tupleize_cols can achieve this for columns, but there is no such option for the rows. Is there a way to achieve this?

Answer 1: I think this will
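
The answer above is cut off, so the following is only a sketch of one way to get separate columns, not the answerer's method. It assumes the symptom comes from a flat index whose entries are tuples. Converting that to a real MultiIndex makes `to_csv` write one column per level. The level names `outer` and `inner` and the column `v` are made up for the example, and repeated outer labels are written out rather than blanked.

```python
import pandas as pd

# A flat Index holding tuples (tupleize_cols=False stops pandas from
# turning it into a MultiIndex automatically).
flat_index = pd.Index([("a", "x"), ("a", "y"), ("b", "x")], tupleize_cols=False)
df = pd.DataFrame({"v": [1, 2, 3]}, index=flat_index)

# Rebuild the index as a two-level MultiIndex.
df.index = pd.MultiIndex.from_tuples(df.index, names=["outer", "inner"])

# With no path given, to_csv returns the CSV text as a string.
csv_text = df.to_csv()
print(csv_text)
```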

collapse a pandas MultiIndex

不打扰是莪最后的温柔 posted on 2019-11-27 23:09:02
Question: Suppose I have a DataFrame with MultiIndex columns. How can I collapse the levels to a concatenation of the values so that I only have one level?

Setup

    import numpy as np
    import pandas as pd

    np.random.seed([3, 14])
    col = pd.MultiIndex.from_product([list('ABC'), list('DE'), list('FG')])
    df = pd.DataFrame(np.random.rand(4, 12) * 10, columns=col).astype(int)
    print(df)

       A           B           C
       D     E     D     E     D     E
       F  G  F  G  F  G  F  G  F  G  F  G
    0  2  1  1  7  5  9  9  2  7  4  0  3
    1  3  7  1  1  5  3  1  4  3  5  6  0
    2  2  6  9  9  9  5  7  0  1  2  7  5
    3  2  2  8  0  3  9  4  7  0  8  2  5

I want the result to
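
The desired output is truncated above, so this sketch assumes "collapse" means joining the three labels of each column into one string with no separator. It uses deterministic data in place of the random setup.

```python
import numpy as np
import pandas as pd

col = pd.MultiIndex.from_product([list("ABC"), list("DE"), list("FG")])
df = pd.DataFrame(np.arange(48).reshape(4, 12), columns=col)

# Each column label is a tuple such as ('A', 'D', 'F');
# mapping str.join over the labels yields 'ADF', 'ADG', ...
flat = df.copy()
flat.columns = flat.columns.map("".join)

print(list(flat.columns))
```

Swapping `"".join` for `"_".join` would give labels like `A_D_F`, which are easier to split apart again later.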

How to slice one MultiIndex DataFrame with the MultiIndex of another

主宰稳场 posted on 2019-11-27 21:19:02
I have a pandas dataframe with 3 levels of a MultiIndex. I am trying to pull out rows of this dataframe according to a list of values that correspond to two of the levels. I have something like this:

    ix = pd.MultiIndex.from_product([[1, 2, 3], ['foo', 'bar'], ['baz', 'can']], names=['a', 'b', 'c'])
    data = np.arange(len(ix))
    df = pd.DataFrame(data, index=ix, columns=['hi'])
    print(df)

               hi
    a b   c
    1 foo baz   0
          can   1
      bar baz   2
          can   3
    2 foo baz   4
          can   5
      bar baz   6
          can   7
    3 foo baz   8
          can   9
      bar baz  10
          can  11

Now I want to take all rows where index levels 'b' and 'c' are in this index: ix_use = pd.MultiIndex
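
A possible way to do this, as a sketch: drop level 'a' from the row index and test membership of the remaining (b, c) pairs with `MultiIndex.isin`. The definition of `ix_use` is cut off in the question, so the two pairs used below are an assumption for illustration.

```python
import numpy as np
import pandas as pd

ix = pd.MultiIndex.from_product(
    [[1, 2, 3], ["foo", "bar"], ["baz", "can"]], names=["a", "b", "c"]
)
df = pd.DataFrame(np.arange(len(ix)), index=ix, columns=["hi"])

# Hypothetical selection index over levels 'b' and 'c'.
ix_use = pd.MultiIndex.from_tuples(
    [("foo", "can"), ("bar", "baz")], names=["b", "c"]
)

# Boolean mask: True where the row's (b, c) pair appears in ix_use.
mask = df.index.droplevel("a").isin(ix_use)
selected = df[mask]

print(selected)
```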

Pandas dataframe with multiindex column - merge levels

孤者浪人 posted on 2019-11-27 18:01:23
I have a dataframe, grouped, with MultiIndex columns as below:

    import random
    import pandas as pd

    codes = ["one", "two", "three"]
    colours = ["black", "white"]
    textures = ["soft", "hard"]
    N = 100  # length of the dataframe
    df = pd.DataFrame({
        'id': range(1, N+1),
        'weeks_elapsed': [random.choice(range(1,25)) for i in range(1,N+1)],
        'code': [random.choice(codes) for i in range(1,N+1)],
        'colour': [random.choice(colours) for i in range(1,N+1)],
        'texture': [random.choice(textures) for i in range(1,N+1)],
        'size': [random.randint(1,100) for i in range(1,N+1)],
        'scaled_size': [random.randint(100,1000) for i in
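
The grouping step is cut off above, so the aggregation below is an assumption. It only illustrates how a groupby with several aggregations produces two-level columns and how those levels might be merged into single labels. The small deterministic frame stands in for the random one.

```python
import pandas as pd

df = pd.DataFrame({
    "colour": ["black", "white", "black", "white"],
    "size": [10, 20, 30, 40],
    "scaled_size": [100, 200, 300, 400],
})

# Hypothetical aggregation; the columns become a two-level MultiIndex
# such as ('size', 'mean'), ('size', 'sum'), ('scaled_size', 'max').
grouped = df.groupby("colour").agg({"size": ["mean", "sum"], "scaled_size": ["max"]})

# Merge the two levels into one label per column.
grouped.columns = ["_".join(levels) for levels in grouped.columns.to_flat_index()]

print(grouped)
```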

How to remove levels from a multi-indexed dataframe?

≡放荡痞女 posted on 2019-11-27 17:48:59
For example, I have:

    In [1]: df = pd.DataFrame([8, 9], index=pd.MultiIndex.from_tuples([(1, 1, 1), (1, 3, 2)]), columns=['A'])

    In [2]: df
    Out[2]:
           A
    1 1 1  8
      3 2  9

Is there a better way to remove the last level from the index than this:

    In [3]: pd.DataFrame(df.values, index=df.index.droplevel(2), columns=df.columns)
    Out[3]:
         A
    1 1  8
      3  9

    df.reset_index(level=2, drop=True)
    Out[29]:
         A
    1 1  8
      3  9

Andy Hayden

You don't need to create a new DataFrame instance! You can modify the index:

    df.index = df.index.droplevel(2)
    df
         A
    1 1  8
      3  9

You can also specify negative indices, for selection from the end: df
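
As a sketch of the negative-index form mentioned above: `-1` refers to the last level. The example also uses `DataFrame.droplevel`, which, to my knowledge, is available from pandas 0.24 onward and returns a new frame instead of modifying the original.

```python
import pandas as pd

df = pd.DataFrame(
    [8, 9],
    index=pd.MultiIndex.from_tuples([(1, 1, 1), (1, 3, 2)]),
    columns=["A"],
)

# Returns a new frame without the last index level; df is untouched.
trimmed = df.droplevel(-1)

# Equivalent in-place style: replace the index on a copy.
in_place = df.copy()
in_place.index = in_place.index.droplevel(-1)

print(trimmed)
```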

selecting from multi-index pandas

痞子三分冷 posted on 2019-11-27 17:12:02
I have a multi-index data frame with columns 'A' and 'B'. Is there a way to select rows by filtering on one column of the multi-index without resetting the index to a single column index? For example:

    # has multi-index (A,B)
    df
    # can I do this? I know this doesn't work because the index is multi-index so I need to specify a tuple
    df.ix[df.A ==1]

One way is to use the get_level_values Index method:

    In [11]: df
    Out[11]:
         0
    A B
    1 4  1
    2 5  2
    3 6  3

    In [12]: df.iloc[df.index.get_level_values('A') == 1]
    Out[12]:
         0
    A B
    1 4  1

In 0.13 you'll be able to use xs with drop_level argument:

    df.xs(1, level='A
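
The `xs` call above is cut off. The sketch below rebuilds the small frame from the answer and compares the boolean-mask route with `xs`, with and without `drop_level=False`. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {0: [1, 2, 3]},
    index=pd.MultiIndex.from_arrays([[1, 2, 3], [4, 5, 6]], names=["A", "B"]),
)

# Boolean mask built from one index level; both levels are kept.
by_mask = df.loc[df.index.get_level_values("A") == 1]

# Cross-section on level 'A'; by default the matched level is dropped...
xs_dropped = df.xs(1, level="A")

# ...unless drop_level=False is passed.
xs_kept = df.xs(1, level="A", drop_level=False)

print(by_mask)
print(xs_dropped)
print(xs_kept)
```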

Selecting columns from pandas MultiIndex

久未见 posted on 2019-11-27 12:28:11
I have a DataFrame with MultiIndex columns that looks like this:

    # sample data
    col = pd.MultiIndex.from_arrays([['one', 'one', 'one', 'two', 'two', 'two'],
                                     ['a', 'b', 'c', 'a', 'b', 'c']])
    data = pd.DataFrame(np.random.randn(4, 6), columns=col)
    data

What is the proper, simple way of selecting only specific columns (e.g. ['a', 'c'], not a range) from the second level? Currently I am doing it like this:

    import itertools
    tuples = [i for i in itertools.product(['one', 'two'], ['a', 'c'])]
    new_index = pd.MultiIndex.from_tuples(tuples)
    print(new_index)
    data.reindex(columns=new_index)  # reindex_axis was removed in pandas 1.0

It doesn't
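
Two alternatives that avoid building the tuples by hand, offered as a sketch and not as the accepted answer: `pd.IndexSlice` with a list for the second level, and a boolean mask over `get_level_values(1)`. Deterministic data replaces the random sample.

```python
import numpy as np
import pandas as pd

col = pd.MultiIndex.from_arrays(
    [["one", "one", "one", "two", "two", "two"], ["a", "b", "c", "a", "b", "c"]]
)
data = pd.DataFrame(np.arange(24).reshape(4, 6), columns=col)

# Every first-level label, but only 'a' and 'c' from the second level.
by_slice = data.loc[:, pd.IndexSlice[:, ["a", "c"]]]

# Same selection with a boolean mask on the second level.
by_mask = data.loc[:, data.columns.get_level_values(1).isin(["a", "c"])]

print(by_slice)
```

The mask form does not depend on the columns being sorted, which may matter for frames assembled from several sources.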