multi-index | 易学教程

Pandas - write Multiindex rows with to_csv

I am using to_csv to write a MultiIndex DataFrame to CSV files. The CSV file has one column that contains the MultiIndex entries as tuples, like:

    ('a', 'x')
    ('a', 'y')
    ('a', 'z')
    ('b', 'x')
    ('b', 'y')
    ('b', 'z')

However, I want to be able to output the MultiIndex to two columns instead of one column of tuples, such as:

    a, x
     , y
     , z
    b, x
     , y
     , z

It looks like tupleize_cols can achieve this for columns, but there is no such option for the rows. Is there a way to achieve this?

I think this will do it:

    In [3]: df = DataFrame(dict(A = 'foo', B = 'bar', value = 1),index=range(5)).set_index(['A','B'])
    In [4
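As a rough sketch of the behaviour asked about: in current pandas, as far as I know, to_csv writes each level of a row MultiIndex to its own column by default. The level names outer and inner and the value column below are made up for illustration and do not come from the question.

```python
import pandas as pd

# Hypothetical data: a two-level row index shaped like the one in the question.
index = pd.MultiIndex.from_product(
    [["a", "b"], ["x", "y", "z"]], names=["outer", "inner"]
)
df = pd.DataFrame({"value": range(6)}, index=index)

# With no path argument, to_csv returns the CSV text as a string.
csv_text = df.to_csv()
print(csv_text)

# Explicit alternative: turn the index levels into ordinary columns first.
csv_text_flat = df.reset_index().to_csv(index=False)
```

Note that the outer label is repeated on every row here rather than left blank as in the layout the question sketches.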

how boost multi_index is implemented

Question: I have some difficulty understanding how Boost.MultiIndex is implemented. Let's say I have the following:

    typedef multi_index_container<
        employee,
        indexed_by<
            ordered_unique<member<employee, std::string, &employee::name> >,
            ordered_unique<member<employee, int, &employee::age> >
        >
    > employee_set;

I imagine that I have one array, Employee[], which actually stores the employee objects, and two maps

    map<std::string, employee*>
    map<int, employee*>

with name and age as keys. Each map has employee

collapse a pandas MultiIndex

Suppose I have a DataFrame with MultiIndex columns. How can I collapse the levels to a concatenation of the values so that I only have one level?

Setup

    np.random.seed([3, 14])
    col = pd.MultiIndex.from_product([list('ABC'), list('DE'), list('FG')])
    df = pd.DataFrame(np.random.rand(4, 12) * 10, columns=col).astype(int)
    print(df)

       A           B           C
       D     E     D     E     D     E
       F  G  F  G  F  G  F  G  F  G  F  G
    0  2  1  1  7  5  9  9  2  7  4  0  3
    1  3  7  1  1  5  3  1  4  3  5  6  0
    2  2  6  9  9  9  5  7  0  1  2  7  5
    3  2  2  8  0  3  9  4  7  0  8  2  5

I want the result to look like this:

       ADF  ADG  AEF  AEG  BDF  BDG  BEF  BEG  CDF  CDG  CEF  CEG
    0    2    1    1    7    5    9    9    2    7    4    0    3
    1    3    7    1    1    5    3    1
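One possible way to get single-level labels like ADF, sketched under the assumption that every level value is a string: map ''.join over the column tuples. The data below is a simple arange stand-in, not the random values from the question.

```python
import numpy as np
import pandas as pd

# Same three-level column layout as in the question; the values are placeholders.
columns = pd.MultiIndex.from_product([list("ABC"), list("DE"), list("FG")])
df = pd.DataFrame(np.arange(48).reshape(4, 12), columns=columns)

# Each column label is a tuple such as ('A', 'D', 'F');
# "".join glues the parts into 'ADF'.
flat = df.copy()
flat.columns = flat.columns.map("".join)
print(flat.columns.tolist())
```

If the levels held non-string values, they would need converting first (for example with str) before joining.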

Using .loc with a MultiIndex in pandas?

Question: Does anyone know if it is possible to use the DataFrame.loc method to select from a MultiIndex? I have the following DataFrame and would like to be able to access the values located in the 'Dwell' column, at the indices ('at', 1), ('at', 3), ('at', 5), and so on (non-sequential). I'd love to be able to do something like data.loc[['at',[1,3,5]], 'Dwell'], similar to the data.loc[[1,3,5], 'Dwell'] syntax for a regular index (which returns a 3-member series of Dwell values). My purpose
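A minimal sketch of the kind of selection described above, assuming a two-level row index whose first level contains 'at'. The frame itself is not shown in the excerpt, so the level names, the second key 'bt', the Flight column and all values here are invented. The idea is to pass a tuple rather than a list as the row key, with a list inside the tuple for the second level.

```python
import pandas as pd

# Hypothetical frame: first level is a character pair, second level an integer.
index = pd.MultiIndex.from_product(
    [["at", "bt"], [1, 2, 3, 4, 5]], names=["pair", "n"]
)
data = pd.DataFrame({"Dwell": range(10), "Flight": range(10, 20)}, index=index)

# A tuple addresses the levels in order: 'at' on level 0, a list of labels on level 1.
selected = data.loc[("at", [1, 3, 5]), "Dwell"]
print(selected)
```

The result is a Series that keeps both index levels, so the ('at', n) keys remain visible.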

Read multi-index on the columns from csv file

Question: I have a .csv file that looks like this:

    Male, Male, Male, Female, Female
    R, R, L, R, R
    .86, .67, .88, .78, .81

I want to read that into a df, so that I have:

         Male           Female
         R         L    R
    0  .86  .67  .88  .78  .81

I did:

    df = pd.read_csv('file.csv', header=[0,1])

But header does not cut it. It results in:

    Empty DataFrame
    Columns: [(Male, R), (Male, R), (Male, L), (Female, R), (Female, R)]
    Index: []

Yet the docs on header say: (...)Can be a list of integers that specify row locations for a multi-index
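For reference, a small sketch of how header=[0, 1] builds two-level column labels. The CSV text below is made up: it uses unique (sex, side) pairs and two data rows, so it sidesteps the situation in the question rather than explaining the empty result reported there.

```python
from io import StringIO

import pandas as pd

# Hypothetical CSV: two header rows followed by two data rows.
csv_text = """Male,Male,Female,Female
R,L,R,L
0.86,0.88,0.78,0.81
0.67,0.70,0.75,0.80
"""

# header=[0, 1] tells read_csv that the first two rows together form the column labels.
df = pd.read_csv(StringIO(csv_text), header=[0, 1])
print(df.columns.tolist())
```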

Summing over a multiindex level in a pandas series

Using the Pandas package in python, I would like to sum (marginalize) over one level in a series with a 3-level multiindex to produce a series with a 2 level multiindex. For example, if I have the following:

    ind = [tuple(x) for x in ['ABC', 'ABc', 'AbC', 'Abc', 'aBC', 'aBc', 'abC', 'abc']]
    mi = pd.MultiIndex.from_tuples(ind)
    data = pd.Series([264, 13, 29, 8, 152, 7, 15, 1], index=mi)

    A  B  C    264
          c     13
       b  C     29
          c      8
    a  B  C    152
          c      7
       b  C     15
          c      1

I would like to sum over the variable C to produce the following output:

    A  B    277
       b     37
    a  B    159
       b     16

What is the best way in Pandas to do this? If you know you
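One way this marginalization can be written, as a sketch rather than the answer the excerpt cuts off: group by the two levels to keep and sum, which collapses the third level.

```python
import pandas as pd

# Same series as in the question.
ind = [tuple(x) for x in ["ABC", "ABc", "AbC", "Abc", "aBC", "aBc", "abC", "abc"]]
mi = pd.MultiIndex.from_tuples(ind)
data = pd.Series([264, 13, 29, 8, 152, 7, 15, 1], index=mi)

# Keep levels 0 and 1 as group keys; level 2 is summed away.
summed = data.groupby(level=[0, 1]).sum()
print(summed)
```

The totals match the ones listed in the question (277, 37, 159, 16).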

Reshaping dataframes in pandas based on column labels

What is the best way to reshape the following dataframe in pandas? This DataFrame df has x,y values for each sample (s1 and s2 in this case) and looks like this:

    In [23]: df = pandas.DataFrame({"s1_x": scipy.randn(10), "s1_y": scipy.randn(10), "s2_x": scipy.randn(10), "s2_y": scipy.randn(10)})

    In [24]: df
    Out[24]:
           s1_x      s1_y      s2_x      s2_y
    0  0.913462  0.525590 -0.377640  0.700720
    1  0.723288 -0.691715  0.127153  0.180836
    2  0.181631 -1.090529 -1.392552  1.530669
    3  0.997414 -1.486094  1.207012  0.376120
    4 -0.319841  0.195289 -1.034683  0.286073
    5  1.085154 -0.619635  0.396867  0.623482
    6  1.867816 -0.928101 -0
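The desired layout is cut off in the excerpt, so the following is only a guess at a useful first step: split labels such as s1_x on the underscore into a two-level column index, then optionally stack the sample level into the rows. The numbers are placeholders, not the values shown above.

```python
import numpy as np
import pandas as pd

# Placeholder values with the same column labels as in the question.
df = pd.DataFrame(
    np.arange(8.0).reshape(2, 4), columns=["s1_x", "s1_y", "s2_x", "s2_y"]
)

# "s1_x" -> ("s1", "x"): expand=True turns the split parts into a MultiIndex.
wide = df.copy()
wide.columns = wide.columns.str.split("_", expand=True)

# Optional: move the sample level (s1, s2) from the columns into the row index.
long = wide.stack(level=0)
print(long)
```

After stacking, each row is keyed by (original row, sample) and the remaining columns are x and y.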

pandas: how to run a pivot with a multi-index?

I would like to run a pivot on a pandas DataFrame, with the index being two columns, not one. For example, one field for the year, one for the month, an 'item' field which shows 'item 1' and 'item 2', and a 'value' field with numerical values. I want the index to be year + month. The only way I managed to get this to work was to combine the two fields into one, then separate them again. Is there a better way? Minimal code copied below. Thanks a lot! PS Yes, I am aware there are other questions with the keywords 'pivot' and 'multi-index', but I did not understand if/how they can help me with
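A sketch of one approach that avoids combining the fields, assuming columns named year, month, item and value as described above (the asker's actual code is not in the excerpt, so the frame below is invented): pivot_table accepts a list of columns for its index.

```python
import pandas as pd

# Hypothetical data following the description in the question.
df = pd.DataFrame({
    "year": [2013, 2013, 2013, 2013],
    "month": [1, 1, 2, 2],
    "item": ["item 1", "item 2", "item 1", "item 2"],
    "value": [10, 20, 30, 40],
})

# A list for `index` produces a two-level (year, month) row index.
pivoted = df.pivot_table(
    index=["year", "month"], columns="item", values="value", aggfunc="sum"
)
print(pivoted)
```

aggfunc="sum" only matters if a (year, month, item) combination occurs more than once; here each occurs exactly once.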

Pandas Plotting with Multi-Index

After performing a groupby.sum() on a DataFrame I'm having some trouble trying to create my intended plot. How can I create a subplot (kind='bar') for each Code, where the x-axis is the Month and the bars are ColA and ColB?

Reustonium:

I found the unstack(level) method to work perfectly, which has the added benefit of not needing a priori knowledge about how many Codes there are.

    df.unstack(level=0).plot(kind='bar', subplots=True)

Using the following DataFrame ...

    # using pandas version 0.14.1
    from pandas import DataFrame
    import pandas as pd
    import matplotlib.pyplot as plt
    data = {'ColB': {
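To make the effect of unstack(level=0) concrete, here is a small sketch with invented numbers (the answer's own data is cut off above). It only shows the reshaping step; the plotting call from the answer is left as a comment so the snippet runs without matplotlib.

```python
import pandas as pd

# Hypothetical raw data with the column names used in the question.
raw = pd.DataFrame({
    "Code": [1, 1, 2, 2],
    "Month": [1, 2, 1, 2],
    "ColA": [10, 20, 30, 40],
    "ColB": [1, 2, 3, 4],
})
grouped = raw.groupby(["Code", "Month"]).sum()

# Level 0 (Code) moves from the rows into the columns, leaving Month as the row index.
wide = grouped.unstack(level=0)
print(wide)

# The answer then plots this frame:
# wide.plot(kind='bar', subplots=True)
```

Each resulting column is a (ColA or ColB, Code) pair, which is what gives the plot one series per combination.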

Sum columns by level in a Multi-Index DataFrame

I have a df with multi-index columns. All of my values are floats, and I want to sum the values within the first level of the multi-index. Please see below for detail.

    first        bar                 baz                 foo
    second       one       two       one       two       one
    A       0.895717  0.805244  1.206412  2.565646  1.431256
    B       0.410835  0.813850  0.132003  0.827317  0.076467
    C       1.413681  1.607920  1.024180  0.569605  0.875906

    first                  bar                  baz       foo
    A      (0.895717+0.805244)  (1.206412+2.565646)  1.431256
    B      (0.410835+0.813850)  (0.132003+0.827317)  0.076467
    C      (1.413681+1.607920)  (1.024180+0.569605)  0.875906

The values are actually added (I just didn't feel like doing all this :)). Bottom
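A sketch of one way to do this sum, using the values from the question: transpose so the column levels become row levels, group by the first level, sum, and transpose back. Going through the transpose is my choice here, made because grouping along columns directly is deprecated in recent pandas as far as I know.

```python
import pandas as pd

# The frame from the question, rebuilt by hand.
columns = pd.MultiIndex.from_tuples(
    [("bar", "one"), ("bar", "two"), ("baz", "one"), ("baz", "two"), ("foo", "one")],
    names=["first", "second"],
)
df = pd.DataFrame(
    [
        [0.895717, 0.805244, 1.206412, 2.565646, 1.431256],
        [0.410835, 0.813850, 0.132003, 0.827317, 0.076467],
        [1.413681, 1.607920, 1.024180, 0.569605, 0.875906],
    ],
    index=["A", "B", "C"],
    columns=columns,
)

# Transpose, sum within each value of the "first" level, then transpose back.
summed = df.T.groupby(level="first").sum().T
print(summed)
```

The result has one column per first-level label (bar, baz, foo), with foo unchanged because it has a single sub-column.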