multi-index | 易学教程

Replacing values in a pandas multi-index


I have a dataframe with a multi-index. I want to change the value of the second index level when certain conditions on the first level are met. I found a similar (but different) question here: Replace a value in MultiIndex (pandas). It doesn't answer my question, because it was about changing a single row, and the solution also passed the value of the first index (which didn't need changing). In my case I am dealing with multiple rows, and I haven't been able to adapt that solution. A minimal example of my data is below. Thanks!

    import pandas as pd
    import numpy as np
    consdf=pd.DataFrame()
    for
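One possible way to approach the question above is to pull the two levels out as flat indexes, rewrite the second one under a condition on the first, and rebuild the MultiIndex. This is only a sketch: the level names, the data, the condition and the replacement value 99 are all made up for illustration, since the question's own example is cut off.

```python
import pandas as pd

# Hypothetical data: level names and values are illustrative only.
idx = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("b", 1), ("b", 2)], names=["first", "second"]
)
df = pd.DataFrame({"value": [10, 20, 30, 40]}, index=idx)

first = df.index.get_level_values("first")
second = df.index.get_level_values("second")

# Rows to change: first level is "b" and second level is 2.
to_change = (first == "b") & (second == 2)

# Index.where keeps values where the condition is True and
# substitutes 99 where it is False.
new_second = second.where(~to_change, 99)

# Rebuild the index from the untouched first level and the edited second level.
df.index = pd.MultiIndex.from_arrays([first, new_second], names=df.index.names)
```

Because the index is rebuilt in one step, the same pattern handles any number of matching rows.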

Pandas: Is there a way to use something like 'droplevel' and, in the process, rename the other level using the dropped level labels as prefix/suffix?


Screenshot of the query below: Is there a way to easily drop the upper-level column index and have a single level with labels such as points_prev_amax, points_prev_amin, gf_prev_amax, gf_prev_amin and so on?

Use map to set the new column names:

    df.columns = df.columns.map('_'.join)

Or a list comprehension:

    df.columns = ['_'.join(col) for col in df.columns]

Sample:

    df = pd.DataFrame({'A':[1,2,2,1], 'B':[4,5,6,4], 'C':[7,8,9,1], 'D':[1,3,5,9]})
    print (df)
       A  B  C  D
    0  1  4  7  1
    1  2  5  8  3
    2  2  6  9  5
    3  1  4  1  9

    df = df.groupby('A').agg([max, min])
    df.columns = df.columns.map('_'.join)
    print (df)
       B_max  B_min


How to filter dates on multiindex dataframe


I'm looking for a way to filter a multiindex dataframe like the following by day of the week and/or selected dates. Let's say I need:

- a query to select only Mondays;
- another query to select all days except Monday and Friday;
- a third query to select data present in an input list of dates, like select all dates in ['2015-05-14', '2015-05-21', '2015-05-22'];
- and finally, a query combining selection based on day of week and a list of dates, like select all dates in ['2015-05-14', '2015-05-21', '2015-05-22'] and Thursdays.

What's the way to do it?

              Col1 Col2 Col3 Col4
    Date Two
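The four queries could be sketched with boolean masks built from the date level. The level names "Date" and "Two" follow the header in the excerpt; everything else (the data, the single column, the variable names) is hypothetical, and the last query reads "and" as an intersection.

```python
import pandas as pd

# Hypothetical frame: two weeks of daily dates crossed with a second level.
dates = pd.date_range("2015-05-11", "2015-05-22", freq="D")
idx = pd.MultiIndex.from_product([dates, ["x", "y"]], names=["Date", "Two"])
df = pd.DataFrame({"Col1": range(len(idx))}, index=idx)

date_level = df.index.get_level_values("Date")  # a DatetimeIndex
wanted = pd.to_datetime(["2015-05-14", "2015-05-21", "2015-05-22"])

# dayofweek counts from Monday = 0 to Sunday = 6.
mondays = df[date_level.dayofweek == 0]
not_mon_fri = df[~date_level.dayofweek.isin([0, 4])]
in_list = df[date_level.isin(wanted)]
in_list_and_thu = df[date_level.isin(wanted) & (date_level.dayofweek == 3)]
```

If the last query is meant as a union instead (listed dates plus every Thursday), replacing `&` with `|` gives that reading.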

3 dimensional numpy array to multiindex pandas dataframe


Question

I have a 3-dimensional numpy array, (z, x, y). z is a time dimension and x and y are coordinates. I want to convert this to a multi-indexed pandas.DataFrame. I want the row index to be the z dimension and each column to have values from a unique x, y coordinate (and so, each column would be multi-indexed). The simplest case (not multi-indexed):

    >>> array.shape
    (500L, 120L, 100L)
    >>> df = pd.DataFrame(array[:,0,0])
    >>> df.shape
    (500, 1)

I've been trying to pass the whole array into a
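A minimal sketch of one way to get there, assuming a small stand-in array instead of the (500, 120, 100) one: flatten the last two axes and label the resulting columns with a MultiIndex built from the x and y ranges. The level names "x" and "y" are chosen here for illustration.

```python
import numpy as np
import pandas as pd

# Small stand-in for the array in the question.
z, x, y = 4, 3, 2
array = np.arange(z * x * y).reshape(z, x, y)

# reshape flattens the last two axes in C order (y varies fastest),
# which is the same order MultiIndex.from_product generates.
columns = pd.MultiIndex.from_product([range(x), range(y)], names=["x", "y"])
df = pd.DataFrame(array.reshape(z, x * y), columns=columns)

# Each column now holds the time series of one (x, y) coordinate.
series_1_0 = df[(1, 0)]
```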


Transform Pandas DataFrame with n-level hierarchical index into n-D Numpy array


Question

Is there a good way to transform a DataFrame with an n-level index into an n-D Numpy array (a.k.a. n-tensor)?

Example

Suppose I set up a DataFrame like

    from pandas import DataFrame, MultiIndex
    index = range(2), range(3)
    value = range(2 * 3)
    frame = DataFrame(value, columns=['value'], index=MultiIndex.from_product(index)).drop((1, 0))
    print(frame)

which outputs

         value
    0 0      0
      1      1
      2      2
    1 1      4
      2      5

The index is a 2-level hierarchical index. I can extract a 2-D Numpy array from the data using

    print(frame.unstack().values)

which outputs

    [[  0.   1.   2.]
     [ nan   4.   5.]]

How does this generalize to an n
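One way this might generalize, offered as a sketch rather than a definitive answer: reindex the values against the full Cartesian product of the level values, so that missing combinations become NaN, and then reshape to one axis per level. The variable names `levels`, `full`, `shape` and `tensor` are introduced here for illustration.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_product([range(2), range(3)])
frame = pd.DataFrame({"value": range(6)}, index=index).drop((1, 0))

# Sorted unique values actually present in each index level.
levels = [
    frame.index.get_level_values(i).unique().sort_values()
    for i in range(frame.index.nlevels)
]

# Reindexing against the full product inserts NaN for absent combinations.
full = pd.MultiIndex.from_product(levels)
shape = tuple(len(level) for level in levels)
tensor = frame["value"].reindex(full).to_numpy().reshape(shape)
```

The same lines work unchanged for an index with more than two levels, since nothing in them depends on `nlevels` being 2.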

How to do Multi-Column from_tuples?


I get how to use pd.MultiIndex.from_tuples() in order to change something like

           Value
    (A,a)      1
    (B,a)      2
    (B,b)      3

into

                Value
    Caps Lower
    A    a          1
    B    a          2
    B    b          3

But how do I change column tuples in the form

           (A, a)  (A, b)  (B,a)  (B,b)
    index
    1           1       2      2      3
    2           2       3      3      2
    3           3       4      4      1

into the form

    Caps    A     B
    Lower   a  b  a  b
    index
    1       1  2  2  3
    2       2  3  3  2
    3       3  4  4  1

Many thanks.

Edit: The reason I have a tuple column header is that when I joined a DataFrame with a single-level column onto a DataFrame with a multi-level column, it turned the multi-level column into a tuple-of-strings format and left the single level as a single string.

Edit
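For the case where every column label is a tuple, the same constructor can be applied to the columns instead of the rows. The sketch below rebuilds the question's table; it assumes all labels are tuples of equal length, so the mixed case described in the edit (tuples alongside a plain string) would need those labels normalized first.

```python
import pandas as pd

# Values and row labels taken from the question's table.
df = pd.DataFrame(
    [[1, 2, 2, 3], [2, 3, 3, 2], [3, 4, 4, 1]],
    index=pd.Index([1, 2, 3], name="index"),
)

# Store the tuples in a flat (single-level) Index, as in the question.
df.columns = pd.Index(
    [("A", "a"), ("A", "b"), ("B", "a"), ("B", "b")], tupleize_cols=False
)

# Convert the flat tuple labels into a two-level column index.
df.columns = pd.MultiIndex.from_tuples(list(df.columns), names=["Caps", "Lower"])
```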

Setting DataFrame column headers to a MultiIndex


How do I convert an existing dataframe with single-level columns to have hierarchical index columns (MultiIndex)? Example dataframe:

    In [1]: import numpy as np
            import pandas as pd
            from pandas import Series, DataFrame
            df = DataFrame(np.arange(6).reshape((2,3)), index=['A','B'], columns=['one','two','three'])
            df
    Out [1]:
       one  two  three
    A    0    1      2
    B    3    4      5

I'd have thought that reindex() would work, but I get NaN's:

    In [2]: df.reindex(columns=[['odd','even','odd'],df.columns])
    Out [2]:
       odd  even   odd
       one   two three
    A  NaN   NaN   NaN
    B  NaN   NaN   NaN

Same if I use DataFrame():

    In [3]: DataFrame(df,columns=[['odd','even','odd'],df
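A sketch of one way to relabel the columns without losing data: assign a new MultiIndex to `df.columns` directly. Unlike reindex(), which looks up the new labels among the existing ones and fills with NaN when none match, assignment only replaces the labels and leaves the values where they are.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(6).reshape((2, 3)),
    index=["A", "B"],
    columns=["one", "two", "three"],
)

# Pair each existing column name with its new top-level label, in order.
df.columns = pd.MultiIndex.from_arrays([["odd", "even", "odd"], df.columns])

# Selecting a top-level label returns the columns grouped under it.
odd = df["odd"]
```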

How to get away with a multidimensional index in pandas


In Pandas, what is a good way to select sets of arbitrary rows in a multiindex?

    df = pd.DataFrame(columns=['A', 'B', 'C'])
    df['A'] = ['a', 'a', 'b', 'b']
    df['B'] = [1,2,3,4]
    df['C'] = [1,2,3,4]
    the_indices_we_want = df.ix[[0,3],['A','B']]
    df = df.set_index(['A', 'B']) #Create a multiindex
    df.ix[the_indices_we_want] #ValueError: Cannot index with multidimensional key
    df.ix[[tuple(x) for x in the_indices_we_want.values]]

This last line is an answer, but it feels clunky; the keys can't even be lists, they have to be tuples. It also involves generating a new object to do the indexing with. I'm
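Since `.ix` was removed in pandas 1.0, a present-day version of this lookup would go through `.loc`. The sketch below is one possibility: it turns the frame of wanted (A, B) pairs into a MultiIndex with `MultiIndex.from_frame` and passes that as the key, which avoids writing the tuple conversion by hand. The names `wanted`, `keys` and `selected` are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": ["a", "a", "b", "b"], "B": [1, 2, 3, 4], "C": [1, 2, 3, 4]})

# The (A, B) pairs to look up later, taken from rows 0 and 3.
wanted = df.loc[[0, 3], ["A", "B"]]

df = df.set_index(["A", "B"])  # create the MultiIndex

# A MultiIndex is a one-dimensional key, so .loc accepts it directly.
keys = pd.MultiIndex.from_frame(wanted)
selected = df.loc[keys]
```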