pandas-groupby

Apply rolling function on pandas dataframe with multiple arguments

Submitted by 假如想象 on 2021-02-19 16:35:53

Question: I am trying to apply a rolling function, with a 3-year window, on a pandas DataFrame.

import pandas as pd

# Dummy data
df = pd.DataFrame({'Product': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'Year': [2015, 2016, 2017, 2018, 2015, 2016, 2017, 2018],
                   'IB': [2, 5, 8, 10, 7, 5, 10, 14],
                   'OB': [5, 8, 10, 12, 5, 10, 14, 20],
                   'Delta': [2, 2, 1, 3, -1, 3, 2, 4]})

# The function to be applied
def get_ln_rate(ib, ob, delta):
    n_years = len(ib)
    return sum(delta)*np.log(ob[-1]/ib[0]) / (n_years * (ob[-1]
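The excerpt cuts off before get_ln_rate is complete, so the denominator below is an assumption; the technique it sketches is general, though: rolling.apply only hands the function a single column, so one workaround is to slice each multi-column window out of the group manually. The helper name rolling_apply_multi and the (ob - ib) denominator are illustrative, not from the original post.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Product': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'Year': [2015, 2016, 2017, 2018, 2015, 2016, 2017, 2018],
                   'IB': [2, 5, 8, 10, 7, 5, 10, 14],
                   'OB': [5, 8, 10, 12, 5, 10, 14, 20],
                   'Delta': [2, 2, 1, 3, -1, 3, 2, 4]})

def get_ln_rate(window):
    # window is a slice of consecutive rows; the original formula is
    # truncated in the excerpt, so the (ob - ib) denominator is a guess
    n_years = len(window)
    ib0 = window['IB'].iloc[0]
    ob_last = window['OB'].iloc[-1]
    return window['Delta'].sum() * np.log(ob_last / ib0) / (n_years * (ob_last - ib0))

def rolling_apply_multi(group, window=3):
    # build the rolling result by slicing each window of rows explicitly,
    # sidestepping rolling.apply's one-column limitation
    out = [np.nan] * len(group)
    for i in range(window - 1, len(group)):
        out[i] = get_ln_rate(group.iloc[i - window + 1:i + 1])
    return pd.Series(out, index=group.index)

df['ln_rate'] = pd.concat(rolling_apply_multi(g) for _, g in df.groupby('Product'))
```

The explicit loop is slower than a vectorized rolling call, but it keeps all columns of the window available to the function.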

Forward fill column with an index-based limit

Submitted by 倾然丶 夕夏残阳落幕 on 2021-02-19 02:55:07

Question: I want to forward fill a column and I want to specify a limit, but I want the limit to be based on the index, not a simple number of rows as limit allows. For example, say I have the DataFrame given by:

df = pd.DataFrame({
    'data': [0.0, 1.0, np.nan, 3.0, np.nan, 5.0, np.nan, np.nan, np.nan, np.nan],
    'group': [0, 0, 0, 1, 1, 0, 0, 0, 1, 1]
})

which looks like

In [27]: df
Out[27]:
   data  group
0   0.0      0
1   1.0      0
2   NaN      0
3   3.0      1
4   NaN      1
5   5.0      0
6   NaN      0
7   NaN      0
8   NaN      1
9   NaN      1

If I group by the
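The excerpt cuts off before the filling rule is fully stated, so the sketch below assumes one plausible reading: a forward fill should not cross a boundary between consecutive runs of the 'group' column. Labeling each run with a cumulative sum and grouping by that label achieves it.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'data': [0.0, 1.0, np.nan, 3.0, np.nan, 5.0, np.nan, np.nan, np.nan, np.nan],
    'group': [0, 0, 0, 1, 1, 0, 0, 0, 1, 1]
})

# Label each consecutive run of identical 'group' values (0,0,0 -> run 1,
# 1,1 -> run 2, ...), then forward fill within each run so a fill never
# crosses a run boundary.
runs = df['group'].ne(df['group'].shift()).cumsum()
df['filled'] = df.groupby(runs)['data'].ffill()
```

Rows 8-9 stay NaN because their run (the final pair of group 1) contains no earlier non-NaN value to propagate.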

Pandas: how to get a particular group after groupby? [duplicate]

Submitted by 与世无争的帅哥 on 2021-02-18 12:14:45

Question: This question already has answers here: How to access pandas groupby dataframe by key (5 answers). Closed 6 years ago. I want to group a dataframe by a column, called 'A', and inspect a particular group. grouped = df.groupby('A', sort=False) However, I don't know how to access a group. For example, I expected that grouped.first() would give me the first group, or that grouped['foo'] would give me the group where A=='foo'. However, pandas doesn't work like that. I couldn't find a similar example
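The accessor the question is looking for is get_group, which returns the sub-DataFrame for a single key; the data below is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar'], 'B': [1, 2, 3, 4]})
grouped = df.groupby('A', sort=False)

# get_group returns the sub-DataFrame for one key
foo = grouped.get_group('foo')

# the available keys (in order of appearance, since sort=False)
keys = list(grouped.groups)
```

Note that grouped.first() does something different: it returns the first row of every group, not the first group.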

How to keep original index of a DataFrame after groupby 2 columns?

Submitted by 只谈情不闲聊 on 2021-02-18 04:54:44

Question: Is there any way I can retain the original index of my large DataFrame after I perform a groupby? The reason I need to do this is that I need to do an inner merge back to my original df (after my groupby) to regain those lost columns, and the index value is the only 'unique' column to perform the merge back on. Does anyone know how I can achieve this? My DataFrame is quite large. My groupby looks like this: df.groupby(['col1', 'col2']).agg({'col3': 'count'}).reset_index() This drops my
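One common way to avoid the merge entirely is transform, which broadcasts the aggregate back onto the original rows; a minimal sketch with invented data:

```python
import pandas as pd

df = pd.DataFrame({'col1': ['a', 'a', 'b'],
                   'col2': ['x', 'x', 'y'],
                   'col3': [7, 8, 9]},
                  index=[10, 11, 12])

# transform broadcasts the per-group aggregate back to the original
# rows, so the original index and all other columns survive intact
df['col3_count'] = df.groupby(['col1', 'col2'])['col3'].transform('count')
```

Unlike agg, which produces one row per group, transform returns a result aligned with the input, so no merge back is needed.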

How to filter dataframe by splitting categories of a columns into sets?

Submitted by 夙愿已清 on 2021-02-17 02:06:26

Question: I have a dataframe:

Prop_ID  Unit_ID  Prop_Usage                Unit_Usage
1        1        RESIDENTIAL               RESIDENTIAL
1        2        RESIDENTIAL               COMMERCIAL
1        3        RESIDENTIAL               INDUSTRIAL
1        4        RESIDENTIAL               RESIDENTIAL
2        1        COMMERCIAL                RESIDENTIAL
2        2        COMMERCIAL                COMMERCIAL
2        3        COMMERCIAL                COMMERCIAL
3        1        INDUSTRIAL                INDUSTRIAL
3        2        INDUSTRIAL                COMMERCIAL
4        1        RESIDENTIAL - COMMERCIAL  RESIDENTIAL
4        2        RESIDENTIAL - COMMERCIAL  COMMERCIAL
4        3        RESIDENTIAL - COMMERCIAL  INDUSTRIAL
5        1        COMMERCIAL / RESIDENTIAL  RESIDENTIAL
5        2        COMMERCIAL / RESIDENTIAL
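The excerpt cuts off before the desired output is shown, so the sketch below assumes the goal is to keep rows where Unit_Usage is one of the categories obtained by splitting Prop_Usage on '-' or '/'. The sample rows are a subset of the table above.

```python
import re
import pandas as pd

df = pd.DataFrame({
    'Prop_ID': [1, 4, 5],
    'Unit_ID': [2, 1, 1],
    'Prop_Usage': ['RESIDENTIAL', 'RESIDENTIAL - COMMERCIAL', 'COMMERCIAL / RESIDENTIAL'],
    'Unit_Usage': ['COMMERCIAL', 'RESIDENTIAL', 'RESIDENTIAL'],
})

# Split each Prop_Usage on '-' or '/' into a set of categories,
# then keep the rows whose Unit_Usage is a member of that set.
prop_sets = df['Prop_Usage'].apply(lambda s: {p.strip() for p in re.split(r'[-/]', s)})
mask = [unit in cats for unit, cats in zip(df['Unit_Usage'], prop_sets)]
matched = df[mask]
```

The first sample row is dropped because COMMERCIAL is not in the set {'RESIDENTIAL'}; the other two match.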

Pandas groupby: how to select adjacent column data after selecting a row based on data in another column in pandas groupby groups?

Submitted by 三世轮回 on 2021-02-16 20:22:17

Question: I have a database as partially shown below. For each date, there are entries for duration (1-20 per date), with items (100s) listed for each duration. Each item has several associated data points in adjacent columns, including an identifier. For each date, I want to select the largest duration. Then, I want to find the item with a value closest to a given input value. I would then like to obtain the ID of that item so I can follow its value through its time in the database.
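The database itself is not shown in the excerpt, so the column names (date, duration, value, item_id) and data below are hypothetical; the sketch follows the steps as described: per date, restrict to the maximum duration, then pick the ID of the item whose value is nearest the target.

```python
import pandas as pd

# Hypothetical data; the real column names are not shown in the excerpt
df = pd.DataFrame({
    'date': ['2021-01-04'] * 4 + ['2021-01-05'] * 4,
    'duration': [1, 1, 2, 2, 3, 3, 5, 5],
    'value': [10.0, 20.0, 11.0, 19.0, 9.0, 21.0, 12.0, 18.0],
    'item_id': ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'],
})

target = 14.0

def closest_item_id(group, target):
    # keep only the rows at this date's largest duration, then return
    # the ID of the row whose value is nearest the target
    top = group[group['duration'] == group['duration'].max()]
    return top.loc[(top['value'] - target).abs().idxmin(), 'item_id']

ids = df.groupby('date').apply(closest_item_id, target=target)
```

Once the per-date IDs are known, the item's history is just df[df['item_id'].isin(ids)].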

How to rank rows by id in Pandas Python

Submitted by 隐身守侯 on 2021-02-16 20:13:25

Question: I have a DataFrame like this:

id  points1  points2
1   44       53
1   76       34
1   63       66
2   23       34
2   44       56

I want output like this:

id  points1  points2  points1_rank  points2_rank
1   44       53       3             2
1   76       34       1             3
1   63       66       2             1
2   23       79       2             1
2   44       56       1             2

Basically, I want to groupby('id') and find the rank of each column within the same id. I tried this:

features = ["points1","points2"]
df = pd.merge(df, df.groupby('id')[features].rank().reset_index(), suffixes=["", "_rank"], how='left', on=['id'])

But I get KeyError: 'id'.

Answer 1: You
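The KeyError arises because rank() returns only the ranked feature columns, so after reset_index there is no 'id' column to merge on. Since rank() preserves the original row index, one fix is to join by index instead (ascending=False matches the desired output, where the largest score gets rank 1):

```python
import pandas as pd

df = pd.DataFrame({'id': [1, 1, 1, 2, 2],
                   'points1': [44, 76, 63, 23, 44],
                   'points2': [53, 34, 66, 34, 56]})

features = ['points1', 'points2']
# rank() keeps the original row index, so the ranks can be joined
# back by index; no 'id' merge key is needed
ranks = df.groupby('id')[features].rank(ascending=False).astype(int)
df = df.join(ranks.add_suffix('_rank'))
```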

Transform pandas groupby result with subtotals to relative values

Submitted by 懵懂的女人 on 2021-02-16 13:54:11

Question: I have come across a nice solution for inserting subtotals into a pandas groupby DataFrame. However, now I would like to modify the result to show relative values with respect to the subtotals, instead of the absolute values. This is the code to show the groupby:

import pandas as pd
import numpy as np

df = pd.DataFrame(
    {
        "Category": np.random.choice(["Group A", "Group B"], 50),
        "Product": np.random.choice(["Product 1", "Product 2"], 50),
        "Units_Sold": np.random.randint(1, 100, size=(50)),
        "Date
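The excerpt ends mid-setup, so the sketch below assumes the subtotals are per Category; converting to relative values then means dividing each (Category, Product) total by its Category subtotal via a level-wise transform. A seed is added so the random data is reproducible.

```python
import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame(
    {
        "Category": np.random.choice(["Group A", "Group B"], 50),
        "Product": np.random.choice(["Product 1", "Product 2"], 50),
        "Units_Sold": np.random.randint(1, 100, size=(50)),
    }
)

totals = df.groupby(['Category', 'Product'])['Units_Sold'].sum()
# divide each (Category, Product) total by its Category subtotal,
# turning absolute unit counts into shares of the subtotal
relative = totals / totals.groupby(level='Category').transform('sum')
```

Each Category's shares now sum to 1.0, which is easy to verify and makes the subtotal rows themselves become 1.0 if re-inserted.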