pandas-groupby

Selecting groups formed by groupby function

て烟熏妆下的殇ゞ submitted on 2019-12-23 12:18:40
Question: My dataframe: df1 group ordercode quantity 0 A 1 B 3 1 C 1 E 2 D 1 I have formed each group with the groupby function. I need to extract the data by group number. My desired output: In: get group 0 Out: ordercode quantity A 1 B 3 or group ordercode quantity 0 A 1 B 3 Any suggestion would be appreciated. Answer 1: Use DataFrame.xs; it is also possible to pass the parameter drop_level=False: #if need remove original level df1 = df.xs(0) print (df1) quantity ordercode A 1 B 3 #if avoid remove original level df1
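A minimal sketch of the DataFrame.xs approach named in the answer. The frame below is an assumption: the flattened table in the question does not show unambiguously which rows belong to which group, so the (group, ordercode) pairs here are illustrative only.

```python
import pandas as pd

# Hypothetical reconstruction: a frame indexed by (group, ordercode).
# The exact group membership is assumed, not taken from the question.
df = pd.DataFrame(
    {"quantity": [1, 3, 1, 2, 1]},
    index=pd.MultiIndex.from_tuples(
        [(0, "A"), (0, "B"), (1, "C"), (1, "E"), (2, "D")],
        names=["group", "ordercode"],
    ),
)

# Select group 0 and drop the "group" level from the result.
dropped = df.xs(0)

# Select group 0 but keep the "group" level in the index.
kept = df.xs(0, drop_level=False)

print(dropped)
print(kept)
```

With drop_level=False the result keeps both index levels, which matches the second desired output in the question.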

Difference between “as_index = False”, and “reset_index()” in pandas groupby

自作多情 submitted on 2019-12-23 12:07:43
Question: I just wanted to know the difference between what these two do. Data: import pandas as pd df = pd.DataFrame({"ID":["A","B","A","C","A","A","C","B"], "value":[1,2,4,3,6,7,3,4]}) reset_index(): df_group1 = df.groupby("ID").sum().reset_index() as_index=False: df_group2 = df.groupby("ID", as_index=False).sum() Both of them give exactly the same output: ID value 0 A 18 1 B 6 2 C 6 Can anyone tell me what the difference is, with an example illustrating it? Answer 1: When you use
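As a small sketch of the comparison in the question: the intermediate result of groupby("ID").sum() carries ID as its index, and reset_index() then moves it back into a column, whereas as_index=False never moves it into the index in the first place. The variable name `intermediate` is introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({"ID": ["A", "B", "A", "C", "A", "A", "C", "B"],
                   "value": [1, 2, 4, 3, 6, 7, 3, 4]})

# Default behaviour: the group key becomes the index of the result.
intermediate = df.groupby("ID").sum()

# Route 1: aggregate, then turn the index back into a regular column.
df_group1 = intermediate.reset_index()

# Route 2: ask groupby to keep the key as a column from the start.
df_group2 = df.groupby("ID", as_index=False).sum()

print(intermediate.index.name)  # the key lives in the index here
print(df_group1)
print(df_group2)
```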

Keep columns after a groupby in an empty dataframe

廉价感情. submitted on 2019-12-23 10:07:13
Question: The DataFrame is empty after a query. When I call groupby on it, a runtime warning is raised and I get back another empty DataFrame with no columns. How can I keep the columns? df = pd.DataFrame(columns=["PlatformCategory","Platform","ResClassName","Amount"]) print(df) result: Empty DataFrame Columns: [PlatformCategory, Platform, ResClassName, Amount] Index: [] then groupby: df = df.groupby(["PlatformCategory","Platform","ResClassName"]).sum() df = df.reset_index(drop=False,inplace=True) print(df) result: sometimes is
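One possible way to keep the columns, sketched under assumptions: the Amount column is given an explicit float dtype (the question leaves it untyped), and reindex is used to restore the original column layout whatever the aggregation returns. Note also that reset_index(inplace=True) returns None, so assigning its result back to df replaces the frame with None.

```python
import pandas as pd

cols = ["PlatformCategory", "Platform", "ResClassName", "Amount"]
keys = cols[:3]

# Empty frame; the float dtype for Amount is an assumption.
df = pd.DataFrame(columns=cols).astype({"Amount": "float64"})

# Aggregate, bring the keys back as columns (no inplace=True),
# then force the original column set and order.
result = (
    df.groupby(keys)["Amount"]
      .sum()
      .reset_index()
      .reindex(columns=cols)
)

print(result)
```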

Bar graph from dataframe groupby

旧巷老猫 submitted on 2019-12-23 08:54:10
Question: import pandas as pd import numpy as np import matplotlib.pyplot as plt df = pd.read_csv("arrests.csv") df = df.replace(np.nan,0) df = df.groupby(['home_team'])['arrests'].mean() I'm trying to create a bar graph for this dataframe. Under home_team are a bunch of team names. Under arrests is the number of arrests on each date. I've grouped the data by team and taken the average arrests for each team. I'm trying to create a bar graph for this but am not sure how to proceed since one column doesn
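A hedged sketch of one way to plot the grouped result. The groupby above returns a Series whose index holds the team names, and Series.plot(kind="bar") uses that index for the x axis. The team names and counts below are made up, since arrests.csv is not available, and the Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; an assumption for this sketch
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for arrests.csv.
df = pd.DataFrame({
    "home_team": ["Lions", "Lions", "Tigers", "Bears"],
    "arrests": [3, np.nan, 5, 1],
})
df["arrests"] = df["arrests"].fillna(0)

# Series indexed by team name, one mean value per team.
means = df.groupby("home_team")["arrests"].mean()

# The index supplies the bar labels, the values supply the heights.
ax = means.plot(kind="bar")
ax.set_xlabel("home_team")
ax.set_ylabel("mean arrests")
plt.tight_layout()
```

In an interactive session, plt.show() would display the figure.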

New column in pandas - adding series to dataframe by applying a list groupby

好久不见. submitted on 2019-12-23 08:11:56
Question: Given the following df: Id other concat 0 A z 1 1 A y 2 2 B x 3 3 B w 4 4 B v 5 5 B u 6 I want a result with a new column holding the grouped values as a list: Id other concat new 0 A z 1 [1, 2] 1 A y 2 [1, 2] 2 B x 3 [3, 4, 5, 6] 3 B w 4 [3, 4, 5, 6] 4 B v 5 [3, 4, 5, 6] 5 B u 6 [3, 4, 5, 6] This is similar to these questions: grouping rows in list in pandas groupby; Replicating GROUP_CONCAT for pandas.DataFrame. However, I want to apply the grouping you get from df.groupby('Id')['concat'].apply(list) , which
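One way this could be done, as a sketch rather than the accepted answer (which is cut off above): build the per-Id lists with the groupby expression from the question, then map them back onto the Id column so every row receives its group's list.

```python
import pandas as pd

df = pd.DataFrame({
    "Id": ["A", "A", "B", "B", "B", "B"],
    "other": ["z", "y", "x", "w", "v", "u"],
    "concat": [1, 2, 3, 4, 5, 6],
})

# One list per Id; this Series is shorter than the original frame.
lists = df.groupby("Id")["concat"].apply(list)

# Look up each row's Id in that Series to broadcast the lists back.
df["new"] = df["Id"].map(lists)

print(df)
```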

New column in pandas - adding series to dataframe by applying a list groupby

房东的猫 submitted on 2019-12-23 08:11:11
Question: Given the following df: Id other concat 0 A z 1 1 A y 2 2 B x 3 3 B w 4 4 B v 5 5 B u 6 I want a result with a new column holding the grouped values as a list: Id other concat new 0 A z 1 [1, 2] 1 A y 2 [1, 2] 2 B x 3 [3, 4, 5, 6] 3 B w 4 [3, 4, 5, 6] 4 B v 5 [3, 4, 5, 6] 5 B u 6 [3, 4, 5, 6] This is similar to these questions: grouping rows in list in pandas groupby; Replicating GROUP_CONCAT for pandas.DataFrame. However, I want to apply the grouping you get from df.groupby('Id')['concat'].apply(list) , which

pandas groupby rolling uneven time

烂漫一生 submitted on 2019-12-23 05:46:10
Question: I am having some trouble with pandas rolling. Here is a simplified version of my dataset: df2 = pd.DataFrame({ 'A' : pd.Categorical(["test","train","test","train",'train','hello']), 'B' : (pd.Timestamp('2013-01-02 00:00:05'), pd.Timestamp('2013-01-02 00:00:10'), pd.Timestamp('2013-01-02 00:00:09'), pd.Timestamp('2013-01-02 00:01:05'), pd.Timestamp('2013-01-02 00:01:25'), pd.Timestamp('2013-01-02 00:02:05')), 'C' : 1.}).sort_values('A').reset_index(drop=True) >>> df2 A B C 0 hello 2013-01-02 00:02
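A sketch of a time-based rolling sum per group on unevenly spaced timestamps. The question is truncated before it states the desired window, so the 60-second window is an assumption, and column A is kept as plain strings rather than a Categorical to keep the example simple.

```python
import pandas as pd

df2 = pd.DataFrame({
    "A": ["test", "train", "test", "train", "train", "hello"],
    "B": pd.to_datetime([
        "2013-01-02 00:00:05", "2013-01-02 00:00:10",
        "2013-01-02 00:00:09", "2013-01-02 00:01:05",
        "2013-01-02 00:01:25", "2013-01-02 00:02:05",
    ]),
    "C": 1.0,
})

# Offset-based windows need a sorted datetime index, so sort by
# time and move B into the index before grouping.
by_time = df2.sort_values("B").set_index("B")

# Sum of C over the trailing 60 seconds within each group
# (the window length is an assumed value).
rolled = by_time.groupby("A")["C"].rolling("60s").sum()

print(rolled)
```

The result is indexed by (A, B), so rolled.loc["train"] gives the series for a single group.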

Speeding up rolling sum calculation in pandas groupby

女生的网名这么多〃 submitted on 2019-12-23 04:33:52
Question: I want to compute rolling sums group-wise for a large number of groups and I'm having trouble doing it acceptably quickly. Pandas has built-in methods for rolling and expanding calculations. Here's an example: import pandas as pd import numpy as np obs_per_g = 20 g = 10000 obs = g * obs_per_g k = 20 df = pd.DataFrame( data=np.random.normal(size=obs * k).reshape(obs, k), index=pd.MultiIndex.from_product(iterables=[range(g), range(obs_per_g)]), ) To get rolling and expanding sums I can use df
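For reference, a small sketch of the built-in group-wise rolling and expanding sums on a tiny deterministic frame. The column name x, the sizes, and the window of 2 are illustrative assumptions. It also shows that a grouped cumsum yields the same values as the expanding sum, which may be worth timing against the expanding call on large data.

```python
import pandas as pd

# Two groups with three observations each (illustrative sizes).
index = pd.MultiIndex.from_product([range(2), range(3)])
df = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}, index=index)

grouped = df.groupby(level=0)["x"]

# Rolling sum over a window of 2 rows, restarted for every group.
rolling_sum = grouped.rolling(2).sum()

# Expanding (running) sum within each group.
expanding_sum = grouped.expanding().sum()

# A grouped cumulative sum gives the same numbers as the expanding sum.
cumulative = grouped.cumsum()

print(rolling_sum)
print(expanding_sum)
```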

how to find total of only one column in python pandas pivot table?

梦想的初衷 submitted on 2019-12-23 04:24:56
Question: My data, read from Excel, looks like this: Invoice Cost centre Invoice Category Price DataFeed Reporting Fequency RIM Retail QLD 22.25 WEB DWM R5M Retail SYD 22.25 BWH M ..... My pivot table looks like this: df = pd.read_excel(file_path, sheet_name='Invoice Details', usecols="E:F,I,L:M") df['Price'] = df['Price'].astype(float) df1 = df.groupby(["Invoice Cost Centre", "Invoice Category"]).agg({'Price': 'sum'}).reset_index() df = pd.pivot_table(df, index=["Invoice Cost Centre", "Invoice Category"], columns=['Price
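A sketch of one way to total a single column in a pivot table, using the margins option of pd.pivot_table restricted to Price through values=. The third row of data and the label "Total" are hypothetical additions for illustration, and the Excel file is replaced by an in-memory frame.

```python
import pandas as pd

# Stand-in for the Excel sheet; the last row is invented.
df = pd.DataFrame({
    "Invoice Cost Centre": ["Retail", "Retail", "Retail"],
    "Invoice Category": ["QLD", "SYD", "SYD"],
    "Price": [22.25, 22.25, 10.0],
})

# Sum Price per (cost centre, category) and append a grand-total row.
table = pd.pivot_table(
    df,
    index=["Invoice Cost Centre", "Invoice Category"],
    values="Price",
    aggfunc="sum",
    margins=True,
    margins_name="Total",
)

print(table)
```

Because only Price is passed in values, the total row covers that one column and ignores the rest of the sheet.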

Pandas groupby selecting only one value based on 2 groups and converting rest to 0

人走茶凉 submitted on 2019-12-23 02:39:31
Question: I have a pandas data frame with a datetime index that looks like this: df = Fruit Quantity 01/02/10 Apple 4 01/02/10 Apple 6 01/02/10 Pear 7 01/02/10 Grape 8 01/02/10 Grape 5 02/02/10 Apple 2 02/02/10 Fruit 6 02/02/10 Pear 8 02/02/10 Pear 5 Now for each date and each fruit I want only one value (preferably the top one), with the remaining rows for that fruit on that date set to zero. So the desired output is as follows: Fruit Quantity 01/02/10 Apple 4 01/02/10 Apple 0 01/02/10 Pear 7 01/02/10
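A possible approach, sketched under assumptions: number the rows within each (date, fruit) group with cumcount and zero out every row after the first. The index name Date and the day/month/year reading of the dates are assumptions, since the question does not state either.

```python
import pandas as pd

dates = ["01/02/10"] * 5 + ["02/02/10"] * 4
df = pd.DataFrame(
    {
        "Fruit": ["Apple", "Apple", "Pear", "Grape", "Grape",
                  "Apple", "Fruit", "Pear", "Pear"],
        "Quantity": [4, 6, 7, 8, 5, 2, 6, 8, 5],
    },
    # Date format is assumed to be day/month/year.
    index=pd.to_datetime(dates, format="%d/%m/%y").rename("Date"),
)

# Position of each row inside its (Date, Fruit) group: 0 for the first.
position = df.groupby(["Date", "Fruit"]).cumcount()

# Keep Quantity for the first row of each group, set the rest to 0.
df["Quantity"] = df["Quantity"].where(position.to_numpy() == 0, 0)

print(df)
```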