pandas-groupby

update column value of pandas groupby().last()

会有一股神秘感。 Submitted 2020-02-04 02:33:03

Question: Given the dataframe

    dfd = pd.DataFrame({'A': [1, 1, 2, 2, 3, 3],
                        'B': [4, 5, 6, 7, 8, 9],
                        'C': ['a', 'b', 'c', 'c', 'd', 'e']})

I can find the last C value of each A group using dfd.groupby('A').last()['C']. However, I want to update those C values to np.nan, and I don't know how to do that. An approach such as:

    def replace(df):
        df['C'] = np.nan
        return replace

    dfd.groupby('A').last().apply(lambda dfd: replace(dfd))

does not work. I want the result like: dfd_result = pd.DataFrame({'A': [1, 1, 2,2,3,3], 'B': [4, 5, 6,7
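The question body is cut off above, but the intent is clear: blank out C on each group's last row while keeping the frame's shape. One approach that works is to take the index of each group's last row via groupby().tail(1) and assign through .loc — a minimal sketch:

```python
import numpy as np
import pandas as pd

dfd = pd.DataFrame({'A': [1, 1, 2, 2, 3, 3],
                    'B': [4, 5, 6, 7, 8, 9],
                    'C': ['a', 'b', 'c', 'c', 'd', 'e']})

# tail(1) keeps the last row of each 'A' group with its original index,
# so we can assign NaN back into the full frame in place.
last_idx = dfd.groupby('A').tail(1).index
dfd.loc[last_idx, 'C'] = np.nan
```

Unlike groupby('A').last(), this never collapses the frame, so no re-merge is needed afterwards.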

How to groupby with consecutive occurrence of duplicates in pandas

陌路散爱 Submitted 2020-02-02 03:20:41

Question: I have a dataframe with two columns [Name, In.Cl]. I want to group by Name, but based on consecutive occurrences. For example, consider the DataFrame below. Code to generate it:

    df = pd.DataFrame({'Name': ['A','B','B','A','A','B','C','C','C','B','C'],
                       'In.Cl': [2,1,5,2,4,2,3,1,8,5,7]})

Input:

        In.Cl Name
    0       2    A
    1       1    B
    2       5    B
    3       2    A
    4       4    A
    5       2    B
    6       3    C
    7       1    C
    8       8    C
    9       5    B
    10      7    C

I want to group the rows that repeat consecutively, e.g. group [B] (1,2), [A] (3,4), [C] (6,8), etc., and perform
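A common pattern for consecutive-run grouping is to build a run id with a shift/cumsum comparison and group on that. A sketch (the question is truncated before naming the aggregation, so summing In.Cl within each run is an assumption):

```python
import pandas as pd

df = pd.DataFrame({'Name': ['A','B','B','A','A','B','C','C','C','B','C'],
                   'In.Cl': [2,1,5,2,4,2,3,1,8,5,7]})

# A new run id starts whenever Name differs from the previous row.
block = (df['Name'] != df['Name'].shift()).cumsum().rename('block')

# Group by Name and run id; sort=False keeps the runs in order of appearance.
out = (df.groupby(['Name', block], sort=False)['In.Cl']
         .sum()
         .reset_index())
```

Each consecutive run of the same Name gets its own block number, so rows 1-2 ([B]), 3-4 ([A]) and 6-8 ([C]) land in separate groups even though the Name values repeat later.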

Group by and sum over rows with same contents [duplicate]

谁都会走 Submitted 2020-01-30 12:44:37

Question: This question already has an answer here: row wise sorting in pandas dataframe and aggregation (1 answer). Closed 8 months ago.

I have a data frame of 3 columns with numerical values; the first two columns form a set with two elements. I want to treat the rows of these 2 columns as a set (order does not matter) and group by + sum. df.groupby(['A', 'B']).sum() won't work here. Example:

    A     B     counter
    750   1334  10
    1080  1920  15
    1080  1920  10
    1920  1080  10
    1125  2436  20

result:

    A     B     counter
    750   1334
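Since the pair {1080, 1920} should match {1920, 1080}, one way is to sort the two columns row-wise first, so every unordered pair gets a canonical key, and then do an ordinary groupby-sum:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [750, 1080, 1080, 1920, 1125],
                   'B': [1334, 1920, 1920, 1080, 2436],
                   'counter': [10, 15, 10, 10, 20]})

# Sort A and B within each row so (1920, 1080) becomes (1080, 1920).
df[['A', 'B']] = np.sort(df[['A', 'B']].to_numpy(), axis=1)

out = df.groupby(['A', 'B'], as_index=False)['counter'].sum()
```

The three {1080, 1920} rows collapse into one group with counter 15 + 10 + 10 = 35.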

Create column based on multiple column conditions from another dataframe

喜你入骨 Submitted 2020-01-30 09:17:06

Question: Suppose I have two dataframes - conditions and data.

    import pandas as pd

    conditions = pd.DataFrame({'class': [1,2,3,4,4,5,5,4,4,5,5,5],
                               'primary_lower': [0,0,0,160,160,160,160,160,160,160,160,800],
                               'primary_upper': [9999,9999,9999,480,480,480,480,480,480,480,480,4000],
                               'secondary_lower': [0,0,0,3500,6100,3500,6100,0,4800,0,4800,10],
                               'secondary_upper': [9999,9999,9999,4700,9999,4700,9999,4699,6000,4699,6000,3000],
                               'group': ['A','A','A','B','B','B','B','C','C','C','C','C']})

    data = pd.DataFrame({
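The definition of `data` is truncated above, so its columns are an assumption here (a class plus raw 'primary' and 'secondary' readings; the names and sample values below are hypothetical). Under that assumption, a range join via merge-then-filter is one way to pick up `group`:

```python
import pandas as pd

conditions = pd.DataFrame({'class': [1,2,3,4,4,5,5,4,4,5,5,5],
                           'primary_lower': [0,0,0,160,160,160,160,160,160,160,160,800],
                           'primary_upper': [9999,9999,9999,480,480,480,480,480,480,480,480,4000],
                           'secondary_lower': [0,0,0,3500,6100,3500,6100,0,4800,0,4800,10],
                           'secondary_upper': [9999,9999,9999,4700,9999,4700,9999,4699,6000,4699,6000,3000],
                           'group': ['A','A','A','B','B','B','B','C','C','C','C','C']})

# Hypothetical stand-in for the truncated `data` frame.
data = pd.DataFrame({'class': [1, 4, 5],
                     'primary': [100, 200, 900],
                     'secondary': [50, 6200, 2000]})

# Pair every reading with every condition row of the same class,
# then keep the pairs whose values fall inside both bounds.
m = data.merge(conditions, on='class')
in_range = (m['primary'].between(m['primary_lower'], m['primary_upper'])
            & m['secondary'].between(m['secondary_lower'], m['secondary_upper']))
matched = m.loc[in_range, ['class', 'primary', 'secondary', 'group']]
```

Note that overlapping condition ranges can match a reading to more than one group; deciding a tie-break (e.g. keep the first match) is left to the real requirements.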

Get only the first and last rows of each group with pandas

╄→尐↘猪︶ㄣ Submitted 2020-01-30 05:32:05

Question: I am a newbie in Python. I have a huge dataframe with millions of rows and IDs. My data looks like this:

    Time  ID X  Y
    8:00  A  23 100
    9:00  B  24 110
    10:00 B  25 120
    11:00 C  26 130
    12:00 C  27 140
    13:00 A  28 150
    14:00 A  29 160
    15:00 D  30 170
    16:00 C  31 180
    17:00 B  32 190
    18:00 A  33 200
    19:00 C  34 210
    20:00 A  35 220
    21:00 B  36 230
    22:00 C  37 240
    23:00 B  38 250

I want to sort the data on ID and time. The expected result I am looking for is:

    Time  ID X  Y
    8:00  A  23 100
    13:00 A  28 150
    14:00 A  29 160
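Going by the title, only each ID's first and last rows are ultimately wanted. Since the rows are already in time order, a stable sort on ID followed by per-group head(1)/tail(1) does that; a sketch on a subset of the question's data:

```python
import pandas as pd

# Subset of the question's data (rows already in time order).
df = pd.DataFrame({'Time': ['8:00','9:00','10:00','11:00','12:00','13:00','14:00','15:00'],
                   'ID':   ['A','B','B','C','C','A','A','D'],
                   'X':    [23, 24, 25, 26, 27, 28, 29, 30],
                   'Y':    [100, 110, 120, 130, 140, 150, 160, 170]})

g = df.groupby('ID')
ends = pd.concat([g.head(1), g.tail(1)])
# A single-row group (like D) shows up in both head and tail; drop the repeat,
# then order by ID while keeping the original time order within each ID.
ends = ends[~ends.index.duplicated()].sort_values('ID', kind='stable')
```

Sorting on the 'Time' strings directly would misplace '10:00' before '8:00'; relying on the existing row order (or converting Time with pd.to_datetime first) avoids that.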

python pandas group by and aggregate columns

牧云@^-^@ Submitted 2020-01-25 02:59:24

Question: I am using pandas version 0.23.0. I want to use the dataframe groupby function to generate new aggregated columns using lambda functions. My data frame looks like:

    ID Flag Amount User
    1  1    100    123345
    1  1    55     123346
    2  0    20     123346
    2  0    30     123347
    3  0    50     123348

I want to generate a table which looks like:

    ID Flag0_Count Flag1_Count Flag0_Amount_SUM Flag1_Amount_SUM Flag0_User_Count Flag1_User_Count
    1  0           2           0                155              0                2
    2  2           0           50               0                2                0
    3  1           0           50               0                1                0

Here: Flag0_Count is the count of Flag = 0, Flag1_Count is
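One way to build that wide table is a long groupby over (ID, Flag) followed by unstack; note the named-aggregation syntax below needs pandas ≥ 0.25, slightly newer than the 0.23.0 the asker mentions:

```python
import pandas as pd

df = pd.DataFrame({'ID': [1, 1, 2, 2, 3],
                   'Flag': [1, 1, 0, 0, 0],
                   'Amount': [100, 55, 20, 30, 50],
                   'User': [123345, 123346, 123346, 123347, 123348]})

out = (df.groupby(['ID', 'Flag'])
         .agg(Count=('Flag', 'size'),
              Amount_SUM=('Amount', 'sum'),
              User_Count=('User', 'nunique'))
         .unstack('Flag', fill_value=0))   # one column per Flag value

# Flatten the (stat, flag) MultiIndex into Flag0_Count, Flag1_Count, ...
out.columns = [f'Flag{flag}_{name}' for name, flag in out.columns]
out = out.reset_index()
```

fill_value=0 supplies the zeros for IDs that have no rows with a given Flag value.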

Count values in dataframe based on entry

两盒软妹~` Submitted 2020-01-24 19:30:09

Question: I have a dataframe of the form:

    category | value
    cat a    | x
    cat a    | x
    cat a    | y
    cat b    | w
    cat b    | z

I'd like to be able to return something like this (showing unique values and their frequency):

    category | freq of most common value | most common value
    cat a    | 2                         | x
    cat b    | 1                         | w    (it doesn't matter whether w or z is shown here)

Answer 1: Use Series.value_counts with Series.head per group in a lambda function:

    df = (df.groupby('category', sort=False)['value']
            .apply(lambda x: x.value_counts().head(1))
            .reset_index()
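The answer above is cut off; a self-contained variant of the same idea, producing the frequency and the most common value as separate columns (the column names most_common/freq are my own):

```python
import pandas as pd

df = pd.DataFrame({'category': ['cat a', 'cat a', 'cat a', 'cat b', 'cat b'],
                   'value': ['x', 'x', 'y', 'w', 'z']})

# value_counts() is sorted descending, so .iat[0] is the top frequency;
# mode() returns the most frequent value(s), ties resolved by taking the first.
out = (df.groupby('category', sort=False)['value']
         .agg(most_common=lambda s: s.mode().iat[0],
              freq=lambda s: s.value_counts().iat[0])
         .reset_index())
```

For 'cat b' both w and z occur once, so the tie lands on 'w' (mode sorts its result), which the asker says is acceptable.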

How to create a new row on the fly by copying previous row

99封情书 Submitted 2020-01-24 12:29:07

Question: I have a dataframe like the one given below (edited dataframe):

    df = pd.DataFrame({
        'subject_id': [1,1,1,1,1,1,1,2,2,2,2,2],
        'time_1': ['2173-04-03 12:35:00','2173-04-03 12:50:00','2173-04-05 12:59:00',
                   '2173-05-04 13:14:00','2173-05-05 13:37:00','2173-07-06 13:39:00',
                   '2173-07-08 11:30:00','2173-04-08 16:00:00','2173-04-09 22:00:00',
                   '2173-04-11 04:00:00','2173-04-13 04:30:00','2173-04-14 08:00:00'],
        'val': [5,5,5,5,1,6,5,5,8,3,4,6]})
    df['time_1'] = pd.to_datetime(df['time_1'])
    df['day'] = df['time
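The question is truncated before it says which row should be copied, so the following is only a sketch of the general pattern the title describes: select the row(s) to duplicate (here, each subject's last row, as an assumption), copy them, and append with pd.concat:

```python
import pandas as pd

df = pd.DataFrame({'subject_id': [1, 1, 2],
                   'time_1': pd.to_datetime(['2173-04-03 12:35:00',
                                             '2173-04-05 12:59:00',
                                             '2173-04-08 16:00:00']),
                   'val': [5, 5, 8]})

# Copy the last row of each subject; .copy() keeps the original untouched
# if the new rows are modified (e.g. given an adjusted timestamp) before appending.
new_rows = df.groupby('subject_id').tail(1).copy()
out = pd.concat([df, new_rows], ignore_index=True)
```

Any per-row tweaks (shifting time_1 forward, resetting val, etc.) would go on new_rows before the concat.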

Aggregating string columns using pandas GroupBy

我的梦境 Submitted 2020-01-23 11:08:38

Question: I have a DF such as the following:

    df =
    vid pos value sente
    1   a   A     21
    2   b   B     21
    3   b   A     21
    3   a   A     21
    1   d   B     22
    1   a   C     22
    1   a   D     22
    2   b   A     22
    3   a   A     22

Now I want to combine all rows with the same value for sente and vid into one row, with the values for value joined by a " ":

    df2 =
    vid pos   value sente
    1   a     A     21
    2   b     B     21
    3   b a   A A   21
    1   d a a B C D 22
    2   b     A     22
    3   a     A     22

I suppose a modification of this should do the trick:

    df2 = df.groupby("sente").agg(lambda x: " ".join(x))

but I can't seem to figure out
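The missing piece is grouping by both keys, not just sente, and joining each string column; the desired df2 above shows pos joined as well as value, so both get " ".join:

```python
import pandas as pd

df = pd.DataFrame({'vid':   [1, 2, 3, 3, 1, 1, 1, 2, 3],
                   'pos':   ['a', 'b', 'b', 'a', 'd', 'a', 'a', 'b', 'a'],
                   'value': ['A', 'B', 'A', 'A', 'B', 'C', 'D', 'A', 'A'],
                   'sente': [21, 21, 21, 21, 22, 22, 22, 22, 22]})

# Group on both keys; " ".join receives each group's column as a Series
# of strings and concatenates the values in row order.
out = (df.groupby(['sente', 'vid'], as_index=False)
         .agg({'pos': ' '.join, 'value': ' '.join}))
```

as_index=False keeps sente and vid as ordinary columns, matching the flat df2 layout in the question.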