pandas-groupby

update column value of pandas groupby().last()

会有一股神秘感。 Submitted 2020-02-04 02:33:03

Question: Given the dataframe

    dfd = pd.DataFrame({'A': [1, 1, 2, 2, 3, 3],
                        'B': [4, 5, 6, 7, 8, 9],
                        'C': ['a', 'b', 'c', 'c', 'd', 'e']})

I can find the last C value of each A group using dfd.groupby('A').last()['C']. However, I want to update those C values to np.nan, and I don't know how to do that. An approach such as:

    def replace(df):
        df['C'] = np.nan
        return replace

    dfd.groupby('A').last().apply(lambda dfd: replace(dfd))

does not work. I want the result like: dfd_result = pd.DataFrame({'A': [1, 1, 2,2,3,3], 'B': [4, 5, 6,7
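The question body is cut off above, but the intent is clear: blank out C on each group's last row while keeping the frame's shape. One approach that works is to take the index of each group's last row via groupby().tail(1) and assign through .loc — a minimal sketch:

```python
import numpy as np
import pandas as pd

dfd = pd.DataFrame({'A': [1, 1, 2, 2, 3, 3],
                    'B': [4, 5, 6, 7, 8, 9],
                    'C': ['a', 'b', 'c', 'c', 'd', 'e']})

# tail(1) keeps the last row of each 'A' group with its original index,
# so we can assign NaN back into the full frame in place.
last_idx = dfd.groupby('A').tail(1).index
dfd.loc[last_idx, 'C'] = np.nan
```

Unlike groupby('A').last(), this never collapses the frame, so no re-merge is needed afterwards.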

How to groupby with consecutive occurrence of duplicates in pandas

陌路散爱 Submitted 2020-02-02 03:20:41

Question: I have a dataframe with two columns [Name, In.Cl]. I want to group by Name, but based on consecutive occurrences. For example, consider the DataFrame below. Code to generate it:

    df = pd.DataFrame({'Name': ['A','B','B','A','A','B','C','C','C','B','C'],
                       'In.Cl': [2,1,5,2,4,2,3,1,8,5,7]})

Input:

        In.Cl Name
    0       2    A
    1       1    B
    2       5    B
    3       2    A
    4       4    A
    5       2    B
    6       3    C
    7       1    C
    8       8    C
    9       5    B
    10      7    C

I want to group the rows that repeat consecutively, e.g. group [B] (1,2), [A] (3,4), [C] (6,8), etc., and perform
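A common pattern for consecutive-run grouping is to build a run id with a shift/cumsum comparison and group on that. A sketch (the question is truncated before naming the aggregation, so summing In.Cl within each run is an assumption):

```python
import pandas as pd

df = pd.DataFrame({'Name': ['A','B','B','A','A','B','C','C','C','B','C'],
                   'In.Cl': [2,1,5,2,4,2,3,1,8,5,7]})

# A new run id starts whenever Name differs from the previous row.
block = (df['Name'] != df['Name'].shift()).cumsum().rename('block')

# Group by Name and run id; sort=False keeps the runs in order of appearance.
out = (df.groupby(['Name', block], sort=False)['In.Cl']
         .sum()
         .reset_index())
```

Each consecutive run of the same Name gets its own block number, so rows 1-2 ([B]), 3-4 ([A]) and 6-8 ([C]) land in separate groups even though the Name values repeat later.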

Group by and sum over rows with same contents [duplicate]

谁都会走 Submitted 2020-01-30 12:44:37

Question: This question already has an answer here: row wise sorting in pandas dataframe and aggregation (1 answer). Closed 8 months ago.

I have a data frame of 3 columns with numerical values; the first two columns form a set with two elements. I want to treat the rows of these 2 columns as a set (order does not matter) and group by + sum. df.groupby(['A', 'B']).sum() won't work here. Example:

    A     B     counter
    750   1334  10
    1080  1920  15
    1080  1920  10
    1920  1080  10
    1125  2436  20

result:

    A     B     counter
    750   1334
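Since the pair {1080, 1920} should match {1920, 1080}, one way is to sort the two columns row-wise first, so every unordered pair gets a canonical key, and then do an ordinary groupby-sum:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [750, 1080, 1080, 1920, 1125],
                   'B': [1334, 1920, 1920, 1080, 2436],
                   'counter': [10, 15, 10, 10, 20]})

# Sort A and B within each row so (1920, 1080) becomes (1080, 1920).
df[['A', 'B']] = np.sort(df[['A', 'B']].to_numpy(), axis=1)

out = df.groupby(['A', 'B'], as_index=False)['counter'].sum()
```

The three {1080, 1920} rows collapse into one group with counter 15 + 10 + 10 = 35.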

Create column based on multiple column conditions from another dataframe

喜你入骨 Submitted 2020-01-30 09:17:06

Question: Suppose I have two dataframes - conditions and data.

    import pandas as pd

    conditions = pd.DataFrame({'class': [1,2,3,4,4,5,5,4,4,5,5,5],
                               'primary_lower': [0,0,0,160,160,160,160,160,160,160,160,800],
                               'primary_upper': [9999,9999,9999,480,480,480,480,480,480,480,480,4000],
                               'secondary_lower': [0,0,0,3500,6100,3500,6100,0,4800,0,4800,10],
                               'secondary_upper': [9999,9999,9999,4700,9999,4700,9999,4699,6000,4699,6000,3000],
                               'group': ['A','A','A','B','B','B','B','C','C','C','C','C']})

    data = pd.DataFrame({
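The definition of `data` is truncated above, so its columns are an assumption here (a class plus raw 'primary' and 'secondary' readings; the names and sample values below are hypothetical). Under that assumption, a range join via merge-then-filter is one way to pick up `group`:

```python
import pandas as pd

conditions = pd.DataFrame({'class': [1,2,3,4,4,5,5,4,4,5,5,5],
                           'primary_lower': [0,0,0,160,160,160,160,160,160,160,160,800],
                           'primary_upper': [9999,9999,9999,480,480,480,480,480,480,480,480,4000],
                           'secondary_lower': [0,0,0,3500,6100,3500,6100,0,4800,0,4800,10],
                           'secondary_upper': [9999,9999,9999,4700,9999,4700,9999,4699,6000,4699,6000,3000],
                           'group': ['A','A','A','B','B','B','B','C','C','C','C','C']})

# Hypothetical stand-in for the truncated `data` frame.
data = pd.DataFrame({'class': [1, 4, 5],
                     'primary': [100, 200, 900],
                     'secondary': [50, 6200, 2000]})

# Pair every reading with every condition row of the same class,
# then keep the pairs whose values fall inside both bounds.
m = data.merge(conditions, on='class')
in_range = (m['primary'].between(m['primary_lower'], m['primary_upper'])
            & m['secondary'].between(m['secondary_lower'], m['secondary_upper']))
matched = m.loc[in_range, ['class', 'primary', 'secondary', 'group']]
```

Note that overlapping condition ranges can match a reading to more than one group; deciding a tie-break (e.g. keep the first match) is left to the real requirements.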

Get only the first and last rows of each group with pandas

╄→尐↘猪︶ㄣ Submitted 2020-01-30 05:32:05

Question: I am a newbie in Python. I have a huge dataframe with millions of rows and IDs. My data looks like this:

    Time  ID X  Y
    8:00  A  23 100
    9:00  B  24 110
    10:00 B  25 120
    11:00 C  26 130
    12:00 C  27 140
    13:00 A  28 150
    14:00 A  29 160
    15:00 D  30 170
    16:00 C  31 180
    17:00 B  32 190
    18:00 A  33 200
    19:00 C  34 210
    20:00 A  35 220
    21:00 B  36 230
    22:00 C  37 240
    23:00 B  38 250

I want to sort the data on ID and time. The expected result I am looking for is:

    Time  ID X  Y
    8:00  A  23 100
    13:00 A  28 150
    14:00 A  29 160
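Going by the title, only each ID's first and last rows are ultimately wanted. Since the rows are already in time order, a stable sort on ID followed by per-group head(1)/tail(1) does that; a sketch on a subset of the question's data:

```python
import pandas as pd

# Subset of the question's data (rows already in time order).
df = pd.DataFrame({'Time': ['8:00','9:00','10:00','11:00','12:00','13:00','14:00','15:00'],
                   'ID':   ['A','B','B','C','C','A','A','D'],
                   'X':    [23, 24, 25, 26, 27, 28, 29, 30],
                   'Y':    [100, 110, 120, 130, 140, 150, 160, 170]})

g = df.groupby('ID')
ends = pd.concat([g.head(1), g.tail(1)])
# A single-row group (like D) shows up in both head and tail; drop the repeat,
# then order by ID while keeping the original time order within each ID.
ends = ends[~ends.index.duplicated()].sort_values('ID', kind='stable')
```

Sorting on the 'Time' strings directly would misplace '10:00' before '8:00'; relying on the existing row order (or converting Time with pd.to_datetime first) avoids that.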

python pandas group by and aggregate columns

牧云@^-^@ Submitted 2020-01-25 02:59:24

Question: I am using pandas version 0.23.0. I want to use the dataframe groupby function to generate new aggregated columns using lambda functions. My data frame looks like:

    ID Flag Amount User
    1  1    100    123345
    1  1    55     123346
    2  0    20     123346
    2  0    30     123347
    3  0    50     123348

I want to generate a table which looks like:

    ID Flag0_Count Flag1_Count Flag0_Amount_SUM Flag1_Amount_SUM Flag0_User_Count Flag1_User_Count
    1  0           2           0                155              0                2
    2  2           0           50               0                2                0
    3  1           0           50               0                1                0

Here: Flag0_Count is the count of Flag = 0, Flag1_Count is
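One way to build that wide table is a long groupby over (ID, Flag) followed by unstack; note the named-aggregation syntax below needs pandas ≥ 0.25, slightly newer than the 0.23.0 the asker mentions:

```python
import pandas as pd

df = pd.DataFrame({'ID': [1, 1, 2, 2, 3],
                   'Flag': [1, 1, 0, 0, 0],
                   'Amount': [100, 55, 20, 30, 50],
                   'User': [123345, 123346, 123346, 123347, 123348]})

out = (df.groupby(['ID', 'Flag'])
         .agg(Count=('Flag', 'size'),
              Amount_SUM=('Amount', 'sum'),
              User_Count=('User', 'nunique'))
         .unstack('Flag', fill_value=0))   # one column per Flag value

# Flatten the (stat, flag) MultiIndex into Flag0_Count, Flag1_Count, ...
out.columns = [f'Flag{flag}_{name}' for name, flag in out.columns]
out = out.reset_index()
```

fill_value=0 supplies the zeros for IDs that have no rows with a given Flag value.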

Count values in dataframe based on entry

两盒软妹~` Submitted 2020-01-24 19:30:09

Question: I have a dataframe of the form:

    category | value
    cat a    | x
    cat a    | x
    cat a    | y
    cat b    | w
    cat b    | z

I'd like to be able to return something like this (showing unique values and their frequency):

    category | freq of most common value | most common value
    cat a    | 2                         | x
    cat b    | 1                         | w    (it doesn't matter whether w or z is shown here)

Answer 1: Use Series.value_counts with Series.head per group in a lambda function:

    df = (df.groupby('category', sort=False)['value']
            .apply(lambda x: x.value_counts().head(1))
            .reset_index()
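The answer above is cut off; a self-contained variant of the same idea, producing the frequency and the most common value as separate columns (the column names most_common/freq are my own):

```python
import pandas as pd

df = pd.DataFrame({'category': ['cat a', 'cat a', 'cat a', 'cat b', 'cat b'],
                   'value': ['x', 'x', 'y', 'w', 'z']})

# value_counts() is sorted descending, so .iat[0] is the top frequency;
# mode() returns the most frequent value(s), ties resolved by taking the first.
out = (df.groupby('category', sort=False)['value']
         .agg(most_common=lambda s: s.mode().iat[0],
              freq=lambda s: s.value_counts().iat[0])
         .reset_index())
```

For 'cat b' both w and z occur once, so the tie lands on 'w' (mode sorts its result), which the asker says is acceptable.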

How to create a new row on the fly by copying previous row

99封情书 Submitted 2020-01-24 12:29:07

Question: I have a dataframe like the one given below (edited dataframe):

    df = pd.DataFrame({
        'subject_id': [1,1,1,1,1,1,1,2,2,2,2,2],
        'time_1': ['2173-04-03 12:35:00','2173-04-03 12:50:00','2173-04-05 12:59:00',
                   '2173-05-04 13:14:00','2173-05-05 13:37:00','2173-07-06 13:39:00',
                   '2173-07-08 11:30:00','2173-04-08 16:00:00','2173-04-09 22:00:00',
                   '2173-04-11 04:00:00','2173-04-13 04:30:00','2173-04-14 08:00:00'],
        'val': [5,5,5,5,1,6,5,5,8,3,4,6]})
    df['time_1'] = pd.to_datetime(df['time_1'])
    df['day'] = df['time
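The question is truncated before it says which row should be copied, so the following is only a sketch of the general pattern the title describes: select the row(s) to duplicate (here, each subject's last row, as an assumption), copy them, and append with pd.concat:

```python
import pandas as pd

df = pd.DataFrame({'subject_id': [1, 1, 2],
                   'time_1': pd.to_datetime(['2173-04-03 12:35:00',
                                             '2173-04-05 12:59:00',
                                             '2173-04-08 16:00:00']),
                   'val': [5, 5, 8]})

# Copy the last row of each subject; .copy() keeps the original untouched
# if the new rows are modified (e.g. given an adjusted timestamp) before appending.
new_rows = df.groupby('subject_id').tail(1).copy()
out = pd.concat([df, new_rows], ignore_index=True)
```

Any per-row tweaks (shifting time_1 forward, resetting val, etc.) would go on new_rows before the concat.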

Aggregating string columns using pandas GroupBy

我的梦境 Submitted 2020-01-23 11:08:38

Question: I have a DF such as the following:

    df =
    vid pos value sente
    1   a   A     21
    2   b   B     21
    3   b   A     21
    3   a   A     21
    1   d   B     22
    1   a   C     22
    1   a   D     22
    2   b   A     22
    3   a   A     22

Now I want to combine all rows with the same value for sente and vid into one row, with the values for value joined by a " ":

    df2 =
    vid pos   value sente
    1   a     A     21
    2   b     B     21
    3   b a   A A   21
    1   d a a B C D 22
    2   b     A     22
    3   a     A     22

I suppose a modification of this should do the trick:

    df2 = df.groupby("sente").agg(lambda x: " ".join(x))

but I can't seem to figure out
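The missing piece is grouping by both keys, not just sente, and joining each string column; the desired df2 above shows pos joined as well as value, so both get " ".join:

```python
import pandas as pd

df = pd.DataFrame({'vid':   [1, 2, 3, 3, 1, 1, 1, 2, 3],
                   'pos':   ['a', 'b', 'b', 'a', 'd', 'a', 'a', 'b', 'a'],
                   'value': ['A', 'B', 'A', 'A', 'B', 'C', 'D', 'A', 'A'],
                   'sente': [21, 21, 21, 21, 22, 22, 22, 22, 22]})

# Group on both keys; " ".join receives each group's column as a Series
# of strings and concatenates the values in row order.
out = (df.groupby(['sente', 'vid'], as_index=False)
         .agg({'pos': ' '.join, 'value': ' '.join}))
```

as_index=False keeps sente and vid as ordinary columns, matching the flat df2 layout in the question.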