pandas-groupby

Generate random no of days based on random selection of columns

僤鯓⒐⒋嵵緔 posted on 2020-08-19 08:43:22
Question: I have a dataframe as shown below. Thanks to the SO community for helping with the code below.

    df1 = pd.DataFrame({'person_id': [11,11, 12, 13, 14], 'date_birth': ['01/01/1961','12/30/1961', '05/29/1967', '01/01/1957', '7/27/1959']})
    df1 = df1.melt('person_id', value_name='dates')
    df1['dates'] = pd.to_datetime(df1['dates'])
    df_ranges = df1.assign(until_prev_year_days=(df1['dates'].dt.dayofyear - 1), until_next_year_days=((df1['dates'] + pd.offsets.YearEnd(0)) - df1['dates']).dt.days)
    f = {'until
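For orientation, here is a self-contained sketch of what the two derived columns in the excerpt hold. It only covers the part of the code that survives in the excerpt: it stops before the truncated `f = {...}` mapping and does not attempt the random selection named in the title.

```python
import pandas as pd

df1 = pd.DataFrame({
    'person_id': [11, 11, 12, 13, 14],
    'date_birth': ['01/01/1961', '12/30/1961', '05/29/1967',
                   '01/01/1957', '7/27/1959'],
})

# Long format: one row per (person_id, date) pair, with the date in 'dates'.
df1 = df1.melt('person_id', value_name='dates')
df1['dates'] = pd.to_datetime(df1['dates'])

df_ranges = df1.assign(
    # Days elapsed since 1 January of the same year.
    until_prev_year_days=df1['dates'].dt.dayofyear - 1,
    # Days remaining until 31 December of the same year;
    # YearEnd(0) rolls a date forward to its own year end.
    until_next_year_days=(
        (df1['dates'] + pd.offsets.YearEnd(0)) - df1['dates']
    ).dt.days,
)

print(df_ranges[['person_id', 'dates',
                 'until_prev_year_days', 'until_next_year_days']])
```

For any date the two columns sum to one less than the number of days in that year, which is a quick sanity check on the ranges.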

groupby cumulative in pandas then update using numpy based specific condition

拜拜、爱过 posted on 2020-08-10 23:04:21
Question: I have a data frame as shown below.

    B_ID  No_Show  Session  slot_num  Patient_count
    1     0.4      S1       1         1
    2     0.3      S1       2         1
    3     0.8      S1       3         1
    4     0.3      S1       3         2
    5     0.6      S1       4         1
    6     0.8      S1       5         1
    7     0.9      S1       5         2
    8     0.4      S1       5         3
    9     0.6      S1       5         4
    12    0.9      S2       1         1
    13    0.5      S2       1         2
    14    0.3      S2       2         1
    15    0.7      S2       3         1
    20    0.7      S2       4         1
    16    0.6      S2       5         1
    17    0.8      S2       5         2
    19    0.3      S2       5         3

From the above I would like to find the cumulative No_Show by Session:

    df['Cum_No_show'] = df.groupby(['Session'])['No_Show'].cumsum()

Now we get

    B_ID  No_Show  Session  slot_num  Patient_count
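The cumulative sum quoted in the excerpt can be tried on a few of the rows above. This is a minimal sketch using only the first three rows of each session; it does not cover the NumPy-based conditional update named in the title, since that part of the question is cut off.

```python
import pandas as pd

# First three rows of each session from the question's table.
df = pd.DataFrame({
    'B_ID': [1, 2, 3, 12, 13, 14],
    'No_Show': [0.4, 0.3, 0.8, 0.9, 0.5, 0.3],
    'Session': ['S1', 'S1', 'S1', 'S2', 'S2', 'S2'],
})

# cumsum() restarts at the first row of every Session and keeps
# the original row order and index.
df['Cum_No_show'] = df.groupby(['Session'])['No_Show'].cumsum()

print(df)
```

Because the result is aligned to the original index, it can be assigned straight back as a new column without any merge.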

Pandas: map column using a dictionary on multiple columns

南楼画角 posted on 2020-08-10 17:42:07
Question: I have a dataframe with None values in one column. I would like to replace these None values with the maximum value of "category" for the same combination of the other columns. Example:

    import pandas as pd
    d = {'company': ['Company1', 'Company1', 'Company1', 'Company1', 'Company2', 'Company2'], 'product': ['Product A', 'Product A', 'Product F', 'Product A', 'Product F', 'Product F'], 'category': ['1', None, '3', '2', None, '5']}
    df = pd.DataFrame(d)

    company product category 0
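One possible approach, sketched here under assumptions rather than taken from the original thread: convert `category` to numbers, compute the per-group maximum with `transform('max')`, and use it only to fill the missing entries. The numeric conversion and the name `category_filled` are choices made for this illustration; the question stores the categories as strings.

```python
import pandas as pd

d = {'company': ['Company1', 'Company1', 'Company1', 'Company1', 'Company2', 'Company2'],
     'product': ['Product A', 'Product A', 'Product F', 'Product A', 'Product F', 'Product F'],
     'category': ['1', None, '3', '2', None, '5']}
df = pd.DataFrame(d)

# Assumption: categories are numeric strings, so compare them as numbers.
cat = pd.to_numeric(df['category'])

# Maximum category per (company, product), aligned to the original rows.
# Missing values are ignored when the maximum is taken.
group_max = cat.groupby([df['company'], df['product']]).transform('max')

# Keep existing values; fill only the gaps with the group maximum.
df['category_filled'] = cat.fillna(group_max)

print(df)
```

A group whose categories are all missing would stay missing, since there is no maximum to fill from.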

Python - Grouping and Assigning Exception Rules

孤人 posted on 2020-08-10 03:38:21
Question: I would like to group by list, first assigning Group 1 if the closest negative diff to 0 is Location 86, and Group 2 if the closest negative diff to 0 is Location 90. Group 3 would apply if Locations 86 and 90 are both the closest. After this pass has run, I would rerun the code, and wherever a group has not been assigned, assignment continues from Group 4 onward so as not to override the previous group assignments. The groupby is occurring based on

How do I use .loc with groupby so that creating a new column based on grouped data won't be considered a copy?

痞子三分冷 posted on 2020-08-10 01:17:51
Question: I have a CSV file with groups of data, and am using the groupby() method to separate them. Each group is processed with some simple math that uses min() and max() on a couple of columns, along with a bit of subtraction and multiplication, to create a new column of data. I then graph each group. This mostly works, but I have two complaints about my code: the graphs are individual, not combined as I would prefer; and I get "SettingWithCopyWarning" with each group. From my
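A common way to avoid assigning into per-group slices is to compute the group statistics with `transform` and create the new column on the original frame. The sketch below assumes that approach; the data, the column names `group` and `x`, and the constant `SCALE` are all hypothetical stand-ins for the CSV and math described in the question.

```python
import pandas as pd

# Hypothetical stand-in for the CSV contents.
df = pd.DataFrame({
    'group': ['a', 'a', 'a', 'b', 'b'],
    'x': [1.0, 2.0, 4.0, 10.0, 30.0],
})

SCALE = 2.0  # hypothetical multiplier

# transform() returns values aligned to df's own index, so the new columns
# are set on df itself and never on a slice of it.
grouped_x = df.groupby('group')['x']
group_min = grouped_x.transform('min')
group_max = grouped_x.transform('max')

df['x_rel'] = (df['x'] - group_min) * SCALE
df['x_span'] = group_max - group_min

print(df)
```

If the per-group loop has to stay, taking an explicit `.copy()` of each group before adding a column is another option that typically avoids the warning.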

Combination of columns for aggregation after groupby

守給你的承諾、 posted on 2020-08-09 19:06:12
Question: Looking for something like

    df.groupby('key').aggregate(combination(columnA, columnB))

instead of

    df['combination'] = combination(columnA, columnB)
    df.groupby('key')['combination'].aggregate()

The only requirement is that the combination of columns is calculated after the groupby.

Description: It seems natural, logically, in some cases to first groupby and then aggregate. One example would be different aggregate functions for different combinations of columns that use the same
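One way to compute the combination after the groupby is `apply`, which hands each group to a function as a DataFrame. This is a sketch under assumptions, not the answer from the original thread: `combination` here is a hypothetical element-wise product, and `sum` stands in for whatever aggregate is wanted.

```python
import pandas as pd

df = pd.DataFrame({
    'key': ['a', 'a', 'b', 'b'],
    'columnA': [1, 2, 3, 4],
    'columnB': [10, 20, 30, 40],
})

def combination(a, b):
    # Hypothetical combination of two columns (element-wise product).
    return a * b

# Each group arrives as a DataFrame holding only the selected columns,
# so the combination is evaluated per group, after the grouping step.
result = (
    df.groupby('key')[['columnA', 'columnB']]
      .apply(lambda g: combination(g['columnA'], g['columnB']).sum())
)

print(result)
```

`apply` calls the Python function once per group, so with many groups the precomputed-column version from the question is usually faster.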

Dynamically merge lines that share the same key into one

纵然是瞬间 posted on 2020-08-09 08:49:47
Question: I have a DataFrame and would like to make another column that combines the columns whose name begins with the same value in Answer and QID. That is to say, here is an excerpt of the dataframe:

       QID  Category    Text           QType   Question                      Answer0                      Answer1
    0  16   Automotive  Access to car  Single  Do you have access to a car?  I own a car/cars             I own a car/cars
    1  16   Automotive  Access to car  Single  Do you have access to a car?  I lease/ have a company car  I lease/have a company car
    2  16   Automotive  Access to car