group-by

How to apply rolling functions in a group by object in pandas

China☆狼群, posted on 2019-12-21 12:59:29
Question: I'm having difficulty solving a look-back (rolling) problem on a DataFrame, or perhaps on a groupby. The following is a simple example of the DataFrame I have:

              fruit   amount
    20140101  apple   3
    20140102  apple   5
    20140102  orange  10
    20140104  banana  2
    20140104  apple   10
    20140104  orange  4
    20140105  orange  6
    20140105  grape   1
    …
    20141231  apple   3
    20141231  grape   2

I need to calculate the average 'amount' of each fruit over the previous 3 days, for every day, and create the following data frame: fruit
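One possible way to approach the per-fruit look-back described in this question is a time-based rolling window applied inside a groupby. The sketch below is not taken from the original thread. It assumes the index holds dates in YYYYMMDD form, that "previous 3 days" means a 3-calendar-day window that includes the current day, and it introduces the column names `date` and `avg_3d` purely for illustration.

```python
import pandas as pd

# Sample rows copied from the question; the "date" column name is an assumption.
df = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["20140101", "20140102", "20140102", "20140104",
             "20140104", "20140104", "20140105", "20140105"],
            format="%Y%m%d",
        ),
        "fruit": ["apple", "apple", "orange", "banana",
                  "apple", "orange", "orange", "grape"],
        "amount": [3, 5, 10, 2, 10, 4, 6, 1],
    }
)

# A time-based window needs a sorted DatetimeIndex.
# "3D" covers the interval (t - 3 days, t], so the current day is included.
rolled = (
    df.set_index("date")
      .sort_index()
      .groupby("fruit")["amount"]
      .rolling("3D")
      .mean()
      .rename("avg_3d")      # hypothetical output column name
      .reset_index()
)

print(rolled)
```

If the current day should be excluded from the average, passing `closed="left"` to `rolling` is one option worth trying, since it changes which end of the window is included.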

PostgreSQL: joining arrays within group by clause

泄露秘密, posted on 2019-12-21 12:57:53
Question: We have a problem grouping arrays into a single array. We want to join the values from two columns into one array and aggregate these arrays across multiple rows. Given the following input:

    | id | name | col_1 | col_2 |
    | 1  | a    | 1     | 2     |
    | 2  | a    | 3     | 4     |
    | 4  | b    | 7     | 8     |
    | 3  | b    | 5     | 6     |

We want the following output:

    | a | { 1, 2, 3, 4 } |
    | b | { 5, 6, 7, 8 } |

The order of the elements is important and should correlate with the id of the aggregated rows. We tried the array_agg function
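The snippet below is not a PostgreSQL solution. It is a small Python reference computation of the output the question asks for, which may be handy for checking a query's result. The function name `merge_by_name` is hypothetical, and the logic assumes that "correlate with the id" means rows are processed in ascending id order.

```python
from collections import defaultdict

# Input rows from the question: (id, name, col_1, col_2).
rows = [
    (1, "a", 1, 2),
    (2, "a", 3, 4),
    (4, "b", 7, 8),
    (3, "b", 5, 6),
]

def merge_by_name(rows):
    """Concatenate (col_1, col_2) per name, visiting rows in ascending id order."""
    merged = defaultdict(list)
    for _id, name, col_1, col_2 in sorted(rows, key=lambda r: r[0]):
        merged[name].extend([col_1, col_2])
    return dict(merged)

result = merge_by_name(rows)
print(result)
```

Note that row 3 is processed before row 4 even though it appears later in the input, which is why the values for b come out as 5, 6, 7, 8.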

Group by one columns and find sum and max value for another in pandas

。_饼干妹妹, posted on 2019-12-21 12:47:24
Question: I have a dataframe like this:

    Name  id   col1  col2  col3  col4
    PL    252  0     747   3     53
    PL2   252  1     24    2     35
    PL3   252  4     75    24    13
    AD    889  53    24    0     95
    AD2   889  23    2     0     13
    AD3   889  0     24    3     6
    BG    024  12    89    53    66
    BG1   024  43    16    13    0
    BG2   024  5     32    101   4

Now I need to group by id. For columns col1 and col4, find the sum for each id and put it into a new column next to the parent column (example: col3(sum)). For col2 and col3, find the max value instead. Desired output: Name id col1 col1(sum) col2 col2(max) col3 col3(max) col4 col4
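A minimal sketch of one way to get per-group aggregates placed next to their source columns, using `GroupBy.transform` so that each row keeps its own value alongside the group value. This is not the accepted answer from the thread. It assumes `id` is stored as a string (to preserve the leading zero in 024) and that the truncated last column of the desired output is `col4(sum)`.

```python
import pandas as pd

# Data copied from the question; "id" is kept as a string (assumption).
df = pd.DataFrame(
    {
        "Name": ["PL", "PL2", "PL3", "AD", "AD2", "AD3", "BG", "BG1", "BG2"],
        "id": ["252"] * 3 + ["889"] * 3 + ["024"] * 3,
        "col1": [0, 1, 4, 53, 23, 0, 12, 43, 5],
        "col2": [747, 24, 75, 24, 2, 24, 89, 16, 32],
        "col3": [3, 2, 24, 0, 0, 3, 53, 13, 101],
        "col4": [53, 35, 13, 95, 13, 6, 66, 0, 4],
    }
)

# Which aggregate to broadcast back for each column.
aggs = {"col1": "sum", "col2": "max", "col3": "max", "col4": "sum"}

grouped = df.groupby("id")
out = df[["Name", "id"]].copy()
for col, func in aggs.items():
    out[col] = df[col]
    # transform returns one value per original row, aligned on the index.
    out[f"{col}({func})"] = grouped[col].transform(func)

print(out)
```

Building the frame column by column keeps each aggregate directly to the right of its parent column, which matches the layout sketched in the question.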

How to Use Group By clause when we use Aggregate function in the Joins?

[亡魂溺海], posted on 2019-12-21 10:49:55
Question: I want to join three tables and calculate Sum(Quantity) from Table A. I tried something and got the desired output, but I am still confused about aggregate functions and the Group By clause. When calculating a sum across two or more joined tables, which columns do we need to list in the Group By clause, and why do we need to list them? For example, here are my tables and the desired query. TableA: ItemID, JobOrderID, CustomerID, DivisionID, Quantity TableB:

Python Pandas Choosing Random Sample of Groups from Groupby

江枫思渺然, posted on 2019-12-21 09:21:42
Question: What is the best way to get a random sample of the elements of a groupby? As I understand it, a groupby is just an iterable over groups. The standard way I would do this for an iterable, if I wanted to select N = 200 elements, is: rand = random.sample(data, N). If you attempt the above where data is a grouped object, the elements of the resulting list are tuples for some reason. I found the example below for randomly selecting the elements of a single-key groupby, however this does not work with a
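Iterating over a pandas groupby yields (key, group) pairs, which explains the tuples mentioned in the question. One way around that is to sample the group keys and then fetch the matching groups. The sketch below is an illustration under assumptions, not the thread's answer: the helper `sample_groups`, the toy frame, and the `seed` parameter are all hypothetical.

```python
import random

import pandas as pd

# Toy data with four groups of two rows each (illustrative only).
df = pd.DataFrame({"key": list("aabbccdd"), "value": range(8)})

def sample_groups(frame, by, n, seed=None):
    """Return the rows of n randomly chosen groups of frame.groupby(by)."""
    grouped = frame.groupby(by)
    rng = random.Random(seed)
    # grouped.groups maps each group key to the row labels of that group.
    chosen = rng.sample(list(grouped.groups), n)
    return pd.concat([grouped.get_group(k) for k in chosen])

sampled = sample_groups(df, "key", n=2, seed=0)
print(sampled)
```

Sampling keys rather than rows keeps every selected group intact, so the result always contains complete groups.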

SQL Server Weird Grouping Scenario by multiple columns and OR

落花浮王杯, posted on 2019-12-21 09:18:53
Question: I have a weird grouping scenario and am having trouble finding the best way to do the grouping in SQL. Imagine we have the following table:

    CREATE TABLE Item (
        KeyId VARCHAR(1) NOT NULL,
        Col1 INT NULL,
        Col2 INT NULL,
        Col3 INT NULL
    )
    GO

    INSERT INTO Item (KeyId, Col1, Col2, Col3)
    VALUES
        ('a',1,2,3),
        ('b',5,4,3),
        ('c',5,7,6),
        ('d',8,7,9),
        ('e',11,10,9),
        ('f',11,12,13),
        ('g',20,22,21),
        ('h',23,22,24)

I need to group records in this table so that if Col1 OR Col2 OR Col3 is the same
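The question text is cut off, so the sketch below rests on a stated assumption: two rows belong to the same group when they share a value in the same column, and that relation is applied transitively. Under that reading the problem is a connected-components search, shown here in Python with a small union-find rather than in SQL. The function name `group_by_any_column` is hypothetical, and NULLs are assumed never to match.

```python
# Rows from the INSERT statement in the question: (KeyId, Col1, Col2, Col3).
rows = [
    ("a", 1, 2, 3), ("b", 5, 4, 3), ("c", 5, 7, 6), ("d", 8, 7, 9),
    ("e", 11, 10, 9), ("f", 11, 12, 13), ("g", 20, 22, 21), ("h", 23, 22, 24),
]

def group_by_any_column(rows):
    """Group keys whose rows share a value in the same column, transitively."""
    parent = {key: key for key, *_ in rows}

    def find(x):
        # Follow parent links to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # (column position, value) -> first key carrying that value
    for key, *cols in rows:
        for pos, value in enumerate(cols):
            if value is None:  # assumption: NULL never matches anything
                continue
            owner = first_seen.setdefault((pos, value), key)
            parent[find(key)] = find(owner)  # merge the two components

    groups = {}
    for key, *_ in rows:
        groups.setdefault(find(key), []).append(key)
    return sorted(groups.values())

groups = group_by_any_column(rows)
print(groups)
```

With the sample data, a and b share Col3, b and c share Col1, c and d share Col2, and so on, so the chain a through f collapses into one group while g and h form another.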

MySQL: “order by” inside of “group by”

假装没事ソ, posted on 2019-12-21 09:06:11
Question: I have a MySQL table names, which consists of two fields: name and rank. The name value is not unique and can have multiple matches. The problem: I want to select records grouped by name, but if a name occurs more than once, the one with the highest rank should be taken. An example:

    Tom 2
    Ben 1
    Ben 2

SELECT * FROM names GROUP BY name ORDER BY rank DESC usually returns:

    Tom 2
    Ben 1

I need:

    Tom 2
    Ben 2

since there are two Bens, and the second one has the higher rank. It seems that MySQL
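For a table with only these two fields, one common approach is to ask for the maximum rank per name with an aggregate instead of relying on row order inside GROUP BY. The sketch below runs that query against an in-memory SQLite database through Python's `sqlite3` module as a stand-in for MySQL, which is an assumption made only so the example is self-contained. The alias `top_rank` is hypothetical, and `rank` is quoted because it is a reserved word in some SQL dialects.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE names (name TEXT, "rank" INTEGER)')
conn.executemany(
    "INSERT INTO names VALUES (?, ?)",
    [("Tom", 2), ("Ben", 1), ("Ben", 2)],
)

# MAX() picks the highest rank within each name group.
rows = conn.execute(
    'SELECT name, MAX("rank") AS top_rank '
    "FROM names GROUP BY name "
    "ORDER BY top_rank DESC, name"
).fetchall()
conn.close()

print(rows)
```

If the table had further columns that must come from the same row as the highest rank, an aggregate alone would not be enough, and a join back to the table on name and maximum rank would be one direction to explore.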