group-by | 易学教程

How to apply rolling functions in a group by object in pandas


Question: I'm having difficulty solving a look-back or roll-over problem in a dataframe, or perhaps in groupby. The following is a simple example of the dataframe I have:

              fruit   amount
    20140101  apple        3
    20140102  apple        5
    20140102  orange      10
    20140104  banana       2
    20140104  apple       10
    20140104  orange       4
    20140105  orange       6
    20140105  grape        1
    …
    20141231  apple        3
    20141231  grape        2

I need to calculate the average value of 'amount' for each fruit over the previous 3 days, for every day, and create the following data frame: fruit
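One possible direction, as a minimal sketch rather than the accepted answer: parse the integer-like dates into real timestamps, then apply a time-based rolling window inside each fruit group. The column name `date`, the result name `avg_3d`, and the reading of "previous 3 days" as a 3-calendar-day window that includes the current day are all assumptions, since the excerpt is cut off before the desired output.

```python
import pandas as pd

# Sample rows copied from the question; the "date" column name is an assumption.
df = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["20140101", "20140102", "20140102", "20140104",
             "20140104", "20140104", "20140105", "20140105"],
            format="%Y%m%d",
        ),
        "fruit": ["apple", "apple", "orange", "banana",
                  "apple", "orange", "orange", "grape"],
        "amount": [3, 5, 10, 2, 10, 4, 6, 1],
    }
)

# A time-based window needs a sorted DatetimeIndex. "3D" covers the 3 calendar
# days ending on (and including) each row's date, evaluated per fruit.
rolled = (
    df.set_index("date")
      .sort_index()
      .groupby("fruit")["amount"]
      .rolling("3D")
      .mean()
      .rename("avg_3d")   # hypothetical name for the result column
      .reset_index()
)

print(rolled)
```

If the current day should be excluded from the average, passing `closed="left"` to `rolling` may be closer to what the question wants; the truncated excerpt does not say which is intended.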


PostgreSQL: joining arrays within group by clause


Question: We have a problem grouping arrays into a single array. We want to join the values from two columns into one single array and aggregate these arrays across multiple rows. Given the following input:

    | id | name | col_1 | col_2 |
    | 1  | a    | 1     | 2     |
    | 2  | a    | 3     | 4     |
    | 4  | b    | 7     | 8     |
    | 3  | b    | 5     | 6     |

We want the following output:

    | a | { 1, 2, 3, 4 } |
    | b | { 5, 6, 7, 8 } |

The order of the elements is important and should correlate with the id of the aggregated rows. We tried the array_agg function

Group by one columns and find sum and max value for another in pandas


Question: I have a dataframe like this:

    Name  id   col1  col2  col3  col4
    PL    252     0   747     3    53
    PL2   252     1    24     2    35
    PL3   252     4    75    24    13
    AD    889    53    24     0    95
    AD2   889    23     2     0    13
    AD3   889     0    24     3     6
    BG    024    12    89    53    66
    BG1   024    43    16    13     0
    BG2   024     5    32   101     4

Now I need to group by id. For columns col1 and col4, I want the sum for each id, placed in a new column next to the parent column (example: col3(sum)). For col2 and col3, I want the max value instead. Desired output:

    Name id col1 col1(sum) col2 col2(max) col3 col3(max) col4 col4
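A hedged sketch of one way this could be done: `GroupBy.transform` returns a result aligned to the original rows, so each per-id aggregate can be written next to its parent column. The sketch treats the last column as `col4`, keeps `id` as a string so that `024` survives, and assumes the new columns are named `col(sum)` / `col(max)`; the excerpt is truncated, so the exact desired layout is a guess.

```python
import pandas as pd

# Data copied from the question; ids are strings so the leading zero is kept.
df = pd.DataFrame({
    "Name": ["PL", "PL2", "PL3", "AD", "AD2", "AD3", "BG", "BG1", "BG2"],
    "id": ["252"] * 3 + ["889"] * 3 + ["024"] * 3,
    "col1": [0, 1, 4, 53, 23, 0, 12, 43, 5],
    "col2": [747, 24, 75, 24, 2, 24, 89, 16, 32],
    "col3": [3, 2, 24, 0, 0, 3, 53, 13, 101],
    "col4": [53, 35, 13, 95, 13, 6, 66, 0, 4],
})

# Which aggregate goes with which column, as described in the question.
aggs = {"col1": "sum", "col2": "max", "col3": "max", "col4": "sum"}

grouped = df.groupby("id")
out = df[["Name", "id"]].copy()
for col, how in aggs.items():
    out[col] = df[col]
    # transform broadcasts the per-id aggregate back onto every row of the group
    out[f"{col}({how})"] = grouped[col].transform(how)

print(out)
```

Building `out` column by column is only there to place each aggregate directly after its parent; assigning the transformed columns straight onto `df` would work as well if column order does not matter.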


How to Use Group By clause when we use Aggregate function in the Joins?


Question: I want to join three tables and calculate the Sum(Quantity) of Table A. I tried something and got the desired output, but I am still confused about how aggregate functions and the Group By clause work together. When calculating a sum across two or more joined tables, which columns do we need to list in the Group By clause, and why do we need to list them? For example, here are my tables and the desired query. TableA: ItemID, JobOrderID, CustomerID, DivisionID, Quantity TableB:

Python Pandas Choosing Random Sample of Groups from Groupby


Question: What is the best way to get a random sample of the elements of a groupby? As I understand it, a groupby is just an iterable over groups. The standard way I would do this for an iterable, if I wanted to select N = 200 elements, is:

    rand = random.sample(data, N)

If you attempt the above where data is a 'grouped' object, the elements of the resulting list are tuples for some reason. I found the below example for randomly selecting the elements of a single key groupby, however this does not work with a
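As a minimal sketch under stated assumptions: iterating a pandas GroupBy yields `(key, sub-frame)` pairs, which would explain the tuples. Sampling the group keys and then fetching each chosen group sidesteps that. The toy frame, the column names `key` and `value`, and the sample size `n = 2` are hypothetical stand-ins for the asker's data.

```python
import random

import pandas as pd

# Hypothetical toy data standing in for the asker's frame.
df = pd.DataFrame({
    "key": ["a", "a", "b", "b", "c", "c", "d"],
    "value": range(7),
})
grouped = df.groupby("key")

n = 2            # number of groups to draw (the question uses N = 200)
random.seed(0)   # only to make the sketch repeatable

# grouped.groups maps each group key to its row labels, so its keys are
# exactly the things we want to sample from.
sampled_keys = random.sample(list(grouped.groups), n)

# Stitch the chosen groups back together into one frame.
sample = pd.concat([grouped.get_group(k) for k in sampled_keys])

print(sampled_keys)
print(sample)
```

Because the sampling happens over keys rather than over the GroupBy object itself, every row of each chosen group is kept and no tuples appear in the result.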

SQL Server Weird Grouping Scenario by multiple columns and OR


Question: I have a weird grouping scenario and am having some trouble figuring out the best way to do the grouping in SQL. Imagine we have the following table:

    CREATE TABLE Item (
        KeyId VARCHAR(1) NOT NULL,
        Col1 INT NULL,
        Col2 INT NULL,
        Col3 INT NULL
    )
    GO

    INSERT INTO Item (KeyId, Col1, Col2, Col3)
    VALUES
        ('a',1,2,3),
        ('b',5,4,3),
        ('c',5,7,6),
        ('d',8,7,9),
        ('e',11,10,9),
        ('f',11,12,13),
        ('g',20,22,21),
        ('h',23,22,24)

I need to group records in this table so that if Col1 OR Col2 OR Col3 is the same


MySQL: “order by” inside of “group by”


Question: I have a MySQL table named names, which consists of two fields: name and rank. The name value is not unique and can have multiple matches. The problem: I want to select records grouped by name, but if there is more than one row with the same name, the one with the highest rank should be taken. An example:

    Tom 2
    Ben 1
    Ben 2

    SELECT * FROM names GROUP BY name ORDER BY rank DESC

Usually returns:

    Tom 2
    Ben 1

I need:

    Tom 2
    Ben 2

Since there are two Bens, but the second one has a higher rank. It seems, that MySQL
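A small, hedged illustration of the usual aggregate approach, which is to ask for `MAX(rank)` per name instead of relying on which row `GROUP BY` happens to keep. It runs against an in-memory SQLite database through Python's `sqlite3` module purely as a stand-in, since no MySQL server is assumed here; the alias `top_rank` is hypothetical, and MySQL would quote the column with backticks rather than double quotes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# "rank" is quoted defensively: it is a reserved word in some SQL dialects.
conn.execute('CREATE TABLE names (name TEXT, "rank" INTEGER)')
conn.executemany(
    "INSERT INTO names VALUES (?, ?)",
    [("Tom", 2), ("Ben", 1), ("Ben", 2)],  # rows from the question
)

# One row per name, carrying the highest rank seen for that name.
rows = conn.execute(
    'SELECT name, MAX("rank") AS top_rank '
    "FROM names GROUP BY name ORDER BY top_rank DESC, name"
).fetchall()
conn.close()

print(rows)
```

This only returns the grouped column and the aggregate. If other columns from the winning row are needed, a join back to the table on name and rank is the commonly suggested extension, which is not shown here.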