group-by | 易学教程

SELECT / GROUP BY - sequence of a value

Question: I have a table like this:

    CREATE TABLE tbltest (
      Col1 Integer(10) UNSIGNED,
      Col2 Integer(10) UNSIGNED
    )

And here are the sample values:

    Col1  Col2
    ----------
    1     1
    2     1
    3     1
    4     1
    5     1
    6     2
    7     2
    8     2
    9     2
    10    2
    11    1
    12    1
    13    1
    14    1
    15    1

I want to SELECT SUM(Col1), grouped by consecutive runs of Col2 values rather than by all distinct values of Col2. With the data above, the result would look like this:

    col1  col2
    ----------
    15    1
    40    2
    65    1

I have no idea what the query should look like. Also, forgive me for the bad title; I don't know what this should be called.
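Grouping by consecutive runs of a value, rather than by its distinct values, is commonly called a "gaps and islands" problem. The sketch below illustrates the idea in plain Python rather than SQL, assuming the rows are already ordered by Col1. The names `rows` and `sum_by_runs` are illustrative and do not come from the question. Note that the final run (11 through 15) sums to 65.

```python
from itertools import groupby

# Sample (Col1, Col2) rows from the question, ordered by Col1.
rows = (
    [(i, 1) for i in range(1, 6)]
    + [(i, 2) for i in range(6, 11)]
    + [(i, 1) for i in range(11, 16)]
)

def sum_by_runs(rows):
    """Sum Col1 over each consecutive run of equal Col2 values."""
    result = []
    # groupby only merges adjacent items that share a key, so a Col2
    # value that reappears later starts a new group instead of joining
    # the earlier one.
    for col2, run in groupby(rows, key=lambda r: r[1]):
        result.append((sum(col1 for col1, _ in run), col2))
    return result

print(sum_by_runs(rows))  # [(15, 1), (40, 2), (65, 1)]
```

In SQL, the same grouping is usually obtained by first computing a run identifier for each row (for example with window functions, where the database supports them) and then grouping by that identifier.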

Results from LINQ with SUM for TimeSpan, GROUP and JOIN [closed]

Question (closed 7 years ago): I have the following EF model: I want to get the data in the following view: In the Data entity, Time is a TimeSpan with milliseconds. In result,

Filter and Group Result Between Time Ranges

Question: The timesetup table (unnormalized) holds the time conditions used to filter the result:

    | time_id | period_from | period_to | session1_from | session1_to | session2_from | session2_to | session3_from | session3_to | session4_from | session4_to | session5_from | session5_to |
    |---------|-------------|------------|---------------|-------------|---------------|-------------|---------------|-------------|---------------|-------------|---------------|-------------|
    | 1 | 10/09/2015 | 11

get first record in group by result base on condition

Question: This is my database structure:

    create database testGroupfirst;
    go
    use testGroupfirst;
    go
    create table testTbl (
      id int primary key identity,
      name nvarchar(50),
      year int,
      degree int,
      place nvarchar(50)
    )
    insert into testTbl values ('jack',2015,50,'giza')
    insert into testTbl values ('jack',2016,500,'cai')
    insert into testTbl values ('jack',2017,660,'alex')
    insert into testTbl values ('jack',2018,666,'giza')
    insert into testTbl values ('jack',2011,50,'alex')
    insert into testTbl values ('rami'

Spark (pySpark) groupBy misordering first element on collect_list

Question: I have the following dataframe (df_parquet): DataFrame[id: bigint, date: timestamp, consumption: decimal(38,18)]. I intend to get sorted lists of dates and consumptions using collect_list, as described in this post: "collect_list by preserving order based on another variable". I am following the last approach (https://stackoverflow.com/a/49246162/11841618), which I think is the more efficient one. So instead of just calling repartition with the default number of partitions (200), I call it

Why is it not sufficient to group by a primary key?

Question: Suppose I have a query like this:

    SELECT items.item_id, items.name, GROUP_CONCAT(graphics.graphic_id) AS graphic_ids
    FROM order_items items
    LEFT JOIN order_graphics graphics ON graphics.item_id = items.item_id
    WHERE // etc
    GROUP BY items.item_id

As I understand it, the proper thing to do is to include every unaggregated column in the GROUP BY, like so:

    GROUP BY items.item_id, items.name

This is to prevent records from being lost because MySQL doesn't know how to group them. However, I'm not

MySQL: Optimization GROUP BY multiple keys

Question: I have a table PAYMENTS in a MySQL database:

    CREATE TABLE `PAYMENTS` (
      `ID` BIGINT(20) NOT NULL AUTO_INCREMENT,
      `USER_ID` BIGINT(20) NOT NULL,
      `CATEGORY_ID` BIGINT(20) NOT NULL,
      `AMOUNT` DOUBLE NULL DEFAULT NULL,
      PRIMARY KEY (`ID`),
      INDEX `PAYMENT_INDEX1` (`USER_ID`),
      INDEX `PAYMENT_INDEX2` (`CATEGORY_ID`),
      INDEX `PAYMENT_INDEX3` (`CATEGORY_ID`, `USER_ID`)
    ) ENGINE=InnoDB;

I want to get the total amount for each user in each category. Here is the script:

    select sum(AMOUNT), USER_ID, CATEGORY_ID from

Python: Get most frequent item in list

Question: I've got a list of tuples, and I want to get the most frequently occurring tuple, BUT if there are "joint winners" it should pick between them at random.

    tups = [ (1,2), (3,4), (5,6), (1,2), (3,4) ]

So I want something that would return either (1,2) or (3,4) at random for the above list.

Answer 1: You can first use Counter to find the most repeated tuple. Then find the required tuples, and finally randomize and get the first value.

    from collections import Counter
    import random
    tups = [ (1,2), (3,4),
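The answer's code is cut off in this excerpt. One way to carry out the approach it describes is sketched below. This is not necessarily the answer's exact code, and the function name `most_frequent_random` and its `rng` parameter are illustrative assumptions. It counts occurrences with `Counter`, collects every item that reaches the top count, and picks one of them with `random.choice`.

```python
import random
from collections import Counter

def most_frequent_random(items, rng=random):
    """Return one of the most frequent items, chosen uniformly among ties.

    Raises ValueError if `items` is empty (max() of an empty sequence).
    """
    counts = Counter(items)
    top = max(counts.values())
    # Every item whose count equals the maximum is a joint winner.
    winners = [item for item, n in counts.items() if n == top]
    return rng.choice(winners)

tups = [(1, 2), (3, 4), (5, 6), (1, 2), (3, 4)]
print(most_frequent_random(tups))  # either (1, 2) or (3, 4)
```

Passing a seeded `random.Random` instance as `rng` makes the tie-break reproducible, which can help in tests.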

Groupby two columns ignoring order of pairs

Question: Suppose we have a dataframe that looks like this:

      start stop duration
    0 A     B    1
    1 B     A    2
    2 C     D    2
    3 D     C    0

What's the best way to construct a list of: i) start/stop pairs; ii) count of start/stop pairs; iii) average duration of start/stop pairs? In this case, order should not matter: (A,B)=(B,A). Desired output: [[start,stop,count,avg duration]]. In this example: [[A,B,2,1.5],[C,D,2,1]]

Answer 1: Sort the first two columns (you can do this in-place, or create a copy and do the same thing; I've done the
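The key step in the answer is to normalize each pair so that (A,B) and (B,A) become the same key. The sketch below shows that step in plain Python with the standard library, rather than with the dataframe library used in the question. The names `records` and `summarize_pairs` are illustrative assumptions.

```python
from collections import defaultdict

# (start, stop, duration) rows from the question.
records = [("A", "B", 1), ("B", "A", 2), ("C", "D", 2), ("D", "C", 0)]

def summarize_pairs(records):
    """Return [start, stop, count, average duration] per unordered pair."""
    durations = defaultdict(list)
    for start, stop, duration in records:
        # Sorting the two endpoints makes (A, B) and (B, A) share one key.
        key = tuple(sorted((start, stop)))
        durations[key].append(duration)
    return [
        [a, b, len(d), sum(d) / len(d)]
        for (a, b), d in sorted(durations.items())
    ]

print(summarize_pairs(records))  # [['A', 'B', 2, 1.5], ['C', 'D', 2, 1.0]]
```

With a dataframe, the same normalization can be applied to the two columns before an ordinary group-by on them.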
