Is there a [straightforward] way to order results *first*, *then* group by another column, with SQL?

十年热恋 submitted on 2019-12-01 06:01:56
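-- Relies on non-standard behavior (b is neither grouped nor aggregated), so the row picked for each a is not guaranteed.
select a, b from (select a, b from tbl order by b) as c group by a;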

Yes, grouping is done first, and it affects only a single select, whereas ordering affects the combined results of all the select statements in a union, such as:

select a, 'max', max(b) from tbl group by a
union all select a, 'min', min(b) from tbl group by a
order by 1, 2

(using field numbers in order by since I couldn't be bothered to name my columns). Each group by affects only its select, the order by affects the combined result set.
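To see the scoping described above in action, here is a minimal sketch that runs the same union against an in-memory SQLite database through Python's sqlite3 module. The choice of SQLite and the sample rows are assumptions made only for illustration; the original answer does not name a DBMS for this query.

```python
import sqlite3

# Hypothetical sample data; the rows are deliberately inserted out of order.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (a INTEGER, b INTEGER)")
conn.executemany("INSERT INTO tbl VALUES (?, ?)", [(2, 5), (1, 7), (2, 9), (1, 3)])

# Each GROUP BY applies to its own SELECT; the single ORDER BY at the end
# sorts the combined result of both branches.
rows = conn.execute("""
    SELECT a, 'max', MAX(b) FROM tbl GROUP BY a
    UNION ALL
    SELECT a, 'min', MIN(b) FROM tbl GROUP BY a
    ORDER BY 1, 2
""").fetchall()

for row in rows:
    print(row)
```

The 'max' and 'min' rows for each value of a come out interleaved, which shows that the ordering was applied after the two grouped selects were combined.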

It seems that what you're after can be achieved with:

select A, max(B) from tbl group by A

This uses the max aggregate function to do, in effect, your pre-group ordering (no decent DBMS actually sorts the rows; it simply picks the maximum, using a suitable index if one is available).

SELECT DISTINCT a,b
FROM tbl t
WHERE b = (SELECT MAX(b) FROM tbl WHERE tbl.a = t.a);
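As a rough check, the correlated-subquery form above and a plain GROUP BY with MAX should return the same rows when only a and b are selected. The sketch below assumes SQLite (via Python's sqlite3 module) and made-up sample rows, including a duplicated maximum, purely for illustration.

```python
import sqlite3

# Hypothetical sample data: a = 1 has its maximum b (7) stored twice.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (a INTEGER, b INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    [(1, 3), (1, 7), (1, 7), (2, 9), (2, 5)],
)

# Aggregate form: one row per a with its largest b.
grouped = conn.execute(
    "SELECT a, MAX(b) FROM tbl GROUP BY a ORDER BY a"
).fetchall()

# Correlated form: keep rows whose b equals the maximum for their a;
# DISTINCT collapses the duplicated (1, 7) row.
correlated = conn.execute("""
    SELECT DISTINCT a, b
    FROM tbl t
    WHERE b = (SELECT MAX(b) FROM tbl WHERE tbl.a = t.a)
    ORDER BY a
""").fetchall()

print(grouped)
print(correlated)
```

The correlated form becomes the more useful of the two once other columns from the matching row are needed, since those cannot simply be added to the GROUP BY version.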

According to your new rules (tested with PostgreSQL)


Query You'd Want:

SELECT    pr.phone_nr, pr.payed_ts, pr.payed_until_ts 
FROM      payment_receipts pr
JOIN      users
          ON (pr.phone_nr = users.phone_nr)
JOIN      (select phone_nr, max(payed_until_ts) as payed_until_ts 
           from payment_receipts 
           group by phone_nr
          ) sub
          ON (    pr.phone_nr       = sub.phone_nr 
              AND pr.payed_until_ts = sub.payed_until_ts)
ORDER BY  pr.phone_nr, pr.payed_ts, pr.payed_until_ts;


Original Answer (with updates):

CREATE TABLE foo (a NUMERIC, b TEXT, c DATE);

INSERT INTO foo VALUES 
   (1,'a','2010-07-30'),
   (1,'b','2010-07-30'),
   (1,'c','2010-07-31'),
   (1,'d','2010-07-31'),
   (1,'a','2010-07-29'),
   (1,'c','2010-07-29'),
   (2,'a','2010-07-29'),
   (2,'a','2010-08-01');

-- table contents
SELECT * FROM foo ORDER BY c,a,b;
 a | b |     c      
---+---+------------
 1 | a | 2010-07-29
 1 | c | 2010-07-29
 2 | a | 2010-07-29
 1 | a | 2010-07-30
 1 | b | 2010-07-30
 1 | c | 2010-07-31
 1 | d | 2010-07-31
 2 | a | 2010-08-01

-- The following solutions both retrieve records based on the latest date
--    they both return the same result set, solution 1 is faster, solution 2
--    is easier to read

-- Solution 1: 
SELECT    foo.a, foo.b, foo.c 
FROM      foo
JOIN      (select a, max(c) as c from foo group by a) bar
  ON      (foo.a=bar.a and foo.c=bar.c)
ORDER BY  foo.a, foo.b, foo.c;

-- Solution 2: 
SELECT    a, b, MAX(c) AS c 
FROM      foo main
GROUP BY  a, b
HAVING    MAX(c) = (select max(c) from foo sub where main.a=sub.a group by a)
ORDER BY  a, b;

 a | b |     c      
---+---+------------
 1 | c | 2010-07-31
 1 | d | 2010-07-31
 2 | a | 2010-08-01
(3 rows)  


Comment:
a = 1 is returned twice because there are multiple b values sharing its latest date. This is acceptable (and advised). Your data should never have this problem, because c is based on b's value.
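The tie described in the comment can be reproduced with the foo data from this answer. The sketch below is an illustration only: it assumes SQLite through Python's sqlite3 module, stores the dates as ISO text (which sorts the same way as dates), and adds a hypothetical tie-breaker, MIN(b), for the case where exactly one row per a is wanted. That tie-breaker is not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (a INTEGER, b TEXT, c TEXT)")
conn.executemany("INSERT INTO foo VALUES (?, ?, ?)", [
    (1, 'a', '2010-07-30'), (1, 'b', '2010-07-30'),
    (1, 'c', '2010-07-31'), (1, 'd', '2010-07-31'),
    (1, 'a', '2010-07-29'), (1, 'c', '2010-07-29'),
    (2, 'a', '2010-07-29'), (2, 'a', '2010-08-01'),
])

# Solution 1 from the answer: join back to the per-a maximum date.
latest = """
    SELECT foo.a, foo.b, foo.c
    FROM foo
    JOIN (SELECT a, MAX(c) AS c FROM foo GROUP BY a) bar
      ON foo.a = bar.a AND foo.c = bar.c
"""
with_ties = conn.execute(latest + " ORDER BY foo.a, foo.b").fetchall()

# Assumed tie-breaker: keep the smallest b among rows sharing the latest date.
one_per_a = conn.execute("""
    SELECT foo.a, MIN(foo.b) AS b, foo.c
    FROM foo
    JOIN (SELECT a, MAX(c) AS c FROM foo GROUP BY a) bar
      ON foo.a = bar.a AND foo.c = bar.c
    GROUP BY foo.a, foo.c
    ORDER BY foo.a
""").fetchall()

print(with_ties)
print(one_per_a)
```

Whether collapsing ties is appropriate depends on the data; as the comment notes, returning every row that shares the latest date is usually the safer default.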

create table user_payments
(
    phone_nr int NOT NULL,
    payed_until_ts datetime NOT NULL
);

insert into user_payments
(phone_nr, payed_until_ts)
values
(1, '2016-01-28'), -- today
(1, '2016-01-27'), -- yesterday  
(2, '2016-01-27'), -- yesterday 
(2, '2016-01-29'); -- tomorrow

select phone_nr, MAX(payed_until_ts) as latest_payment
from user_payments
group by phone_nr;

-- OUTPUT:
-- phone_nr latest_payment
-- 1        2016-01-28 00:00:00.000
-- 2        2016-01-29 00:00:00.000

In the above example I used a datetime column, but a similar query should work for a timestamp column.

The MAX function effectively does the work of an "ORDER BY" on the payed_until_ts column and picks the latest value for each phone_nr. Because of the "GROUP BY" clause, you also get exactly one row per phone_nr.
