How to order distinct tuples in a PostgreSQL query

China☆狼群 提交于 2019-12-05 15:10:54

The leftmost ORDER BY items cannot disagree with the items of the DISTINCT clause. I quote the manual about DISTINCT:

The DISTINCT ON expression(s) must match the leftmost ORDER BY expression(s). The ORDER BY clause will normally contain additional expression(s) that determine the desired precedence of rows within each DISTINCT ON group.

Try:

SELECT *
FROM  (
    SELECT DISTINCT ON (c.cluster_id, feed_id) 
           c.cluster_id, num_docs, feed_id, url_time 
    FROM   url_info u
    JOIN   cluster_info c ON (c.cluster_id = u.cluster_id) 
    WHERE  feed_id IN (SELECT pot_seeder FROM potentials) 
    AND    num_docs > 5
    AND    url_time > '2012-04-16'
    ORDER  BY c.cluster_id, feed_id, num_docs, url_time
           -- first columns match DISTINCT
           -- the rest to pick certain values for dupes
           -- or did you want to pick random values for dupes?
    ) x
ORDER  BY num_docs DESC;

Or use GROUP BY:

SELECT c.cluster_id
     , num_docs
     , feed_id
     , url_time 
FROM   url_info u
JOIN   cluster_info c ON (c.cluster_id = u.cluster_id) 
WHERE  feed_id IN (SELECT pot_seeder FROM potentials) 
AND    num_docs > 5
AND    url_time > '2012-04-16'
GROUP  BY c.cluster_id, feed_id 
ORDER  BY num_docs DESC;

If c.cluster_id, feed_id are the primary key columns of all (both in this case) tables that you include columns from in the SELECT list, then this just works with PostgreSQL 9.1 or later.

Else you need to GROUP BY the rest of the columns or aggregate or provide more information.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!