Postgres: Distinct but only for one column

情深已故 2020-12-02 07:08

I have a table in PostgreSQL containing names (more than 1 million rows), and it has many duplicates. I select three fields: id, name, metadata.

4 answers
  •  庸人自扰
    2020-12-02 07:36

    To do a distinct on only one (or n) column(s):

    select distinct on (name)
        name, col1, col2
    from names
    

    This returns one row per distinct name, but which of the duplicate rows you get is arbitrary. To control which row is returned for each name, add an ORDER BY:

    select distinct on (name)
        name, col1, col2
    from names
    order by name, col1
    

    For each name, this returns the row that comes first when the rows for that name are ordered by col1.
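
    As a small illustration of the ordering rule, here is a sketch using the columns from the question (id, name, metadata). The sample rows are made up for the example, and the temporary table stands in for the real names table:

    ```sql
    -- Hypothetical sample data; table and column names follow the question.
    CREATE TEMP TABLE names (id int, name text, metadata text);
    INSERT INTO names VALUES
        (1, 'alice', 'a-old'),
        (2, 'alice', 'a-new'),
        (3, 'bob',   'b-only');

    -- One row per name; within each name, the row with the lowest id wins.
    SELECT DISTINCT ON (name)
        id, name, metadata
    FROM names
    ORDER BY name, id;
    -- Rows returned: (1, 'alice', 'a-old') and (3, 'bob', 'b-only')
    ```

    The second 'alice' row is dropped because it sorts after the first one within its group.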

    From the PostgreSQL documentation on DISTINCT ON:

    SELECT DISTINCT ON ( expression [, ...] ) keeps only the first row of each set of rows where the given expressions evaluate to equal. The DISTINCT ON expressions are interpreted using the same rules as for ORDER BY (see above). Note that the “first row” of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first.

    The DISTINCT ON expression(s) must match the leftmost ORDER BY expression(s). The ORDER BY clause will normally contain additional expression(s) that determine the desired precedence of rows within each DISTINCT ON group.
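
    The matching rule quoted above can be sketched as follows, again with the question's columns. That a larger id means a newer row is an assumption made only for this example, and the alias deduped is an arbitrary name:

    ```sql
    -- Keep the row with the highest id for each name
    -- (assumes a larger id means a newer row).
    SELECT DISTINCT ON (name)
        id, name, metadata
    FROM names
    ORDER BY name, id DESC;

    -- Rejected by PostgreSQL: the leftmost ORDER BY expression (id)
    -- does not match the DISTINCT ON expression (name).
    -- SELECT DISTINCT ON (name) id, name, metadata FROM names ORDER BY id;

    -- To sort the deduplicated rows by something else, wrap the query
    -- and apply the final ordering in the outer query.
    SELECT *
    FROM (
        SELECT DISTINCT ON (name)
            id, name, metadata
        FROM names
        ORDER BY name, id DESC
    ) AS deduped
    ORDER BY id;
    ```

    The inner ORDER BY decides which row survives in each group; the outer ORDER BY only affects the order of the final result.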
