Group only certain rows with GROUP BY

Submitted by 爱⌒轻易说出口 on 2019-12-06 06:35:44
Gyrocode.com

According to this answer by @axiac, a better solution in terms of compatibility and performance is shown below.

It is also explained in the SQL Antipatterns book, Chapter 15: Ambiguous Groups.

To improve performance, a combined index on (group_id, price, id) is also added.

SOLUTION

SELECT a.id, a.name, a.group_id, a.price
FROM items a
LEFT JOIN items b
ON a.group_id = b.group_id
AND (a.price > b.price OR (a.price = b.price AND a.id > b.id))
WHERE b.price IS NULL;

See explanation on how it works for more details.

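As a side effect, this query also works for my case, where I needed to include ALL records whose group_id is NULL AND the one item with the lowest price from each group. The comparison a.group_id = b.group_id is never true when group_id is NULL, so those rows find no join partner and pass the b.price IS NULL filter.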

RESULT

+----+--------+----------+-------+
| id | name   | group_id | price |
+----+--------+----------+-------+
|  1 | Item A |     NULL | 10.00 | 
|  2 | Item B |     NULL | 20.00 | 
|  3 | Item C |     NULL | 30.00 | 
|  4 | Item D |        1 | 40.00 | 
|  5 | Item E |        2 | 50.00 | 
+----+--------+----------+-------+
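
The result above can be reproduced with a small script. This is only a sketch under stated assumptions: it runs on SQLite through Python's sqlite3 module rather than on the MySQL server used in the answer, rows 6 and 7 are invented so that each group contains a more expensive item, and the index name items_group_price_id_idx is hypothetical.

```python
import sqlite3

# Hypothetical sample data: rows 1-5 mirror the RESULT table,
# rows 6 and 7 are invented so each group has a more expensive item.
ROWS = [
    (1, "Item A", None, 10.0),
    (2, "Item B", None, 20.0),
    (3, "Item C", None, 30.0),
    (4, "Item D", 1, 40.0),
    (5, "Item E", 2, 50.0),
    (6, "Item F", 1, 60.0),
    (7, "Item G", 2, 70.0),
]

SQL = """
    SELECT a.id, a.name, a.group_id, a.price
    FROM items a
    LEFT JOIN items b
      ON a.group_id = b.group_id
     AND (a.price > b.price OR (a.price = b.price AND a.id > b.id))
    WHERE b.price IS NULL   -- keep rows for which no cheaper row exists
    ORDER BY a.id
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items ("
    "id INTEGER PRIMARY KEY, name TEXT, group_id INTEGER, price REAL NOT NULL)"
)
# Combined index as described in the answer (the name is an assumption).
conn.execute(
    "CREATE INDEX items_group_price_id_idx ON items (group_id, price, id)"
)
conn.executemany("INSERT INTO items VALUES (?, ?, ?, ?)", ROWS)

result = conn.execute(SQL).fetchall()
for row in result:
    print(row)
```

Note that the b.price IS NULL test only identifies unmatched rows reliably when price itself cannot be NULL, which is why the sketch declares the column NOT NULL.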

EXPLAIN

+----+-------------+-------+------+-------------------------------+--------------------+---------+----------------------------+------+--------------------------+
| id | select_type | table | type | possible_keys                 | key                | key_len | ref                        | rows | Extra                    |
+----+-------------+-------+------+-------------------------------+--------------------+---------+----------------------------+------+--------------------------+
|  1 | SIMPLE      | a     | ALL  | NULL                          | NULL               | NULL    | NULL                       |    7 |                          | 
|  1 | SIMPLE      | b     | ref  | PRIMARY,id,items_group_id_idx | items_group_id_idx | 5       | agi_development.a.group_id |    1 | Using where; Using index | 
+----+-------------+-------+------+-------------------------------+--------------------+---------+----------------------------+------+--------------------------+

If group_id is always positive, you can simplify the query and do without GUID/RAND:

SELECT id, name, min(price) FROM items
GROUP BY COALESCE(group_id, -id); -- id is already unique

But neither query returns a correct result if you change the order of the inserts. I'll add a Fiddle when it's working again...

Gordon's query should work as expected, or you can use an old trick to fetch another column along with the MIN: piggybacking.

You concatenate multiple columns into one fixed-length string, with the MIN column first, and apply MIN to that string. In the next step you extract the columns again using a matching SUBSTRING:

SELECT
   CASE WHEN grp > 0 THEN grp ELSE NULL END AS group_id
   ,CAST(SUBSTRING(x FROM 1 FOR 13) AS DECIMAL(10,2)) AS price
   ,SUBSTRING(x FROM 24) AS NAME
FROM
 (
   SELECT COALESCE(group_id, -id) AS grp
      -- results in a string like this
      -- '        50.00         5Item E'
      ,MIN(LPAD(CAST(price AS VARCHAR(13)),13) 
           || LPAD(CAST(id AS VARCHAR(10)),10)
           || NAME) AS x
   FROM items
   GROUP BY grp
 ) AS dt;
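
A minimal sketch of the same piggybacking idea on SQLite, run through Python's sqlite3 module. SQLite has no LPAD, so printf() with a width specifier is swapped in for the padding, and the GROUP BY repeats the COALESCE expression instead of using the alias. The sample rows are hypothetical.

```python
import sqlite3

# Hypothetical sample data: three ungrouped rows and two groups of two rows.
ROWS = [
    (1, "Item A", None, 10.0),
    (2, "Item B", None, 20.0),
    (3, "Item C", None, 30.0),
    (4, "Item D", 1, 40.0),
    (5, "Item E", 2, 50.0),
    (6, "Item F", 1, 60.0),
    (7, "Item G", 2, 70.0),
]

PIGGYBACK_SQL = """
    SELECT CASE WHEN grp > 0 THEN grp ELSE NULL END AS group_id,
           CAST(TRIM(substr(x, 1, 13)) AS REAL)     AS price,
           substr(x, 24)                            AS name
    FROM (
        SELECT COALESCE(group_id, -id) AS grp,
               -- price right-aligned in 13 chars, id in 10 chars, then name
               MIN(printf('%13.2f', price) || printf('%10d', id) || name) AS x
        FROM items
        GROUP BY COALESCE(group_id, -id)
    ) AS dt
    ORDER BY price
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items ("
    "id INTEGER PRIMARY KEY, name TEXT, group_id INTEGER, price REAL)"
)
conn.executemany("INSERT INTO items VALUES (?, ?, ?, ?)", ROWS)

piggyback_rows = conn.execute(PIGGYBACK_SQL).fetchall()
for row in piggyback_rows:
    print(row)
```

The string comparison only matches the numeric order because every price is non-negative and right-aligned in the same fixed-width format. Negative prices, or values wider than the padding, would break the ordering.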

You can do this using where conditions:

SQLFiddle Demo

select t.*
from t
where t.group_id is null or
      t.price = (select min(t2.price)
                 from t t2
                 where t2.group_id = t.group_id
                );

Note that this returns all rows with the minimum price, if there is more than one for a given group.

EDIT:

I believe the following fixes the problem of multiple rows:

select t.*
from t
where t.group_id is null or
      t.id = (select t2.id
              from t t2
              where t2.group_id = t.group_id
              order by t2.price asc
              limit 1
             );
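
Since the query could not be tested at the time, here is a rough check of both variants on SQLite via Python's sqlite3 module. It is a sketch with hypothetical data, not a verification on the original database: group 1 deliberately contains two rows tied at the minimum price.

```python
import sqlite3

# Hypothetical data: ids 4 and 6 tie for the lowest price in group 1.
ROWS = [
    (1, "Item A", None, 10.0),
    (2, "Item B", None, 20.0),
    (3, "Item C", None, 30.0),
    (4, "Item D", 1, 40.0),
    (5, "Item E", 2, 50.0),
    (6, "Item F", 1, 40.0),
    (7, "Item G", 2, 70.0),
]

# Variant 1: compare against the group's minimum price (keeps ties).
MIN_SQL = """
    SELECT t.id FROM t
    WHERE t.group_id IS NULL OR
          t.price = (SELECT MIN(t2.price) FROM t t2
                     WHERE t2.group_id = t.group_id)
    ORDER BY t.id
"""

# Variant 2: pick a single id per group with ORDER BY ... LIMIT 1.
LIMIT_SQL = """
    SELECT t.id FROM t
    WHERE t.group_id IS NULL OR
          t.id = (SELECT t2.id FROM t t2
                  WHERE t2.group_id = t.group_id
                  ORDER BY t2.price ASC
                  LIMIT 1)
    ORDER BY t.id
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT, group_id INTEGER, price REAL)"
)
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", ROWS)

ids_min = [r[0] for r in conn.execute(MIN_SQL)]
ids_limit = [r[0] for r in conn.execute(LIMIT_SQL)]
print("min variant:  ", ids_min)
print("limit variant:", ids_limit)
```

With a tie, the LIMIT 1 variant keeps one row per group, but which of the tied rows it keeps is not defined by the query. Adding t2.id as a second ORDER BY key would make the choice deterministic.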

Unfortunately, SQL Fiddle is not working for me right now, so I cannot test it.
