Why is count(*) returning multiple results in this subquery (postgresql)?

放肆的年华 提交于 2019-12-13 18:01:23

问题


I want to group the results of a count(*) query into value buckets. I'm testing this on the dellstore2 postgresql sample database. My query below returns the right answers, but does so once for each row in the table (several thousand identical results). I can fix this by adding LIMIT 1 the the end of the query but I'd like to understand why I'm getting the duplicates in case it points to a wider problem with my approach. The query is:

SELECT
    (SELECT count(*)
        FROM
            orders
        WHERE
            totalamount > 0 AND totalamount <= 100) AS ">0 <= 100",
    (SELECT count(*)
        FROM
            orders
        WHERE
            totalamount > 100 AND totalamount <= 200) AS ">100 <= 200"
...
FROM
    orders;

EDIT Andomar's answer also allowed me to find the following approach (adapted from an example in SQL in a nutshell (O'Reilly)). This lets me have the buckets in one column, with a row for each bucket/answer pairing. I thought I'd include it for anyone with that use-case:

SELECT CASE
        WHEN totalamount IS NULL THEN 'Unknown'
        WHEN totalamount <= 100 THEN 'Not more than 100'
        WHEN totalamount <= 200 THEN 'Not more than 200'
        ELSE 'Over 200'
    END "Bucket",
    COUNT(*) "Number of results"
FROM
    orders
GROUP BY CASE
        WHEN totalamount IS NULL THEN 'Unknown'
        WHEN totalamount <= 100 THEN 'Not more than 100'
        WHEN totalamount <= 200 THEN 'Not more than 200'
        ELSE 'Over 200'
    END
ORDER BY
    MIN(totalamount);

回答1:


You're selecting every row from orders, and then for each row, the subqueries are evaluated.

Consider this approach instead:

select  count(case when 0 < totalamount and totalamount <= 100 then 1 end)
            as "<0,100]"
,       count(case when 100 < totalamount and totalamount <= 200 then 1 end)
            as "<100,200]"
from    Orders

This would calculate both aggregates in a single table scan.




回答2:


Because you included them as a subquery. From the looks of it, both the count subqueries are executed for each row in the orders table. If you omit the from piece from the main query you should only get one row.

For example, the query

SELECT 'John Doe' FROM orders

returns 'John Doe' for each row in the orders table.

SELECT 'John Doe'

only selects one row.

Disclaimer: this is sql server behaviour.



来源:https://stackoverflow.com/questions/9786442/why-is-count-returning-multiple-results-in-this-subquery-postgresql

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!