Combining 3 SELECT statements to output 1 table

戏子无情 提交于 2021-02-05 08:51:38

问题


I have three queries with results.

Query 1:

SELECT DISTINCT employeeid, work.clientid, ROUND ((CAST (AVG(current_lawn_price) AS numeric) / CAST (AVG((((EXTRACT(HOUR FROM job_finish)*60) + EXTRACT(MIN FROM job_finish))) - ((EXTRACT(HOUR FROM job_start)*60) + EXTRACT(MIN FROM job_start))) AS numeric)) / 1, 2) AS under_over_1
FROM work
JOIN timesheet USING (date_linkid)
JOIN client USING (clientid)
WHERE employeeid = 1 AND workid < 557 AND workid > 188
GROUP BY employeeid, clientid ORDER BY clientid ASC;

employeeid | clientid | under_over_1
------------+----------+--------------
          1 |        3 |         0.54
          1 |        4 |         0.47
          1 |        6 |         0.45
          1 |        7 |         0.59
          . |        . |           .
          . |        . |           .

Query 2:

SELECT DISTINCT employeeid, work.clientid, ROUND ((CAST (AVG(current_lawn_price) AS numeric) / CAST (AVG((((EXTRACT(HOUR FROM job_finish)*60) + EXTRACT(MIN FROM job_finish))) - ((EXTRACT(HOUR FROM job_start)*60) + EXTRACT(MIN FROM job_start))) AS numeric)) / 1.31666666666667, 2) AS under_over_1
FROM work
JOIN timesheet USING (date_linkid)
JOIN client USING (clientid)
WHERE employeeid = 2
GROUP BY employeeid, clientid ORDER BY clientid ASC;

 employeeid | clientid | under_over_1
------------+----------+--------------
          2 |        2 |         1.01
          2 |        3 |         0.21
          2 |        4 |         0.71
          2 |        6 |         0.68
          . |        . |           .
          . |        . |           .

Query: 3

SELECT DISTINCT employeeid, work.clientid, ROUND ((CAST (AVG(current_lawn_price) AS numeric) / CAST (AVG((((EXTRACT(HOUR FROM job_finish)*60) + EXTRACT(MIN FROM job_finish))) - ((EXTRACT(HOUR FROM job_start)*60) + EXTRACT(MIN FROM job_start))) AS numeric)) / 1.31666666666667, 2) AS under_over_1
FROM work
JOIN timesheet USING (date_linkid)
JOIN client USING (clientid)
WHERE employeeid = 3
GROUP BY employeeid, clientid ORDER BY clientid ASC;

 employeeid | clientid | under_over_1
------------+----------+--------------
          3 |        4 |         0.70
          3 |        6 |         0.54
          3 |        7 |         1.03
          3 |       11 |         0.74
          . |        . |           .
          . |        . |           .

I want to output one table with the result of all three queries such as (I am sorry but I have to write more here so I can submit this post. I hope this is enough;-)):

employeeid | clientid | under_over_1
------------+----------+--------------
          1 |        3 |         0.54
          1 |        4 |         0.47
          1 |        6 |         0.45
          1 |        7 |         0.59
          . |        . |           .
          . |        . |           .
          2 |        2 |         1.01
          2 |        3 |         0.21
          2 |        4 |         0.71
          2 |        6 |         0.68
          . |        . |           .
          . |        . |           .
          3 |        4 |         0.70
          3 |        6 |         0.54
          3 |        7 |         1.03
          3 |       11 |         0.74
          . |        . |           .
          . |        . |           .

I tried UNION ALL as in

SELECT DISTINCT employeeid, work.clientid, ROUND ((CAST (AVG(current_lawn_price) AS numeric) / CAST (AVG((((EXTRACT(HOUR FROM job_finish)*60) + EXTRACT(MIN FROM job_finish))) - ((EXTRACT(HOUR FROM job_start)*60) + EXTRACT(MIN FROM job_start))) AS numeric)) / 1, 2) AS under_over_1
FROM work
JOIN timesheet USING (date_linkid)
JOIN client USING (clientid)
WHERE employeeid = 1 AND workid < 557 AND workid > 188
GROUP BY employeeid, clientid ORDER BY clientid ASC

UNION ALL

SELECT DISTINCT employeeid, work.clientid, ROUND ((CAST (AVG(current_lawn_price) AS numeric) / CAST (AVG((((EXTRACT(HOUR FROM job_finish)*60) + EXTRACT(MIN FROM job_finish))) - ((EXTRACT(HOUR FROM job_start)*60) + EXTRACT(MIN FROM job_start))) AS numeric)) / 1.31666666666667, 2) AS under_over_1
FROM work
JOIN timesheet USING (date_linkid)
JOIN client USING (clientid)
WHERE employeeid = 3
GROUP BY employeeid, clientid ORDER BY clientid ASC

UNION ALL

SELECT DISTINCT employeeid, work.clientid, ROUND ((CAST (AVG(current_lawn_price) AS numeric) / CAST (AVG((((EXTRACT(HOUR FROM job_finish)*60) + EXTRACT(MIN FROM job_finish))) - ((EXTRACT(HOUR FROM job_start)*60) + EXTRACT(MIN FROM job_start))) AS numeric)) / 1.31666666666667, 2) AS under_over_1
FROM work
JOIN timesheet USING (date_linkid)
JOIN client USING (clientid)
WHERE employeeid = 2
GROUP BY employeeid, clientid ORDER BY clientid ASC;

However, I get the following error:

ERROR:  syntax error at or near "UNION"
LINE 7: UNION ALL

I am not sure why this is wrong or whether UNION ALL is the right approach here. Anyone have a clue?


回答1:


The immediate cause for the error is, quoting the manual:

(ORDER BY and LIMIT can be attached to a subexpression if it is enclosed in parentheses. Without parentheses, these clauses will be taken to apply to the result of the UNION, not to its right-hand input expression.)

Bold emphasis mine.

So, as suggested by @wilx, enclosing each SELECT in parentheses would fix that.

But there is more.

Merge into single query

SELECT employeeid, work.clientid    -- no DISTINCT
     , ROUND ((AVG(current_lawn_price)
             / AVG(extract(epoch FROM job_finish - job_start)))::numeric / 79, 2) AS under_over_1
FROM   work
JOIN   timesheet USING (date_linkid)
JOIN   client    USING (clientid)
WHERE (employeeid IN (1, 2) OR
       employeeid = 3
   AND workid < 557
   AND workid > 188)
GROUP  BY employeeid, clientid 
ORDER  BY CASE employeeid
            WHEN 1 THEN 1
            WHEN 2 THEN 3
            WHEN 3 THEN 2
          END
        , clientid;

Merge the 3 SELECT queries.

Remove the redundant DISTINCT. That's an expensive no-op after GROUP BY.

Instead of extracting hours and minutes from both begin and end timestamps etc., compute the interval by plain subtraction (works with timestamp or time values alike) and extract the epoch from it. Gives you the number of seconds. Divide by 60 and you got the number of minutes much faster. 79 being the result of 60 * 1.31666666666667, accordingly.

The manual about extracting epoch:

For timestamp with time zone values, the number of seconds since 1970-01-01 00:00:00 UTC (can be negative); for date and timestamp values, the number of seconds since 1970-01-01 00:00:00 local time; for interval values, the total number of seconds in the interval

Bold emphasis mine.

Since this removes UNION ALL, the parentheses mentioned at the top are not needed any more.

The CASE expression makes up for the mixed order in employeeid, like sticky bit provided.

If queries cannot be merged

If, for some reason, you can't or won't merge the three original SELECT queries, do this instead:

(  -- parentheses required
SELECT employeeid, work.clientid    -- no DISTINCT !
     , ROUND ((AVG(current_lawn_price)
             / AVG(extract(epoch FROM job_finish - job_start)))::numeric / 79, 2) AS under_over_1
FROM   work
JOIN   timesheet USING (date_linkid)
JOIN   client    USING (clientid)
WHERE  employeeid = 1
AND    workid < 557 
AND    workid > 188
GROUP  BY clientid  -- no need to GROUP BY employeeid while filtering single value
ORDER  BY clientid
)

UNION ALL
(
SELECT employeeid, work.clientid
     , ROUND ((AVG(current_lawn_price)
             / AVG(extract(epoch FROM job_finish - job_start)))::numeric / 79, 2) AS under_over_1
FROM   work
JOIN   timesheet USING (date_linkid)
JOIN   client    USING (clientid)
WHERE  employeeid = 3
GROUP  BY clientid
ORDER  BY clientid
)

UNION ALL
(
SELECT employeeid, work.clientid
     , ROUND ((AVG(current_lawn_price)
             / AVG(extract(epoch FROM job_finish - job_start)))::numeric / 79, 2) AS under_over_1
FROM   work
JOIN   timesheet USING (date_linkid)
JOIN   client    USING (clientid)
WHERE  employeeid = 2
GROUP  BY clientid
ORDER  BY clientid
);
-- no outer ORDER BY required

Keep ORDER BY per SELECT and add parentheses to fix the syntax. UNION ALL (as opposed to UNION) simply appends results preserving the order of individual SELECTs. This should be cheaper than ordering the whole set after UNION ALL. And you did want to keep using queries individual SELECT queries "as is" ...

Most of the other advice above applies accordingly.

Aside: make it a habit to use table aliases and table-qualify all columns in queries joining multiple tables. Much more robust against later changes and easier to read / debug.




回答2:


Remove all the ORDER BYs. Then use the resulting query as a derived table. To retain the order you presumably want, you can then use a CASE expression to map the employeeid to and integer determining the order:

SELECT x.employeeid,
       x.clientid,
       x.under_over_1
       FROM (<your UNION ALL query without the ORDER BYs>) x
       ORDER BY CASE x.employeeid
                  WHEN 1 THEN
                    1
                  WHEN 2 THEN
                    3
                  WHEN 3 THEN
                    2
                END,
                x.clientid;


来源:https://stackoverflow.com/questions/59348755/combining-3-select-statements-to-output-1-table

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!