Oracle SQL Group By if

Posted by 五迷三道 on 2019-12-24 16:58:56

Question


In my application I log each file open in the following table:

TESTID        SITE       LATEST_READ READ_COUNT FILE_ORIGIN_ID
------------- ---------- ----------- ---------- --------------
File1        |Site1     |02/05/13   |         2|             1 
File1        |Site2     |22/01/14   |         3|             2 
File2        |Site1     |02/06/14   |         8|             0 
File3        |Site1     |19/09/14   |        17|             0 
File4        |Site2     |19/09/14   |        14|             2 
File4        |Site2     |19/09/14   |        34|             1  
File4        |Site3     |19/09/14   |        10|             0 
File5        |Site2     |19/09/14   |        44|             2  
File5        |Site3     |19/09/14   |         1|             2 

I want the sum of READ_COUNT per file, but only if at least one of the file's FILE_ORIGIN_ID values is different from 2. Files whose rows all have FILE_ORIGIN_ID = 2 should keep their rows as they are.

This example should give:

TESTID        SITE       LATEST_READ SUM        FILE_ORIGIN_ID
------------- ---------- ----------- ---------- --------------
File1        |Site1     |02/05/13   |         5|             1 
File2        |Site1     |02/06/14   |         8|             0 
File3        |Site1     |19/09/14   |        17|             0 
File4        |Site2     |19/09/14   |        58|             X <-- can be 0 or 1 
File5        |Site2     |19/09/14   |        44|             2  
File5        |Site3     |19/09/14   |         1|             2 

I've tried with the following:

SELECT TESTID, SUM(READ_COUNT), LATEST_READ, FILE_ORIGIN_ID, site
FROM FILE_USAGE_LOG 
GROUP BY TESTID, TESTID, LATEST_READ, 
          CASE 
            WHEN FILE_ORIGIN_ID <> '2' Then 1
            ELSE 0
          END, site
ORDER BY TESTID;

But it does not do what I want. How can I improve this? And for grouped rows, how can I set FILE_ORIGIN_ID to 0 or 1?


Answer 1:


SQL Fiddle

Oracle 11g R2 Schema Setup:

CREATE TABLE FILE_USAGE_LOG (TESTID, SITE, LATEST_READ, READ_COUNT, FILE_ORIGIN_ID ) AS
          SELECT 'File1', 'Site1', DATE '2013-05-02', 2, 1 FROM DUAL
UNION ALL SELECT 'File1', 'Site2', DATE '2014-01-22', 3, 2 FROM DUAL
UNION ALL SELECT 'File2', 'Site1', DATE '2014-06-02', 8, 0 FROM DUAL
UNION ALL SELECT 'File3', 'Site1', DATE '2014-09-19', 17, 0 FROM DUAL
UNION ALL SELECT 'File4', 'Site2', DATE '2014-09-19', 14, 2 FROM DUAL
UNION ALL SELECT 'File4', 'Site2', DATE '2014-09-19', 34, 1 FROM DUAL
UNION ALL SELECT 'File4', 'Site3', DATE '2014-09-19', 10, 0 FROM DUAL
UNION ALL SELECT 'File5', 'Site2', DATE '2014-09-19', 44, 2 FROM DUAL
UNION ALL SELECT 'File5', 'Site3', DATE '2014-09-19', 1, 2 FROM DUAL;

Query 1:

SELECT  TESTID,
        REGEXP_REPLACE( 
          LISTAGG( SITE, ', ' )
            WITHIN GROUP( ORDER BY SITE ),
          '([^, ]+)(, \1)+($|, )',
          '\1\3'
        ) AS SITES, 
        MAX( LATEST_READ ) AS LATEST_READ,
        SUM(READ_COUNT) AS Total_Read_Count
FROM    FILE_USAGE_LOG 
GROUP BY
        TESTID
HAVING  COUNT( CASE FILE_ORIGIN_ID WHEN 2 THEN NULL ELSE 1 END ) > 0
UNION ALL
SELECT  TESTID,
        SITE,
        LATEST_READ,
        READ_COUNT
FROM    FILE_USAGE_LOG l
WHERE   FILE_ORIGIN_ID = 2
AND     NOT EXISTS ( SELECT 'X'
                     FROM   FILE_USAGE_LOG x
                     WHERE  x.TESTID      = l.TESTID
                     AND    x.FILE_ORIGIN_ID <> 2
                   )
ORDER BY 1,2

Results:

| TESTID |        SITES |                 LATEST_READ | TOTAL_READ_COUNT |
|--------|--------------|-----------------------------|------------------|
|  File1 | Site1, Site2 |   January, 22 2014 00:00:00 |                5 |
|  File2 |        Site1 |      June, 02 2014 00:00:00 |                8 |
|  File3 |        Site1 | September, 19 2014 00:00:00 |               17 |
|  File4 | Site2, Site3 | September, 19 2014 00:00:00 |               58 |
|  File5 |        Site2 | September, 19 2014 00:00:00 |               44 |
|  File5 |        Site3 | September, 19 2014 00:00:00 |                1 |
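
The HAVING clause in Query 1 decides whether a file is grouped. As a small sketch against the same FILE_USAGE_LOG table from the schema setup (the alias NON_2_ROWS is made up for illustration and is not part of the original answer), this query shows the value that HAVING tests for each file:

```sql
-- Count, per file, the rows whose FILE_ORIGIN_ID is not 2.
-- The CASE yields NULL for origin 2, and COUNT ignores NULLs.
SELECT  TESTID,
        COUNT( CASE FILE_ORIGIN_ID WHEN 2 THEN NULL ELSE 1 END ) AS NON_2_ROWS
FROM    FILE_USAGE_LOG
GROUP BY
        TESTID
ORDER BY
        TESTID;
```

With the sample data, File1, File2 and File3 should come back with 1, File4 with 2 and File5 with 0. Only File5 fails the "> 0" test, so its rows are left to the second branch of the UNION ALL and are returned ungrouped.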

Query 2:

SELECT  TESTID,
        REGEXP_REPLACE( 
          LISTAGG( SITE, ', ' )
            WITHIN GROUP( ORDER BY SITE ),
          '([^, ]+)(, \1)+($|, )',
          '\1\3'
        ) AS SITES, 
        MAX( LATEST_READ ) AS LATEST_READ,
        SUM(READ_COUNT) AS Total_Read_Count
FROM    FILE_USAGE_LOG 
WHERE   TESTID NOT LIKE 'this%'
AND     LATEST_READ BETWEEN DATE '2014-01-01' AND DATE '2014-12-31'
GROUP BY
        TESTID
HAVING  COUNT( CASE FILE_ORIGIN_ID WHEN 2 THEN NULL ELSE 1 END ) > 0
UNION ALL
SELECT  TESTID,
        SITE,
        LATEST_READ,
        READ_COUNT
FROM    FILE_USAGE_LOG l
WHERE   FILE_ORIGIN_ID = 2
AND     NOT EXISTS ( SELECT 'X'
                     FROM   FILE_USAGE_LOG x
                     WHERE  x.TESTID      = l.TESTID
                     AND    x.FILE_ORIGIN_ID <> 2
                     AND    x.TESTID NOT LIKE 'this%'
                     AND    x.LATEST_READ BETWEEN DATE '2014-01-01' AND DATE '2014-12-31'
                   )
AND     l.TESTID NOT LIKE 'this%'
AND     l.LATEST_READ BETWEEN DATE '2014-01-01' AND DATE '2014-12-31'
ORDER BY 1,2

Results:

| TESTID |        SITES |                 LATEST_READ | TOTAL_READ_COUNT |
|--------|--------------|-----------------------------|------------------|
|  File1 |        Site2 |   January, 22 2014 00:00:00 |                3 |
|  File2 |        Site1 |      June, 02 2014 00:00:00 |                8 |
|  File3 |        Site1 | September, 19 2014 00:00:00 |               17 |
|  File4 | Site2, Site3 | September, 19 2014 00:00:00 |               58 |
|  File5 |        Site2 | September, 19 2014 00:00:00 |               44 |
|  File5 |        Site3 | September, 19 2014 00:00:00 |                1 |
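
As an alternative sketch, not part of the original answer, the same grouping rule can be written with the table referenced only once by using analytic functions. It assumes the FILE_USAGE_LOG table from the schema setup above. The names NON_2_ROWS and RN are illustrative. Unlike Query 1, it does not strip duplicate site names from the LISTAGG output, so File4 would list Site2 twice.

```sql
SELECT  TESTID,
        LISTAGG( SITE, ', ' ) WITHIN GROUP ( ORDER BY SITE ) AS SITES,
        MAX( LATEST_READ ) AS LATEST_READ,
        SUM( READ_COUNT )  AS TOTAL_READ_COUNT
FROM    (
          SELECT l.*,
                 -- rows of this file whose FILE_ORIGIN_ID is not 2
                 COUNT( CASE WHEN FILE_ORIGIN_ID <> 2 THEN 1 END )
                   OVER ( PARTITION BY TESTID ) AS NON_2_ROWS,
                 -- distinct number per row within a file (starts at 1)
                 ROW_NUMBER()
                   OVER ( PARTITION BY TESTID ORDER BY SITE ) AS RN
          FROM   FILE_USAGE_LOG l
        )
GROUP BY
        TESTID,
        -- one shared key (0) collapses a file into a single row;
        -- using RN as the key keeps every row of an all-2 file separate
        CASE WHEN NON_2_ROWS > 0 THEN 0 ELSE RN END
ORDER BY
        TESTID, SITES;
```

The CASE expression in the GROUP BY is the conditional part: it plays the role that the UNION ALL and NOT EXISTS play in Query 1 and Query 2.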



Answer 2:


Here is a partial solution that returns TESTID and READ_COUNT in the format you need:

-- Files with at least one FILE_ORIGIN_ID other than 2: one summed row per file
SELECT testid, SUM(read_count) AS read_count
FROM   FILE_USAGE_LOG
WHERE  testid IN (SELECT testid FROM FILE_USAGE_LOG
                  WHERE  file_origin_id <> 2)
GROUP BY testid
-- UNION ALL rather than UNION, so identical ungrouped rows are not merged
UNION ALL
-- Files whose rows all have FILE_ORIGIN_ID = 2: rows returned unchanged
SELECT testid, read_count
FROM   FILE_USAGE_LOG
WHERE  testid NOT IN (SELECT testid FROM FILE_USAGE_LOG
                      WHERE  file_origin_id <> 2)
ORDER BY testid

This is not exactly the result you wanted, because grouping on the other fields would give you a different result. If you want any data other than the testid (which we are grouping on), you need to wrap those columns in an aggregate function.

EDIT: Added the other columns, choosing MIN or MAX for each as I saw fit.

-- Grouped files: every column other than testid goes through an aggregate
SELECT testid, MIN(site) AS site, SUM(read_count) AS read_count,
       MAX(latest_read) AS latest_read, MIN(file_origin_id) AS file_origin_id,
       'true' AS grouped
FROM   FILE_USAGE_LOG
WHERE  testid IN (SELECT testid FROM FILE_USAGE_LOG
                  WHERE  file_origin_id <> 2)
GROUP BY testid
-- UNION ALL rather than UNION, so identical ungrouped rows are not merged
UNION ALL
-- Ungrouped files (all rows have FILE_ORIGIN_ID = 2): rows returned unchanged
SELECT testid, site, read_count, latest_read, file_origin_id, 'false' AS grouped
FROM   FILE_USAGE_LOG
WHERE  testid NOT IN (SELECT testid FROM FILE_USAGE_LOG
                      WHERE  file_origin_id <> 2)
ORDER BY testid




Source: https://stackoverflow.com/questions/30709480/oracle-sql-group-by-if
