Question
I have the following tables; format: table_name[column1, column2, etc..]
VENDOR_ORDERS [ORDER_ID, ORDER_CREATION_DATETIME, REGION_ID, ZIP_CODE, AMOUNT]
CALENDAR [CALENDAR_WEEK, CALENDAR_DATE]
Basically, I'm trying to write a query that gives me the COUNT(ORDER_ID) and SUM(AMOUNT) per CALENDAR_WEEK for every REGION_ID and DISTINCT(ZIP_CODE).
The results should look something like this:
ZIP_CODE  CALENDAR_WEEK  REGION_ID  COUNT(ORDER_ID)  SUM(AMOUNT)
--------  -------------  ---------  ---------------  -----------
XXXXX     01             1          50               987.45
YYYYY     01             1          25               568.32
ZZZZZ     01             1          30               555.63
MMMMM     01             1          10               099.93
XXXXX     15             1          05               999.34
YYYYY     15             1          32               339.67
ZZZZZ     15             1          21               457.23
MMMMM     15             1          88               459.99
I used the following code:
SELECT
DISTINCT(vo.ZIP_CODE)
,TO_CHAR(ca.CALENDAR_WEEK)
,TRUNC(vo.ORDER_CREATION_DATETIME) -- this column is not needed, i just added it for visualization purposes
,vo.REGION_ID
,COUNT(vo.ORDER_ID)
,SUM(vo.AMOUNT)
FROM
VENDOR_ORDERS vo
,CALENDAR ca
WHERE
TRUNC(vo.ORDER_CREATION_DATETIME) = ca.CALENDAR_DATE
AND vo.REGION_ID = 1
GROUP BY
vo.ZIP_CODE
,TO_CHAR(ca.CALENDAR_WEEK)
,vo.ORDER_CREATION_DATETIME
,vo.REGION_ID;
The problem is that I'm not getting DISTINCT(ZIP_CODE) per CALENDAR_WEEK: I get repeated ZIP_CODE values for the same CALENDAR_WEEK and the same REGION_ID, but with different COUNT(ORDER_ID) and SUM(AMOUNT).
I hope I made myself clear. Thanks in advance for the help.
Answer 1:
You misunderstand what DISTINCT is. It is not a function. It is a modifier on SELECT, and it affects all columns being selected. So, it is behaving exactly as it should.
If you want aggregations by zip code and week, then those are the only two columns that should be in the GROUP BY:
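A minimal sketch of this behavior, using Python's built-in sqlite3 module as a stand-in for Oracle (an assumption made only so the example can be run anywhere). The table is a cut-down version of VENDOR_ORDERS and the sample rows are invented for illustration:

```python
import sqlite3

# In-memory SQLite database standing in for Oracle (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vendor_orders (order_id INTEGER, zip_code TEXT, amount REAL)"
)
# Hypothetical sample rows: three orders for XXXXX, one for YYYYY.
conn.executemany(
    "INSERT INTO vendor_orders VALUES (?, ?, ?)",
    [(1, "XXXXX", 10.0), (2, "XXXXX", 20.0), (3, "XXXXX", 10.0), (4, "YYYYY", 5.0)],
)

# DISTINCT(zip_code) is parsed as DISTINCT applied to the whole row
# (zip_code, amount); the parentheses only wrap the first column expression.
distinct_rows = conn.execute(
    "SELECT DISTINCT(zip_code), amount FROM vendor_orders ORDER BY zip_code, amount"
).fetchall()

# Only the exact duplicate row ('XXXXX', 10.0) collapses;
# 'XXXXX' still appears twice because the amounts differ.
print(distinct_rows)
```

The same zip code survives once per distinct combination of the selected columns, which is the pattern described in the question.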
SELECT vo.ZIP_CODE,
       TO_CHAR(ca.CALENDAR_WEEK),
       -- vo.REGION_ID,
       COUNT(vo.ORDER_ID),
       SUM(vo.AMOUNT)
FROM VENDOR_ORDERS vo
JOIN CALENDAR ca
  ON TRUNC(vo.ORDER_CREATION_DATETIME) = ca.CALENDAR_DATE
WHERE vo.REGION_ID = 1
GROUP BY vo.ZIP_CODE, TO_CHAR(ca.CALENDAR_WEEK)
You could probably include REGION_ID as well (in both the SELECT and the GROUP BY), assuming that each zip code is in one region.
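The grouping above can be tried out with a small runnable sketch. It again uses Python's sqlite3 module in place of Oracle, so SQLite's date() stands in for TRUNC and TO_CHAR is omitted; the table and column names come from the question, while the dates, week numbers, and amounts are invented sample data:

```python
import sqlite3

# In-memory SQLite database standing in for Oracle (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vendor_orders (
    order_id INTEGER, order_creation_datetime TEXT,
    region_id INTEGER, zip_code TEXT, amount REAL
);
CREATE TABLE calendar (calendar_week INTEGER, calendar_date TEXT);
""")
# Hypothetical calendar: two days in week 1, one day in week 2.
conn.executemany(
    "INSERT INTO calendar VALUES (?, ?)",
    [(1, "2016-01-04"), (1, "2016-01-05"), (2, "2016-01-11")],
)
# Hypothetical orders; the last one is in region 2 and gets filtered out.
conn.executemany(
    "INSERT INTO vendor_orders VALUES (?, ?, ?, ?, ?)",
    [
        (1, "2016-01-04 09:00:00", 1, "XXXXX", 10.0),
        (2, "2016-01-04 17:30:00", 1, "XXXXX", 20.0),
        (3, "2016-01-05 08:15:00", 1, "XXXXX", 5.0),
        (4, "2016-01-11 12:00:00", 1, "XXXXX", 7.5),
        (5, "2016-01-05 10:00:00", 1, "YYYYY", 2.5),
        (6, "2016-01-05 10:00:00", 2, "ZZZZZ", 99.0),
    ],
)

# Group only by zip code and week; date() plays the role of Oracle's TRUNC.
weekly = conn.execute("""
    SELECT vo.zip_code, ca.calendar_week, COUNT(vo.order_id), SUM(vo.amount)
    FROM vendor_orders vo
    JOIN calendar ca
      ON date(vo.order_creation_datetime) = ca.calendar_date
    WHERE vo.region_id = 1
    GROUP BY vo.zip_code, ca.calendar_week
    ORDER BY vo.zip_code, ca.calendar_week
""").fetchall()

for row in weekly:
    print(row)
```

Each zip code appears once per calendar week, with the orders from different days and times of that week rolled into a single count and sum.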
Answer 2:
Your DISTINCT has no purpose in this query: it is applied to all columns, not to ZIP_CODE only as you think. Think about it: if several rows share a ZIP_CODE but have different values in the other columns, how would Oracle know which one to return?
Additionally, it is redundant to specify DISTINCT here because you are doing a GROUP BY, which already returns one row per group.
And last but not least, you're wrong when you say this in your comment:
-- this column is not needed, i just added it for visualization
The column matters: it is part of your GROUP BY, so it determines how the rows are grouped, and you need it in your SELECT to see why the groups differ.
Without seeing a data sample I can't be 100% sure, but your issue is probably that you apply TRUNC to the datetime column in your SELECT but not in your GROUP BY clause. Because the SELECT shows a truncated date, you assume the GROUP BY also worked on the date, but that is not the case: it grouped on the full date and time.
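The effect can be reproduced with a small sketch, assuming Python's sqlite3 module as a stand-in for Oracle and SQLite's date() in place of TRUNC. The two sample orders are invented: same zip code, same day, different times.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vendor_orders ("
    "order_id INTEGER, order_creation_datetime TEXT, zip_code TEXT, amount REAL)"
)
# Hypothetical sample rows: two orders on the same day at different times.
conn.executemany(
    "INSERT INTO vendor_orders VALUES (?, ?, ?, ?)",
    [
        (1, "2016-01-04 09:00:00", "XXXXX", 10.0),
        (2, "2016-01-04 17:30:00", "XXXXX", 20.0),
    ],
)

# Truncated date in the SELECT, raw datetime in the GROUP BY:
# the rows look identical on the date but are grouped per timestamp.
by_datetime = conn.execute("""
    SELECT zip_code, date(order_creation_datetime), COUNT(order_id), SUM(amount)
    FROM vendor_orders
    GROUP BY zip_code, order_creation_datetime
    ORDER BY order_creation_datetime
""").fetchall()

# Truncated date in both places: one row per zip code and day.
by_date = conn.execute("""
    SELECT zip_code, date(order_creation_datetime), COUNT(order_id), SUM(amount)
    FROM vendor_orders
    GROUP BY zip_code, date(order_creation_datetime)
""").fetchall()

print(by_datetime)  # two rows showing the same zip code and date
print(by_date)      # a single combined row
```

Note that grouping on the truncated date still yields one row per day; a weekly total needs the date left out of the GROUP BY altogether.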
To understand your issue, do:
SELECT
DISTINCT(vo.ZIP_CODE)
,TO_CHAR(ca.CALENDAR_WEEK)
,vo.ORDER_CREATION_DATETIME
,vo.REGION_ID
,COUNT(vo.ORDER_ID)
,SUM(vo.AMOUNT)
FROM
VENDOR_ORDERS vo
,CALENDAR ca
WHERE
TRUNC(vo.ORDER_CREATION_DATETIME) = ca.CALENDAR_DATE
AND vo.REGION_ID = 1
GROUP BY
vo.ZIP_CODE
,TO_CHAR(ca.CALENDAR_WEEK)
,vo.ORDER_CREATION_DATETIME
,vo.REGION_ID;
To fix your issue, do:
SELECT
DISTINCT(vo.ZIP_CODE)
,TO_CHAR(ca.CALENDAR_WEEK)
,TRUNC(vo.ORDER_CREATION_DATETIME)
,vo.REGION_ID
,COUNT(vo.ORDER_ID)
,SUM(vo.AMOUNT)
FROM
VENDOR_ORDERS vo
,CALENDAR ca
WHERE
TRUNC(vo.ORDER_CREATION_DATETIME) = ca.CALENDAR_DATE
AND vo.REGION_ID = 1
GROUP BY
vo.ZIP_CODE
,TO_CHAR(ca.CALENDAR_WEEK)
,TRUNC(vo.ORDER_CREATION_DATETIME)
,vo.REGION_ID;
Source: https://stackoverflow.com/questions/35868745/oracle-sql-select-distinct-not-removing-duplicates