I'm attempting to build a query that will return all non duplicate (unique) records in a table. The query will need to use multiple fields to determine if the records are duplicate.
For example, if a table has the following fields; PKID, ClientID, Name, AcctNo, OrderDate, Charge, I'd like to use the AcctNo, OrderDate and Charge fields to find unique records.
Table
PKID-----ClientID-----Name-----AcctNo-----OrderDate-----Charge
1 JX100 John 12345 9/9/2010 $100.00
2 JX220 Mark 55567 9/9/2010 $23.00
3 JX690 Matt 89899 9/9/2010 $218.00
4 JX100 John 12345 9/9/2010 $100.00
The result of the query would need to be:
PKID-----ClientID-----Name-----AcctNo-----OrderDate-----Charge
2 JX220 Mark 55567 9/9/2010 $23.00
3 JX690 Matt 89899 9/9/2010 $218.00
I've tried using SELECT DISTINCT, but that doesn't work because it keeps one of the duplicate records in the result. I've also tried using HAVING COUNT = 1, but that returns all records.
Thanks for the help.
HAVING COUNT(*) = 1 will work if you only include the fields in the GROUP BY that you're using to find the unique records. (i.e. not PKID, but you can use MAX or MIN to return that since you'll only have one record per group in the results set.)
SELECT MAX(PKID) AS PKID ,
MAX(ClientID) AS ClientID,
MAX(Name) AS Name ,
AcctNo ,
OrderDate ,
Charge
FROM T
GROUP BY AcctNo ,
OrderDate,
Charge
HAVING COUNT(*) = 1
or
SELECT PKID ,
ClientID ,
Name ,
AcctNo ,
OrderDate ,
Charge
FROM YourTable t1
WHERE NOT EXISTS
(SELECT *
FROM YourTable t2
WHERE t1.PKID <> t2.PKID
AND t1.AcctNo = t2.AcctNo
AND t1.OrderDate = t2.OrderDate
AND t1.Charge = t2.Charge
)
Simply add:
GROUP BY AcctNo, OrderDate, Charge
HAVING COUNT(1) = 1
The GROUP BY groups all rows with the same AcctNo, OrderDate and Charge together,
then the HAVING COUNT(1) = 1 shows only the rows where there was just 1 progenitor.
Thanks kekekela for the nudge in the right direction.
Here's the query that produced the result I wanted:
SELECT AcctNo, OrderDate, Charge FROM Table1 GROUP BY AcctNo, OrderDate, Charge
HAVING (COUNT(AcctNo) = 1) AND (COUNT(OrderDate) = 1) AND (COUNT(Charge) = 1);
Or more simplified based on Gus's example:
SELECT AcctNo, OrderDate, Charge FROM Table1 GROUP BY AcctNo, OrderDate, Charge
HAVING COUNT(1) = 1;
You could just drop the PKID to return all records:
SELECT DISTINCT
ClientID
, Name
, AcctNo
, OrderDate
, Charge
FROM table;
Note:
This is slightly different from what you're asking.
It returns a unique set by removing the one non-unique field.
By your example, you're asking to return non-duplicates.
I could only see your example being useful if you're trying
to clean up a table by extracting the "good" records.
You could determine the non-unique records first, and then test for those records not in that set - like this
select * from mytable where pkid not in
(select t1.pkid
from mytable t1 inner join mytable t2
on t1.pkid <> t2.pkid
and t1.acctno = t2.acctno
and t1.orderdate = t2.orderdate
and t1.charge = t2.charge)
the last part of the inner query lets you fiddle with the criteria for "equality" - add the required number of columns to test. Of course, this gets a lot more interesting without that primary key :) In such cases I usually end up creating one
Ketil
SELECT GMPS.gen.ProductDetail.PaperType, GMPS.gen.ProductDetail.Size FROM
GMPS.gen.ProductDetail GROUP BY GMPS.gen.ProductDetail.PaperType,
GMPS.gen.ProductDetail.Size
HAVING COUNT(1) = 1;
来源:https://stackoverflow.com/questions/3686192/sql-query-for-non-duplicate-records