Find unique record for duplicate records over two columns

问题

This is my database dataset :

Table

ID             Name

XXX-23305      Edward, Stue^Jenna
XXX-23305      Edward, Stue^Jenna
XXX-23306      Cole, Slaw^Bali
XXX-23306      Cole, Slaw^Bali
XXX-23306      Cole, Slaw^Bali
XXX-23310      Zerg, War^Finja
XXX-23310      Road^Sieglinde
XXX-23319      Gras, Dr.Carl^Yolo
XXX-23319      Gras, Dr.Carl^Yolo

As you can see there might be multiple entries for the same ID and Name combination. However in case of ID XXX-23310 there are two different names available.

Now what I want is displaying that exact dataset ( I have a database of like 31k entries with an unnknown amount of those entries ).

Is there a way to achieve that? I googled for some time and also had a look at other posts here but was unable to find anything.

EDIT1 : Desired output of the query:

ID             Name

XXX-23310      Zerg, War^Finja
XXX-23310      Road^Sieglinde

EDIT2:

not sure if anyone reads this but thanks for the solutions. I want to add another condition. Only output the data where there are more than one entry for the ID. E.g. if my sample dataset would just contain one entry for XXX-23310 then the query would print nothing as a result.

回答1:

If you also want to exclude any lone records for a given ID:

create table #data
(
    id varchar(10),
    [name] varchar(50)
)

insert into #data
values
    ('XXX-23305','Edward, Stue^Jenna'),
    ('XXX-23306','Cole, Slaw^Bali'),
    ('XXX-23306','Cole, Slaw^Bali'),
    ('XXX-23306','Cole, Slaw^Bali'),
    ('XXX-23310','Zerg, War^Finja'),
    ('XXX-23310','Road^Sieglinde'),
    ('XXX-23319','Gras, Dr.Carl^Yolo'),
    ('XXX-23319','Gras, Dr.Carl^Yolo');

with d as
(   
    select distinct
        id,
        [name]
    from #data
)
select *
from d
where d.id in
(   
    select d.id
    from d
    group by d.id
    having count(*) > 1
)

drop table #data

Returns the two record for XXX-23310, but not XXX-23305 as other answers would do:

/-----------------------------\
| id        | name            |
|-----------|-----------------|
| XXX-23310 | Road^Sieglinde  |
| XXX-23310 | Zerg, War^Finja |
\-----------------------------/

回答2:

use aggregation function in filter

As you want to consider only multiple id ,so use below one

select id,name from @your_table t
group by id,name
having count(*)=1
and exists (select 1 from @your_table t1 where t1.id=t.id
                          group by t1.id having count(*)>1
                           )

DEMO in fiddle

output

id            name
XXX-23310   Road^Sieglinde
XXX-23310   Zerg, War^Finja

回答3:

Try this:

DECLARE @DataSource TABLE
(
    [ID] VARCHAR(32)
   ,[Name] VARCHAR(25)
);

INSERT INTO @DataSource ([ID], [Name])
VALUES ('XXX-23305', 'Edward, Stue^Jenna')
      ,('XXX-23305', 'Edward, Stue^Jenna')
      ,('XXX-23306', 'Cole, Slaw^Bali')
      ,('XXX-23306', 'Cole, Slaw^Bali')
      ,('XXX-23306', 'Cole, Slaw^Bali')
      ,('XXX-23310', 'Zerg, War^Finja')
      ,('XXX-23310', 'Road^Sieglinde')
      ,('XXX-23319', 'Gras, Dr.Carl^Yolo')
      ,('XXX-23319', 'Gras, Dr.Carl^Yolo');

WITH DataSource AS
(
    SELECT [ID]
          ,[Name]
          ,COUNT(*) OVER (PARTITION BY [ID], [Name]) AS [ID_Name_Count]
          ,COUNT(*) OVER (PARTITION BY [ID]) AS [ID_Count]
    FROM @DataSource
)
SELECT [ID]
      ,[Name]
FROM DataSource
WHERE [ID_Name_Count] = 1
    AND [ID_Count] = 2;

来源：https://stackoverflow.com/questions/52966853/find-unique-record-for-duplicate-records-over-two-columns

标签

sql

sql-server

tsql

duplicates

unique