问题
I have a table with columns A, B and C. Column A might have duplicates.
I need a query that will get me a resultset with unique values in column A, and I don't care which possible duplicate it takes.
I don't know anything beforehand about the rest of the data.
An example might be:
A B C
1 8 8
1 7 7
2 10 10
In this case I'd want to select:
A B C
1 x x
2 10 10
x = It doesn't matter which value it would pick.
Kind regards,
Matthias Vance
Edit
I thought I found my solution with:
SELECT * FROM (
SELECT * FROM test GROUP BY a
) table_test;
But that wasn't working after all.
This will result in:
[Microsoft][ODBC Excel Driver] Cannot group on fields selected with '*'
回答1:
Try this:
select A, B, C
from test x
where not exists (select *
from test y
where y.A = x.A
and (y.B < x.B or (y.B = x.B and y.C < x.C))
order by A
But since it contains correlated subquery it might be slow. (OTOH it is at least theoretically possible for database engine to optimize it into something I show below.)
What about something outside SQL? What are you going to do with result?
If you are going to process it with some program, why not just get:
select A, B, C from test order by A, B, C
and then do something like:
prev_a = None
for a, b, c in get_query_result():
if a != prev_a:
prev_a = a
yield (a, b, c)
in your application?
I don't know PHP but I guess it would be something like this:
$query = "SELECT a,b,c FROM test ORDER BY a,b,c";
$result = odbc_exec($connect, $query);
$prev_a = NULL; # I don't know what you would normally use here in PHP
while (odbc_fetch_row($result)) {
$a = odbc_result($result, 1);
if (is_null($prev_a) or $a != $prev_a) {
$b = odbc_result($result, 2);
$c = odbc_result($result, 3);
print("A = $a, B = $b, C = $c\n");
$prev_a = $a;
}
}
回答2:
Wouldn't this simple query work:
SELECT A, MIN(B), MIN(C) FROM test GROUP BY A
It groups by A and just selects the minimum values of B and C in the rows of A. The values of B and C might come from different rows, e.g.
A B C
1 2 3
1 4 1
would return
A B C
1 2 1
回答3:
The hard part is getting b
and c
from the same row. The following query uses a subquery to eliminate rows which do not have the lowest value for either b
or c
. It joins the table on itself, and says there can not be rows with a lower value of b
or c
. The "not" is implemented by the prev.a is null
in the WHERE clause.
The subquery is called semiunique
because there can still be duplicate rows with identical b
and c
. The outer query takes care of those with a GROUP BY. Since b
and c
are identical, it doesn't matter which row we choose, so we can pick one using min()
.
select a, min(b), min(c)
from (
select cur.a, cur.b, cur.c
from YourTable cur
left outer join YourTable prev
on cur.a = prev.a
and (cur.b > prev.b
or (cur.b = prev.b and cur.c > prev.c))
where prev.a is null
) semiunique
group by semiunique.a
Per your comment, a much simpler version to grab "something" for b
and c
:
select a, min(b), min(c)
from YourTable
group by a
回答4:
This works in SQL Server 2008, which illustrates the concept. You need a unique column.
declare @temp as table (
id int identity(1,1),
a int,
b int,
c int)
insert into @temp
select 1 as A, 8 as B, 8 as C
union
select 1, 7, 7
union
select 2, 10, 10
select a, b, c from @temp
where id in (select MAX(id) from @temp
group by a)
Seeing that you're using Excel, I'd use the same principle. Add another column to the spreadsheet and make sure it is unique. Use that column as your ID-column.
回答5:
Select A
, Max(b) //Since You don't care about the Value
, Max(c) //Since You don't care about the Value
From table t
Group By A
回答6:
-- All rows that are unique in column A
select *
from table
where col_a in (select col_a from table group by col_a having count(*)=1)
-- One row per dupe
select *
from table
where col_a in (select max(col_a) from table group by col_a having count(*)>1)
回答7:
All rows that have an unique value in A
SELECT * FROM table t1 INNER JOIN
(SELECT A FROM table GROUP BY A HAVING COUNT(A) = 1) as t2
ON t1.A = t2.A
I don't understand what you mean with "One of the rows that has one of the duplicate values in A". Could you explain it a little better?
Using your example, in MySQL just doing
SELECT * FROM table GROUP BY A
will gave you the desired result:
A B C
1 8 8
2 10 10
回答8:
Another option would be to use the ROW_NUMBER()-function. Not sure if it's valid in the ODBC Excel driver though:
select a, b, c from (
select *
, ROW_NUMBER() OVER (PARTITION BY A ORDER BY A) as RN
from @temp
) q where rn = 1
回答9:
I know this is a dirty way, but will work this case.
Pseudo code:
create table #tmpStaging with primary key as col( A )
for each row in the flatFile/excel/whatever begin begin try insert into #tmpstaging end try
begin catch --do nothing end catch end
select * from #tmpstaging will give you rows without dups
回答10:
This will give you the first of every duplicate
SELECT DISTINCT
A,
(SELECT TOP 1 B FROM @Table tB WHERE tb.A = t.A) B,
(SELECT TOP 1 C FROM @Table tB WHERE tb.A = t.A) C
FROM @Table t
回答11:
Try this,
SELECT UT.[A],
(SELECT TOP 1 B FROM [YourTable] WHERE [YourTable].A= UT.A) AS B,
(SELECT TOP 1 C FROM [YourTable] WHERE [YourTable].A= UT.A) AS C FROM [YourTable] AS UT GROUP BY UT.[A]
I didn't tried it yet... who knows :)
回答12:
select *
from table T
where id = (
select min(id) from table where a = T.a
)
UPD. But if there is no primary key in your table (why?), then:
select A, min(B), min(C)
from TABLE
group by A
来源:https://stackoverflow.com/questions/1769931/select-uniques-and-one-of-the-doubles