Question
I understand how GROUP BY works, and I also understand why my query does not return the results I am expecting. However, what would be the best way to eliminate duplicates in this case?
Let's say we have the following tables:
City
Id  Name
---------------------
1   Seattle
2   Los Angeles
3   San Francisco
Person
Id  Name           CityId
----------------------------
1   John Smith     1
2   Peter Taylor   1
3   Kate Elliot    1
4   Bruno Davis    2
5   Jack Brown     2
6   Bob Stewart    2
7   Tom Walker     3
8   Andrew Garcia  3
9   Kate Bauer     3
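To reproduce the examples, the sample tables could be created along these lines. This is only a sketch: the column types, lengths, and key constraints are assumptions, since only the data itself is shown above.

```sql
-- Assumed schema: types and constraints are illustrative, not from the original post.
CREATE TABLE City (
    Id   INT         NOT NULL PRIMARY KEY,
    Name VARCHAR(50) NOT NULL
);

CREATE TABLE Person (
    Id     INT         NOT NULL PRIMARY KEY,
    Name   VARCHAR(50) NOT NULL,
    CityId INT         NOT NULL REFERENCES City (Id)
);

-- Sample data exactly as listed in the tables above.
INSERT INTO City (Id, Name) VALUES
    (1, 'Seattle'),
    (2, 'Los Angeles'),
    (3, 'San Francisco');

INSERT INTO Person (Id, Name, CityId) VALUES
    (1, 'John Smith',    1),
    (2, 'Peter Taylor',  1),
    (3, 'Kate Elliot',   1),
    (4, 'Bruno Davis',   2),
    (5, 'Jack Brown',    2),
    (6, 'Bob Stewart',   2),
    (7, 'Tom Walker',    3),
    (8, 'Andrew Garcia', 3),
    (9, 'Kate Bauer',    3);
```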
I want to retrieve a list of all cities and just one person that lives in each city.
Using GROUP BY:
SELECT c.Id, p.Name AS PersonName, c.Name AS CityName
FROM City c
INNER JOIN Person p ON p.CityId = c.Id
GROUP BY c.Id, c.Name, p.Name
Result:
Id  PersonName     CityName
--------------------------------
1   John Smith     Seattle
1   Peter Taylor   Seattle
1   Kate Elliot    Seattle
2   Bruno Davis    Los Angeles
2   Jack Brown     Los Angeles
2   Bob Stewart    Los Angeles
3   Tom Walker     San Francisco
3   Andrew Garcia  San Francisco
3   Kate Bauer     San Francisco
Using DISTINCT:
SELECT DISTINCT c.Id, p.Name AS PersonName, c.Name AS CityName
FROM City c
INNER JOIN Person p ON p.CityId = c.Id
Same result.
Just to be very clear, this is the expected result:
Id  PersonName     CityName
--------------------------------
1   John Smith     Seattle
2   Bruno Davis    Los Angeles
3   Tom Walker     San Francisco
Would a subquery be the only solution for this case?
Answer 1:
Here is a solution which uses a subquery to identify the "first match" from the Person table, which I have interpreted to mean the person with the lowest Id value in each city group.
SELECT t1.Id,
       t1.Name AS PersonName,
       t2.Name AS CityName
FROM Person t1
INNER JOIN City t2
    ON t1.CityId = t2.Id
INNER JOIN
(
    -- lowest person Id within each city
    SELECT CityId, MIN(Id) AS minId
    FROM Person
    GROUP BY CityId
) t3
    ON t1.CityId = t3.CityId AND t1.Id = t3.minId
There is probably also a way to do this with window functions.
Answer 2:
A PARTITION BY on the city and a subquery should do the trick:
SELECT R.ID, R.PERSON_NAME, R.CITY_NAME
FROM
(
    SELECT P.ID, P.NAME [PERSON_NAME], C.NAME [CITY_NAME],
           ROW_NUMBER() OVER (PARTITION BY C.ID ORDER BY P.ID) AS rn
    FROM Person P
    INNER JOIN City C
        ON P.CITYID = C.ID
) R
WHERE R.rn = 1
Result:
1   John Smith     Seattle
4   Bruno Davis    Los Angeles
7   Tom Walker     San Francisco
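Note that the Id column here is the person's Id, whereas the expected result in the question shows the city's Id. As a possible variant, the following sketch drives the query from City and uses SQL Server's OUTER APPLY with TOP (1) to pick the person with the lowest Id per city. It assumes that "first" means lowest Person.Id, and it would also keep a city that has no residents at all (with a NULL PersonName), which the INNER JOIN versions would drop.

```sql
-- Sketch only: assumes "one person per city" means the lowest Person.Id.
SELECT c.Id,
       p.Name AS PersonName,
       c.Name AS CityName
FROM City c
OUTER APPLY
(
    -- runs once per city and returns at most one row
    SELECT TOP (1) pe.Name
    FROM Person pe
    WHERE pe.CityId = c.Id
    ORDER BY pe.Id
) p
ORDER BY c.Id;
```

With the sample data from the question, this returns John Smith for Seattle, Bruno Davis for Los Angeles, and Tom Walker for San Francisco, with Id values 1, 2, and 3. Replacing OUTER APPLY with CROSS APPLY would restrict the output to cities that have at least one person.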
Answer 3:
If the above does not work, then try this:
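SELECT tbl.Id,
       MIN(tbl.PersonName) AS PersonName,
       tbl.CityName
FROM
(
    SELECT c.Id, p.Name AS PersonName, c.Name AS CityName
    FROM City c
    INNER JOIN Person p ON p.CityId = c.Id
) AS tbl
-- SQL Server requires every non-aggregated column in GROUP BY, and does not
-- allow ORDER BY in a derived table without TOP, so one person per city is
-- picked with MIN (the alphabetically first name).
GROUP BY tbl.Id, tbl.CityName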
Edit: here is the query with DISTINCT:
SELECT DISTINCT c.Id, p.Name AS PersonName, c.Name AS CityName
FROM City c
INNER JOIN Person p ON p.CityId = c.Id
ORDER BY c.Name, p.Name
Source: https://stackoverflow.com/questions/38495465/how-to-group-by-multiple-columns-in-sql-server