How to group by multiple columns in SQL Server

Submitted by 核能气质少年 on 2020-01-06 15:11:21

Question


I understand how GROUP BY works, and I also understand why my query does not return the results I am expecting. However, what would be the best way to eliminate duplicates in this case?

Let's say we have the following tables:

City

Id    Name
---------------------
1     Seattle
2     Los Angeles
3     San Francisco

Person

Id    Name            CityId
----------------------------
1     John Smith      1
2     Peter Taylor    1
3     Kate Elliot     1
4     Bruno Davis     2
5     Jack Brown      2
6     Bob Stewart     2
7     Tom Walker      3
8     Andrew Garcia   3
9     Kate Bauer      3
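
For anyone who wants to reproduce the problem, the two tables above could be created with a script along these lines. This is only a sketch: the column types, the primary keys, and the foreign key are assumptions, since the question shows the data but not the table definitions.

```sql
-- Assumed schema: the types and constraints are guesses, only the
-- table names, column names, and rows come from the question.
CREATE TABLE City (
    Id   INT          NOT NULL PRIMARY KEY,
    Name VARCHAR(100) NOT NULL
);

CREATE TABLE Person (
    Id     INT          NOT NULL PRIMARY KEY,
    Name   VARCHAR(100) NOT NULL,
    CityId INT          NOT NULL REFERENCES City (Id)
);

-- Sample rows exactly as listed in the question.
INSERT INTO City (Id, Name) VALUES
    (1, 'Seattle'),
    (2, 'Los Angeles'),
    (3, 'San Francisco');

INSERT INTO Person (Id, Name, CityId) VALUES
    (1, 'John Smith',    1),
    (2, 'Peter Taylor',  1),
    (3, 'Kate Elliot',   1),
    (4, 'Bruno Davis',   2),
    (5, 'Jack Brown',    2),
    (6, 'Bob Stewart',   2),
    (7, 'Tom Walker',    3),
    (8, 'Andrew Garcia', 3),
    (9, 'Kate Bauer',    3);
```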

I want to retrieve a list of all cities and just one person who lives in each city.

Using GROUP BY:

SELECT c.Id, p.Name AS PersonName, c.Name AS CityName
FROM City c
INNER JOIN Person p ON p.CityId = c.Id
GROUP BY c.Id, c.Name, p.Name

Result:

Id    PersonName      CityName
----------------------------
1     John Smith      Seattle
1     Peter Taylor    Seattle
1     Kate Elliot     Seattle
2     Bruno Davis     Los Angeles
2     Jack Brown      Los Angeles
2     Bob Stewart     Los Angeles
3     Tom Walker      San Francisco
3     Andrew Garcia   San Francisco
3     Kate Bauer      San Francisco

Using DISTINCT:

SELECT DISTINCT c.Id, p.Name AS PersonName, c.Name AS CityName
FROM City c
INNER JOIN Person p ON p.CityId = c.Id

Same result.

Just to be very clear, this is the expected result:

Id    PersonName      CityName
----------------------------
1     John Smith      Seattle
2     Bruno Davis     Los Angeles
3     Tom Walker      San Francisco

Would a subquery be the only solution for this case?


Answer 1:


Here is a solution which uses a subquery to identify the "first match" from the Person table, which I have interpreted to mean the person with the lowest id value in each city group.

SELECT t1.Id,
       t1.Name AS PersonName,
       t2.Name AS CityName
FROM Person t1
INNER JOIN City t2
    ON t1.CityId = t2.Id
INNER JOIN
(
    SELECT CityId, MIN(Id) AS minId
    FROM Person
    GROUP BY CityId
) t3
    ON t1.CityId = t3.CityId AND t1.Id = t3.minId

There is probably also a way to do this with window functions.
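
Staying with the same "lowest Id per city" interpretation, another option in SQL Server is CROSS APPLY with TOP (1). This is a different technique from the join on MIN(Id) above, offered only as a sketch; the alias fp is a made-up name for illustration. It returns the city Id in the first column, as in the expected result from the question.

```sql
-- For each city, pick the single person with the lowest Id.
SELECT c.Id,
       fp.Name AS PersonName,
       c.Name  AS CityName
FROM City c
CROSS APPLY
(
    -- Correlated subquery: runs once per city row.
    SELECT TOP (1) p.Name
    FROM Person p
    WHERE p.CityId = c.Id
    ORDER BY p.Id
) fp   -- hypothetical alias, "first person"
ORDER BY c.Id;
```

CROSS APPLY drops cities that have no people at all. If those cities should still appear, with NULL as the person name, OUTER APPLY can be used instead.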




Answer 2:


ROW_NUMBER() partitioned by city, wrapped in a subquery, should do the trick:

SELECT R.ID, R.PERSON_NAME, R.CITY_NAME FROM
(
    SELECT P.ID, P.NAME [PERSON_NAME], C.NAME [CITY_NAME],
             ROW_NUMBER() OVER (PARTITION BY C.ID ORDER BY P.ID) AS rn
    FROM Person P
    INNER JOIN CITY C
    ON P.CITYID = C.ID
) R
WHERE R.rn = 1

Result:

ID      PERSON_NAME     CITY_NAME
---------------------------------
1       John Smith      Seattle
4       Bruno Davis     Los Angeles
7       Tom Walker      San Francisco
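
On the original question of whether a subquery is the only solution: in SQL Server the same ROW_NUMBER() expression can, as far as I know, be placed directly in ORDER BY together with TOP (1) WITH TIES, which avoids the derived table. The sketch below is a variation on the query above, not part of the original answer.

```sql
-- Every row whose row number is 1 ties for first place,
-- so WITH TIES keeps exactly one person per city.
SELECT TOP (1) WITH TIES
       c.Id,
       p.Name AS PersonName,
       c.Name AS CityName
FROM City c
INNER JOIN Person p ON p.CityId = c.Id
ORDER BY ROW_NUMBER() OVER (PARTITION BY c.Id ORDER BY p.Id);
```

Because ORDER BY is used up by the tie-breaking trick, the order of the returned rows is not guaranteed; wrap the query or sort in the client if a specific order matters.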



Answer 3:


If the above does not work, then try DISTINCT:

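-- SQL Server rejects ORDER BY inside a derived table (without TOP/OFFSET)
-- and requires every non-aggregated column to appear in GROUP BY, so the
-- person is picked with MIN() here (alphabetically first name per city).
SELECT tbl.Id,
       MIN(tbl.PersonName) AS PersonName,
       tbl.CityName
FROM
(
   SELECT c.Id, p.Name AS PersonName, c.Name AS CityName
   FROM City c
   INNER JOIN Person p ON p.CityId = c.Id
) AS tbl
GROUP BY tbl.Id, tbl.CityName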

edited

Here is the query:

   SELECT DISTINCT c.Id, p.Name AS PersonName, c.Name AS CityName
   FROM City c
   INNER JOIN Person p ON p.CityId = c.Id
   ORDER BY c.Name, p.Name


Source: https://stackoverflow.com/questions/38495465/how-to-group-by-multiple-columns-in-sql-server
