T-SQL GROUP BY: Best way to include other grouped columns

Submitted by 烂漫一生 on 2019-12-18 11:28:48

Question


I'm a MySQL user who is trying to port some things over to MS SQL Server.

I'm joining a couple of tables, and aggregating some of the columns via GROUP BY.

A simple example would be employees and projects:

select empID, fname, lname, title, dept, count(projectID)
from employees E left join projects P on E.empID = P.projLeader
group by empID

...that would work in MySQL, but MS SQL Server is stricter and requires that every column in the select list either be enclosed in an aggregate function or appear in the GROUP BY clause.

So, of course, in this simple example, I assume I could just include the extra columns in the group by clause. But the actual query I'm dealing with is pretty complicated, and includes a bunch of operations performed on some of the non-aggregated columns... i.e., it would get REALLY ugly to try to include all of them in the group by clause.

So is there a better way to do this?


Answer 1:


You can get it to work with something along these lines:

select e.empID, fname, lname, title, dept, projectIDCount
from
(
   select empID, count(projectID) as projectIDCount
   from employees E left join projects P on E.empID = P.projLeader
   group by empID
) idList
inner join employees e on idList.empID = e.empID

This way you avoid the extra group by operations, and you can get any data you want. You also have a better chance of making good use of indexes in some scenarios (if you are not returning the full info), and the query combines better with paging.
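As a rough sketch of the paging point, the derived table can be limited to one page of IDs before the join back to employees. This assumes SQL Server 2012 or later (for OFFSET ... FETCH), and the page size of 20 and the ordering by empID are illustrative choices, not part of the original answer:

```sql
-- Sketch: page the grouped IDs first, then fetch the remaining columns.
-- Assumes SQL Server 2012+; page size and sort order are illustrative.
select e.empID, e.fname, e.lname, e.title, e.dept, idList.projectIDCount
from
(
   select E.empID, count(P.projectID) as projectIDCount
   from employees E left join projects P on E.empID = P.projLeader
   group by E.empID
   order by E.empID                      -- ORDER BY is allowed here because OFFSET follows
   offset 0 rows fetch next 20 rows only -- first page of 20 employees
) idList
inner join employees e on idList.empID = e.empID
order by e.empID
```

Only the 20 selected employees are joined back, so the wider columns are read for one page rather than for the whole table.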




Answer 2:


"it would get REALLY ugly to try to include all of them in the group by clause."

Yup - that's the only way to do it * - just copy and paste the non-aggregated columns into the group by clause, remove the aliases and that's as good as it gets...

*you could wrap it in a nested SELECT but that's probably just as ugly...
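For the simple example in the question, the copy-and-paste approach would look roughly like this. It assumes that fname, lname, title and dept all belong to employees, and the projectCount alias is added here for illustration:

```sql
-- Every non-aggregated column in the select list is repeated in GROUP BY.
select E.empID, E.fname, E.lname, E.title, E.dept,
       count(P.projectID) as projectCount
from employees E
left join projects P on E.empID = P.projLeader
group by E.empID, E.fname, E.lname, E.title, E.dept
```

Because empID identifies an employee, adding the other employee columns to the GROUP BY does not change the grouping; it only satisfies SQL Server's rule.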




Answer 3:


MySQL is unusual - and technically not compliant with the SQL standard - in allowing you to omit items from the GROUP BY clause. In standard SQL, each non-aggregate column in the select-list must be listed in full in the GROUP BY clause (either by name or, in some dialects, by ordinal number, but that is deprecated, and SQL Server does not accept ordinal positions in GROUP BY).

(Oh, although MySQL is unusual, it is nice that it allows the shorthand.)




Answer 4:


You do not need the join in the subquery, as it is not necessary to group by empID from employees; you can group on the projLeader field from projects instead.

With the inner join (as I wrote it) you'll get the list of employees that have at least one project. If you want the list of all employees, just change it to a left join.

  select e.empID, e.fname, e.lname, e.title, e.dept, p.projectIDCount
    from employees e 
   inner join ( select projLeader, count(*) as projectIDCount
                  from projects
                 group by projLeader
              ) p on p.projLeader = e.empID
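A sketch of the left join variant mentioned above: employees with no projects have no matching row in the derived table, so projectIDCount comes back as NULL for them. Wrapping it in coalesce to show 0 instead is an addition here, not part of the original answer:

```sql
-- Left join keeps employees that lead no projects.
select e.empID, e.fname, e.lname, e.title, e.dept,
       coalesce(p.projectIDCount, 0) as projectIDCount  -- NULL becomes 0
  from employees e
  left join ( select projLeader, count(*) as projectIDCount
                from projects
               group by projLeader
            ) p on p.projLeader = e.empID
```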



Answer 5:


A subquery in the select clause might also be suitable. It would work for the example given but might not for the actual complicated query you are dealing with.

select
        e.empID, fname, lname, title, dept
        , (select count(*) from projects p where p.projLeader = e.empID) as projectCount
from
   employees e


Source: https://stackoverflow.com/questions/626788/t-sql-group-by-best-way-to-include-other-grouped-columns
