How does SQL's DISTINCT clause work?

Submitted by 江枫思渺然 on 2019-12-30 06:27:06

Question


I'm looking for an explanation of how the DISTINCT clause works in SQL (SQL Server 2008, if that makes a difference) on a query with multiple joined tables.

In other words, how does the SQL engine handle a query with a DISTINCT clause?

The reason I'm asking is that a far more experienced colleague told me that SQL applies DISTINCT to every field of every table. That seems unlikely to me, but I want to make sure.

For example, given these three tables:

CREATE TABLE users
(
u_id INT PRIMARY KEY,
u_name VARCHAR(30),
u_password VARCHAR(30)
)

CREATE TABLE roles
(
r_id INT PRIMARY KEY,
r_name VARCHAR(30)
)

CREATE TABLE users_l_roles
(
u_id INT FOREIGN KEY REFERENCES users(u_id) ,
r_id INT FOREIGN KEY REFERENCES roles(r_id) 
)

And then having this query:

SELECT          u_name
FROM            users 
INNER JOIN      users_l_roles ON users.u_id = users_l_roles.u_id
INNER JOIN      roles ON users_l_roles.r_id = roles.r_id 

Assuming there is a user with two roles, the above query will return two records with the same user name.

But this query with DISTINCT:

SELECT DISTINCT u_name
FROM            users 
INNER JOIN      users_l_roles ON users.u_id = users_l_roles.u_id
INNER JOIN      roles ON users_l_roles.r_id = roles.r_id 

will return only one user name.

The question is whether SQL compares all the fields from all the joined tables (u_id, u_name, u_password, r_id, r_name), or only the fields named in the query (u_name), when it removes duplicates.


Answer 1:


DISTINCT filters out duplicate values of your returned fields.

A really simplified way to look at it is:

  • It builds your overall result set (including duplicates) based on your FROM and WHERE clauses
  • It sorts that result set based on the fields you want to return
  • It removes any duplicate values in those fields

It's semantically equivalent to a GROUP BY where all returned fields are in the GROUP BY clause.
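The GROUP BY equivalence can be checked with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server 2008 (an assumption made only so the example is runnable anywhere), and the sample rows are hypothetical. Two different users deliberately share the name 'alice' to show that only the returned field is compared, not u_id or u_password.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; the column constraints are
# written in SQLite-compatible form, and the sample data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (u_id INT PRIMARY KEY, u_name VARCHAR(30), u_password VARCHAR(30));
CREATE TABLE roles (r_id INT PRIMARY KEY, r_name VARCHAR(30));
CREATE TABLE users_l_roles (
    u_id INT REFERENCES users(u_id),
    r_id INT REFERENCES roles(r_id)
);
-- Two different users share the name 'alice'.
INSERT INTO users VALUES (1, 'alice', 'pw1'), (2, 'alice', 'pw2'), (3, 'bob', 'pw3');
INSERT INTO roles VALUES (10, 'admin'), (20, 'editor');
INSERT INTO users_l_roles VALUES (1, 10), (1, 20), (2, 10), (3, 20);
""")

# The join from the question, shared by all three queries below.
JOIN = """
FROM users
INNER JOIN users_l_roles ON users.u_id = users_l_roles.u_id
INNER JOIN roles ON users_l_roles.r_id = roles.r_id
"""

# Without DISTINCT: one row per (user, role) pair, duplicates included.
plain_rows = conn.execute("SELECT u_name " + JOIN).fetchall()

# DISTINCT and GROUP BY over the same returned field; sorted in Python
# because neither query guarantees an output order.
distinct_rows = sorted(conn.execute("SELECT DISTINCT u_name " + JOIN).fetchall())
group_by_rows = sorted(conn.execute("SELECT u_name " + JOIN + " GROUP BY u_name").fetchall())

print(len(plain_rows), distinct_rows, group_by_rows)
```

Both users named 'alice' collapse into a single row even though their u_id and u_password values differ, which suggests the comparison is made on u_name alone.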




Answer 2:


DISTINCT simply de-duplicates the resultant recordset after all other query operations have been performed. This article has more detail.
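As a rough illustration of de-duplicating the final recordset, the sketch below runs DISTINCT over two different select lists on the same join. It again assumes Python's built-in sqlite3 module in place of SQL Server, with hypothetical sample data (one user holding two roles, as in the question) and a trimmed-down schema.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; the data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (u_id INT PRIMARY KEY, u_name VARCHAR(30));
CREATE TABLE roles (r_id INT PRIMARY KEY, r_name VARCHAR(30));
CREATE TABLE users_l_roles (u_id INT, r_id INT);
INSERT INTO users VALUES (1, 'alice');
INSERT INTO roles VALUES (10, 'admin'), (20, 'editor');
INSERT INTO users_l_roles VALUES (1, 10), (1, 20);
""")

JOIN = """
FROM users
INNER JOIN users_l_roles ON users.u_id = users_l_roles.u_id
INNER JOIN roles ON users_l_roles.r_id = roles.r_id
"""

# Only u_name is returned, so the two joined rows are duplicates of each other.
names_only = conn.execute("SELECT DISTINCT u_name " + JOIN).fetchall()

# Adding r_name to the select list makes the two rows differ, so both survive.
names_and_roles = conn.execute(
    "SELECT DISTINCT u_name, r_name " + JOIN + " ORDER BY r_name"
).fetchall()

print(names_only)
print(names_and_roles)
```

The join produces the same two rows in both cases; what changes is which columns make up the returned row, and that is what DISTINCT compares.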




Answer 3:


It first selects all the available records, then removes the duplicate records among them, and returns what is left.



Source: https://stackoverflow.com/questions/8992804/how-sqls-distinct-clause-works
