Question
How does the DISTINCT clause work in SQL (SQL Server 2008, if that makes a difference) on a query that joins multiple tables?
In other words, how does the SQL engine handle a query with a DISTINCT clause?
The reason I'm asking is that a far more experienced colleague told me that SQL applies DISTINCT to every field of every table. That seems unlikely to me, but I want to make sure.
For example, given these tables:
CREATE TABLE users
(
    u_id INT PRIMARY KEY,
    u_name VARCHAR(30),
    u_password VARCHAR(30)
);
CREATE TABLE roles
(
    r_id INT PRIMARY KEY,
    r_name VARCHAR(30)
);
CREATE TABLE users_l_roles
(
    u_id INT FOREIGN KEY REFERENCES users(u_id),
    r_id INT FOREIGN KEY REFERENCES roles(r_id)
);
And then having this query:
SELECT u_name
FROM users
INNER JOIN users_l_roles ON users.u_id = users_l_roles.u_id
INNER JOIN roles ON users_l_roles.r_id = roles.r_id
Assuming there is a user with two roles, the above query returns two rows with the same user name.
But this query with DISTINCT:
SELECT DISTINCT u_name
FROM users
INNER JOIN users_l_roles ON users.u_id = users_l_roles.u_id
INNER JOIN roles ON users_l_roles.r_id = roles.r_id
will return only one user name.
The question is: does SQL compare all the fields from all the joined tables (u_id, u_name, u_password, r_id, r_name), or does it compare only the fields named in the query (u_name) when it de-duplicates the results?
Answer 1:
DISTINCT filters out duplicate rows, comparing only the fields you return.
A really simplified way to look at it is:
- It builds your overall result set (including duplicates) based on your FROM and WHERE clauses
- It sorts that result set based on the fields you want to return
- It removes any duplicate values in those fields
It's semantically equivalent to a GROUP BY where all returned fields are in the GROUP BY clause.
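As a sketch of that equivalence, using the tables from the question, the two queries below should return the same set of user names. Only u_name takes part in the comparison, because it is the only column in the select list:

```sql
-- DISTINCT form, as in the question
SELECT DISTINCT u_name
FROM users
INNER JOIN users_l_roles ON users.u_id = users_l_roles.u_id
INNER JOIN roles ON users_l_roles.r_id = roles.r_id;

-- GROUP BY form: every returned column is also a grouping column
SELECT u_name
FROM users
INNER JOIN users_l_roles ON users.u_id = users_l_roles.u_id
INNER JOIN roles ON users_l_roles.r_id = roles.r_id
GROUP BY u_name;
```

Neither form guarantees any particular row order unless an ORDER BY is added.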
Answer 2:
DISTINCT simply de-duplicates the resultant recordset after all other query operations have been performed. This article has more detail.
Answer 3:
It first selects all the available records, then removes the duplicates among them, and returns what is left.
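One way to check this against the colleague's claim is a small experiment on the question's tables. The sample rows below ('alice', 'admin', 'editor' and their ids) are hypothetical values chosen only for illustration. If DISTINCT compared every column of every joined table, the first query could not collapse the two joined rows, since their r_id and r_name differ:

```sql
-- Hypothetical sample data: one user holding two roles
INSERT INTO users (u_id, u_name, u_password) VALUES (1, 'alice', 'secret');
INSERT INTO roles (r_id, r_name) VALUES (10, 'admin');
INSERT INTO roles (r_id, r_name) VALUES (20, 'editor');
INSERT INTO users_l_roles (u_id, r_id) VALUES (1, 10);
INSERT INTO users_l_roles (u_id, r_id) VALUES (1, 20);

-- Only u_name is in the select list, so only u_name is compared:
-- the two joined rows collapse into a single row
SELECT DISTINCT u_name
FROM users
INNER JOIN users_l_roles ON users.u_id = users_l_roles.u_id
INNER JOIN roles ON users_l_roles.r_id = roles.r_id;

-- u_name and r_name are compared together as a pair:
-- ('alice', 'admin') and ('alice', 'editor') differ, so both rows remain
SELECT DISTINCT u_name, r_name
FROM users
INNER JOIN users_l_roles ON users.u_id = users_l_roles.u_id
INNER JOIN roles ON users_l_roles.r_id = roles.r_id;
```

With this data the first query returns one row and the second returns two, which shows that the comparison covers exactly the columns in the select list.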
Source: https://stackoverflow.com/questions/8992804/how-sqls-distinct-clause-works