These are rather basic statements. I have a list of graphics which are linked to items in another table. I want to check how many of the graphics are not in use and can theoretically be deleted.
So first I used the NOT IN clause:
SELECT [GraphicNr]
,[Graphicfile]
FROM [dbo].[Graphic]
WHERE graphicnr NOT IN (SELECT graphicnr FROM dbo.Komp)
This returned zero results, which seemed weird to me. After rewriting it to a NOT EXISTS, I got about 600 results:
SELECT [GraphicNr]
,[Graphicfile]
FROM [dbo].[Graphic] a
WHERE NOT EXISTS (SELECT graphicnr FROM dbo.komp b WHERE a.GraphicNr = b.GraphicNr)
So I guess I don't really have a problem, since the second statement works, but to my understanding, shouldn't the first one give the same results?
NOT IN with a subquery has strange behavior. If any row in the subquery returns a NULL value, then no rows are returned. This is due to following the strict semantics of NULL (which means: "I don't know if they are equal").
NOT EXISTS behaves as you would expect. For this reason, I recommend never using NOT IN with a subquery. Always use NOT EXISTS.
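To confirm that this is what happens with the tables in the question, one could count the NULL values in the column the subquery returns. This is only a sketch: it assumes dbo.Komp.GraphicNr is nullable, and the alias NullGraphicNrCount is made up for illustration.

```sql
-- Count the rows in dbo.Komp whose GraphicNr is NULL.
-- Any count above zero is enough to make the NOT IN query return no rows.
SELECT COUNT(*) AS NullGraphicNrCount
FROM dbo.Komp
WHERE GraphicNr IS NULL;
```

If the count is zero, the NULL explanation does not apply and the difference between the two queries must come from something else.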
That is because of a NULL value returned from the subquery:
SELECT [GraphicNr], [Graphicfile]
FROM [dbo].[Graphic]
WHERE graphicnr NOT IN (SELECT graphicnr FROM dbo.Komp)
This would produce no records (or "no rows affected") because the predicate effectively becomes graphicnr NOT IN (NULL), which is not the desired output.
So NOT EXISTS does not work the same way as IN or NOT IN. It only checks whether the correlated subquery returns a row, so a NULL in the subquery cannot turn the whole predicate into UNKNOWN.
However, you can prevent this by using an IS NOT NULL filter in the subquery. But the recommended way is to use NOT EXISTS instead.
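A minimal sketch of the IS NOT NULL workaround, applied to the query from the question. It assumes GraphicNr is never NULL in dbo.Graphic itself; a NULL on the outer side would still be filtered out by NOT IN, whereas NOT EXISTS would return that row.

```sql
SELECT [GraphicNr], [Graphicfile]
FROM [dbo].[Graphic]
WHERE GraphicNr NOT IN (SELECT GraphicNr
                        FROM dbo.Komp
                        WHERE GraphicNr IS NOT NULL); -- drop NULLs before the NOT IN comparison
```

With the NULL values removed from the subquery, this should return the same rows as the NOT EXISTS version under the assumption above.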
This query produces the expected result:
SELECT *
FROM (SELECT 1 UNION ALL SELECT 2) AS tbl(col)
WHERE col IN (NULL, 1)
-- returns first row
But adding a NOT does not invert the results:
SELECT *
FROM (SELECT 1 UNION ALL SELECT 2) AS tbl(col)
WHERE NOT col IN (NULL, 1)
-- returns zero rows
This is because the above query is roughly equivalent to the following:
SELECT *
FROM (SELECT 1 UNION ALL SELECT 2) AS tbl(col)
WHERE NOT (col = NULL OR col = 1)
And here is how the WHERE clause is evaluated for each row:
| col | col = NULL (1) | col = 1 | col = NULL OR col = 1 | NOT (col = NULL OR col = 1) |
|-----|----------------|---------|-----------------------|-----------------------------|
| 1 | UNKNOWN | TRUE | TRUE | FALSE |
| 2 | UNKNOWN | FALSE | UNKNOWN (2) | UNKNOWN (3) |
Notice that:
1. The comparison involving NULL yields UNKNOWN
2. The OR expression where none of the operands are TRUE and at least one operand is UNKNOWN yields UNKNOWN
3. The NOT of UNKNOWN yields UNKNOWN
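The evaluation table above can be reproduced with CASE expressions, which is one way to make the otherwise invisible UNKNOWN value show up in a result set. This is a sketch that assumes SQL Server with the default SET ANSI_NULLS ON; the column aliases are made up for illustration.

```sql
-- A predicate that is neither TRUE nor FALSE (under NOT) falls through to ELSE,
-- which is how UNKNOWN is detected here.
SELECT col,
       CASE WHEN col = NULL THEN 'TRUE'
            WHEN NOT (col = NULL) THEN 'FALSE'
            ELSE 'UNKNOWN' END AS col_eq_null,
       CASE WHEN (col = NULL OR col = 1) THEN 'TRUE'
            WHEN NOT (col = NULL OR col = 1) THEN 'FALSE'
            ELSE 'UNKNOWN' END AS or_result,
       CASE WHEN NOT (col = NULL OR col = 1) THEN 'TRUE'
            WHEN (col = NULL OR col = 1) THEN 'FALSE'
            ELSE 'UNKNOWN' END AS not_or_result
FROM (SELECT 1 UNION ALL SELECT 2) AS tbl(col);
```

Only rows where the final predicate is TRUE pass a WHERE clause, so FALSE for col = 1 and UNKNOWN for col = 2 both mean the row is discarded.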
You can extend the above example to more than two values (e.g. NULL, 1 and 2) but the result will be the same: if one of the values is NULL, then no row will match.
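As a sketch of that extension, the first query below adds a third row and a third value, and the second expresses the same intent with NOT EXISTS. The derived table v(val) is a hypothetical stand-in for the subquery, and the CAST is only there to give the NULL an explicit type.

```sql
SELECT *
FROM (SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3) AS tbl(col)
WHERE col NOT IN (NULL, 1, 2);
-- returns zero rows: for col = 3 the comparison with NULL is UNKNOWN

SELECT *
FROM (SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3) AS tbl(col)
WHERE NOT EXISTS (SELECT 1
                  FROM (SELECT CAST(NULL AS int) UNION ALL SELECT 1 UNION ALL SELECT 2) AS v(val)
                  WHERE v.val = tbl.col);
-- returns the row with col = 3: the NULL never satisfies v.val = tbl.col
```

In the NOT EXISTS form the NULL row simply fails the inner WHERE and is ignored, instead of poisoning the whole predicate.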
TLDR: it basically boils down to the three-valued logic used in SQL. To conquer SQL you need to master it. If you cannot, just follow the suggestion to use NOT EXISTS.
Source: https://stackoverflow.com/questions/52039152/not-in-does-not-produce-same-results-as-not-exists