These are rather basic statements. I have a list of graphics which are linked to items in another table. I want to check how many of the graphics are not in use and can theoretically be deleted.
So first I used the NOT IN clause:
SELECT [GraphicNr]
,[Graphicfile]
FROM [dbo].[Graphic]
WHERE graphicnr NOT IN (SELECT graphicnr FROM dbo.Komp)
This returned zero rows, which seemed odd to me. After rewriting it with NOT EXISTS, I got about 600 rows:
SELECT [GraphicNr]
,[Graphicfile]
FROM [dbo].[Graphic] a
WHERE NOT EXISTS (SELECT graphicnr FROM dbo.komp b WHERE a.GraphicNr = b.GraphicNr)
So I guess I don't really have a problem, since the second statement works, but to my understanding, shouldn't the first one give the same results?
NOT IN with a subquery behaves in a surprising way. If the subquery returns a NULL value in any row, the outer query returns no rows at all. This follows from the strict semantics of NULL, where a comparison with NULL means "I don't know if they are equal", so NOT IN can never evaluate to true.
NOT EXISTS behaves as you would expect. For this reason, I recommend never using NOT IN with a subquery. Always use NOT EXISTS.
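The difference is easy to reproduce on a small scale. The sketch below uses Python's built-in sqlite3 module instead of SQL Server, purely so that it is self-contained; SQLite applies the same NULL semantics to NOT IN. Table and column names follow the question, while the KompNr column and all rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Graphic (GraphicNr INTEGER, Graphicfile TEXT);
    CREATE TABLE Komp (KompNr INTEGER, GraphicNr INTEGER);
    -- Hypothetical sample data: graphics 1 and 2 are in use, 3 is not.
    INSERT INTO Graphic VALUES (1, 'a.png'), (2, 'b.png'), (3, 'c.png');
    -- The last Komp row has no graphic, so its GraphicNr is NULL.
    INSERT INTO Komp VALUES (10, 1), (11, 2), (12, NULL);
""")

# NOT IN: the NULL in the subquery keeps the predicate from ever being true.
not_in_rows = conn.execute("""
    SELECT GraphicNr FROM Graphic
    WHERE GraphicNr NOT IN (SELECT GraphicNr FROM Komp)
""").fetchall()

# NOT EXISTS: only asks whether a matching row exists, so the NULL is harmless.
not_exists_rows = conn.execute("""
    SELECT GraphicNr FROM Graphic a
    WHERE NOT EXISTS (SELECT 1 FROM Komp b WHERE a.GraphicNr = b.GraphicNr)
""").fetchall()

print(not_in_rows)      # []
print(not_exists_rows)  # [(3,)]
```

If the row with the NULL GraphicNr is removed from Komp, both queries return the same single row.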
That is because the subquery returns a NULL value:
SELECT [GraphicNr], [Graphicfile]
FROM [dbo].[Graphic]
WHERE graphicnr NOT IN (SELECT graphicnr FROM dbo.Komp)
This returns no rows because the condition effectively becomes graphicnr NOT IN (..., NULL), which can never evaluate to true. That is not the desired output.
NOT EXISTS does not work the same way as IN or NOT IN. It only checks whether the correlated subquery returns a matching row, so NULL values in dbo.Komp do not affect it.
You can work around the problem by adding an IS NOT NULL filter to the subquery, but the recommended way is to use NOT EXISTS instead.
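For completeness, here is roughly what the IS NOT NULL workaround looks like. This is a sketch run against an in-memory SQLite database through Python's sqlite3 module (an assumption made to keep it runnable; the same filter applies in SQL Server), with simplified single-column tables and invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Graphic (GraphicNr INTEGER);
    CREATE TABLE Komp (GraphicNr INTEGER);
    -- Hypothetical data: graphic 3 is unused, and one Komp row has no graphic.
    INSERT INTO Graphic VALUES (1), (2), (3);
    INSERT INTO Komp VALUES (1), (2), (NULL);
""")

# Filtering NULLs out of the subquery leaves NOT IN with only known values.
filtered_rows = conn.execute("""
    SELECT GraphicNr FROM Graphic
    WHERE GraphicNr NOT IN (
        SELECT GraphicNr FROM Komp WHERE GraphicNr IS NOT NULL
    )
""").fetchall()

print(filtered_rows)  # [(3,)]
```

The drawback is that the filter has to be remembered every time, which is one reason NOT EXISTS is usually the safer habit.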
This query produces the expected result:
SELECT *
FROM (SELECT 1 UNION ALL SELECT 2) AS tbl(col)
WHERE col IN (NULL, 1)
-- returns first row
But adding a NOT does not invert the results:
SELECT *
FROM (SELECT 1 UNION ALL SELECT 2) AS tbl(col)
WHERE NOT col IN (NULL, 1)
-- returns zero rows
This is because the above query is roughly equivalent to the following:
SELECT *
FROM (SELECT 1 UNION ALL SELECT 2) AS tbl(col)
WHERE NOT (col = NULL OR col = 1)
And here is how the where clause is evaluated:
| col | col = NULL (1) | col = 1 | col = NULL OR col = 1 | NOT (col = NULL OR col = 1) |
|-----|----------------|---------|-----------------------|-----------------------------|
| 1 | UNKNOWN | TRUE | TRUE | FALSE |
| 2 | UNKNOWN | FALSE | UNKNOWN (2) | UNKNOWN (3) |
Notice that:
1. The comparison involving NULL yields UNKNOWN
2. The OR expression where none of the operands are TRUE and at least one operand is UNKNOWN yields UNKNOWN (ref)
3. The NOT of UNKNOWN yields UNKNOWN (ref)
You can extend the above example to more than two values (e.g. NULL, 1 and 2) but the result will be the same: if one of the values is NULL, NOT IN will not match any row.
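The intermediate truth values can also be inspected directly. SQL Server does not let you select a boolean expression as a column, so this sketch assumes SQLite (via Python's sqlite3 module), which does: TRUE comes back as 1, FALSE as 0, and UNKNOWN as NULL, shown by Python as None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Each selected column mirrors one column of the truth table,
# plus the NOT IN form itself as the last column.
rows = conn.execute("""
    SELECT col,
           col = NULL,
           col = 1,
           col = NULL OR col = 1,
           NOT (col = NULL OR col = 1),
           col NOT IN (NULL, 1)
    FROM (SELECT 1 AS col UNION ALL SELECT 2) AS tbl
    ORDER BY col
""").fetchall()

for row in rows:
    print(row)
# (1, None, 1, 1, 0, 0)
# (2, None, 0, None, None, None)
```

Neither row ends with 1 in the last column, so a WHERE clause using that predicate keeps nothing.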
TL;DR: it boils down to the three-valued logic used in SQL. To conquer SQL you need to master it. If you cannot, just follow the suggestion to use NOT EXISTS.
Source: https://stackoverflow.com/questions/52039152/not-in-does-not-produce-same-results-as-not-exists