How to find non-existing data from another Table by JOIN?

久未见 提交于 2019-11-26 21:03:26

An INNER JOIN doesn't help here.

One way to solve this is by using a LEFT JOIN:

SELECT w.* 
FROM TABLE1 W 
LEFT JOIN TABLE2 V ON W.name = V.name
WHERE ISNULL(V.name);

The relational operator you require is semi difference a.k.a. antijoin.

Most SQL products lacks an explicit semi difference operator or keyword. Standard SQL-92 doesn't have one (it has a MATCH (subquery) semijoin predicate but, although tempting to think otherwise, the semantics for NOT MATCH (subquery) are not the same as for semi difference; FWIW the truly relational language Tutorial D successfully uses the NOT MATCHING semi difference).

Semi difference can of course be written using other SQL predicates. The most commonly seen are: outer join with a test for nulls in the WHERE clause, closely followed by EXISTS or IN (subquery). Using EXCEPT (equivalent to MINUS in Oracle) is another possible approach if your SQL product supports it and again depending on the data (specifically, when the headings of the two tables are the same).

Personally, I prefer to use EXISTS in SQL for semi difference join because the join clauses are closer together in the written code and doesn't result in projection over the joined table e.g.

SELECT *
  FROM TABLE1 W
 WHERE NOT EXISTS (
                   SELECT * 
                     FROM TABLE2 V
                    WHERE W.NAME = V.NAME
                  );

As with NOT IN (subquery) (same for the outer join approach), you need to take extra care if the WHERE clause within the subquery involves nulls (hint: if WHERE clause in the subquery evaluates UNKNOWN due to the presence of nulls then it will be coerced to be FALSE by EXISTS, which may yield unexpected results).


UPDATE (3 years on): I've since flipped to preferring NOT IN (subquery) because it is more readable and if you are worried about unexpected results with nulls (and you should be) then stop using them entirely, I did many more years ago.

One way in which it is more readable is there is no requirement for the range variables W and V e.g.

SELECT * FROM TABLE1 WHERE name NOT IN ( SELECT name FROM TABLE2 );
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!