Is it possible that a LEFT JOIN fails while a subquery with a NOT IN clause succeeds?

Submitted by ﹥>﹥吖頭↗ on 2020-06-18 09:10:14

Question


A while ago I posted an answer to this question: PostgreSQL multiple criteria statement.

The task was quite simple: select values from one table if there is no corresponding value in another table. Assuming we have tables like the ones below:

CREATE TABLE first  (foo numeric);
CREATE TABLE second (foo numeric);

we would like to get all the values from first.foo that do not occur in second.foo. I proposed two solutions:

  • using LEFT JOIN
SELECT first.foo
FROM first
LEFT JOIN  second 
ON first.foo = second.foo
WHERE second.foo IS NULL;
  • combining a subquery and the NOT IN operator:
SELECT first.foo 
FROM first
WHERE first.foo NOT IN (
  SELECT second.foo FROM second
);

For some reason the first one didn't work (it returned 0 rows) in the context of the OP, and it has been bugging me ever since. I've tried to reproduce the issue using different versions of PostgreSQL, but no luck so far.

Is there any particular reason why the first solution would fail while the second works as expected? Am I missing something obvious?

Here is an sqlfiddle, but the query seems to work on every available platform.

Edit

As @bma and @MostyMostacho pointed out in the comments, it should rather be the second one that returned no results (sqlfiddle).


Answer 1:


As per your sql fiddle, your NOT IN query fails to return results because of the NULL in the second table.

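The problem is that NULL means "unknown", and therefore we cannot say that the following expression is true: 10 NOT IN (5, NULL).

The reason is what happens when 10 = NULL is evaluated: we get NULL back, not true. So 10 NOT IN (5, NULL), which amounts to 10 <> 5 AND 10 <> NULL, evaluates to NULL rather than true, and WHERE keeps only the rows for which its condition is true. As a result, a single NULL in the NOT IN list means that no row will ever pass.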
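The NULL behaviour described above can be checked with a few scalar expressions. The sketch below is only an illustration: it assumes Python's built-in sqlite3 module as a stand-in for PostgreSQL (SQLite reports true and false as 1 and 0, and SQL NULL as None), and the helper name evaluate is made up for this example.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for PostgreSQL.
conn = sqlite3.connect(":memory:")


def evaluate(expression):
    """Evaluate a scalar SQL expression; SQL NULL comes back as None."""
    return conn.execute(f"SELECT {expression}").fetchone()[0]


# Each expression is evaluated on its own, without any table involved.
results = {
    "10 = NULL": evaluate("10 = NULL"),
    "10 NOT IN (5)": evaluate("10 NOT IN (5)"),
    "10 NOT IN (5, NULL)": evaluate("10 NOT IN (5, NULL)"),
    "5 NOT IN (5, NULL)": evaluate("5 NOT IN (5, NULL)"),
}

for expression, value in results.items():
    print(f"{expression:22} -> {value!r}")
```

With a NULL in the list, NOT IN can only come out as false (a match was found) or NULL (no match, but the NULL might have been one), so a WHERE clause built on it never sees a true value.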

To get the second one to behave the way you expect, you need a relatively convoluted query:

SELECT first.foo
FROM first
WHERE (first.foo IN (
  SELECT second.foo FROM second
)) IS NOT TRUE;

This will properly handle the NULL comparisons, but the join syntax is probably cleaner.
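To compare the join form with plain NOT IN on actual rows, here is a minimal sketch. It assumes Python's built-in sqlite3 module rather than PostgreSQL, uses the sample values 1 to 5 in first and 1, 3, NULL in second, and quotes the table names purely as a precaution; the aliases f and s and the helper run are illustrative choices.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "first"  (foo numeric);
    CREATE TABLE "second" (foo numeric);
    INSERT INTO "first"  VALUES (1), (2), (3), (4), (5);
    INSERT INTO "second" VALUES (1), (3), (NULL);
""")

# Anti-join: keep rows of "first" that found no partner in "second".
LEFT_JOIN = """
    SELECT f.foo
    FROM "first" AS f
    LEFT JOIN "second" AS s ON f.foo = s.foo
    WHERE s.foo IS NULL
    ORDER BY f.foo
"""

# Plain NOT IN: the NULL in "second" turns every non-match into NULL.
NOT_IN = """
    SELECT f.foo
    FROM "first" AS f
    WHERE f.foo NOT IN (SELECT s.foo FROM "second" AS s)
    ORDER BY f.foo
"""


def run(query):
    """Return the first column of every row produced by the query."""
    return [row[0] for row in conn.execute(query)]


left_join_rows = run(LEFT_JOIN)
not_in_rows = run(NOT_IN)

print("LEFT JOIN:", left_join_rows)
print("NOT IN:   ", not_in_rows)
```

In this setup the join returns 2, 4 and 5, while the NOT IN query returns no rows at all, which matches the behaviour seen in the fiddle.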




Answer 2:


"Select values from one table if there is no corresponding value in another table." You just answered your own question:

SELECT o.value
FROM table_one o
WHERE NOT EXISTS (
    SELECT *
    FROM table_two t
    WHERE t.value = o.value
    );

A short demonstration:

CREATE TABLE first  (foo numeric);
CREATE TABLE second (foo numeric);

INSERT INTO first VALUES (1);
INSERT INTO first VALUES (2);
INSERT INTO first VALUES (3);
INSERT INTO first VALUES (4);
INSERT INTO first VALUES (5);
INSERT INTO first VALUES (NULL); -- added this for completeness

INSERT INTO second VALUES (1);
INSERT INTO second VALUES (3);
INSERT INTO second VALUES (NULL);


SELECT f.foo AS ffoo, s.foo AS sfoo
        -- these expressions all yield boolean values
        , (f.foo = s.foo)                                               AS is_equal
        , (f.foo IN (SELECT foo FROM second))                           AS is_in
        , (f.foo NOT IN (SELECT foo FROM second))                       AS is_not_in
        , (EXISTS (SELECT * FROM second x WHERE x.foo = f.foo))         AS does_exist
        , (NOT EXISTS (SELECT * FROM second x WHERE x.foo = f.foo))     AS does_not_exist
        , (EXISTS (SELECT * FROM first x LEFT JOIN second y ON x.foo = y.foo
                WHERE x.foo = f.foo AND y.foo IS NULL))
                                                                        AS left_join_is_null
FROM first f
FULL JOIN second s ON (f.foo = s.foo AND (f.foo IS NOT NULL OR s.foo IS NOT NULL) )
        ;

Result:

CREATE TABLE
CREATE TABLE
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1
 ffoo | sfoo | is_equal | is_in | is_not_in | does_exist | does_not_exist | left_join_is_null 
------+------+----------+-------+-----------+------------+----------------+-------------------
    1 |    1 | t        | t     | f         | t          | f              | f
    2 |      |          |       |           | f          | t              | t
    3 |    3 | t        | t     | f         | t          | f              | f
    4 |      |          |       |           | f          | t              | t
    5 |      |          |       |           | f          | t              | t
      |      |          |       |           | f          | t              | f
      |      |          |       |           | f          | t              | f
(7 rows)

As you can see, the boolean can be NULL for the IN() and equality cases. It cannot be NULL for the EXISTS() case. To be or not to be. The LEFT JOIN ... WHERE s.foo IS NULL form is (almost) equivalent to the NOT EXISTS case, except that it actually brings second.* into the query results (which is not needed in most cases).
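The near-equivalence of NOT EXISTS and the LEFT JOIN form can be sketched with the same demonstration data, including the NULL row in first. This again assumes Python's built-in sqlite3 module as a stand-in for PostgreSQL; the aliases and variable names are illustrative only.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "first"  (foo numeric);
    CREATE TABLE "second" (foo numeric);
    INSERT INTO "first"  VALUES (1), (2), (3), (4), (5), (NULL);
    INSERT INTO "second" VALUES (1), (3), (NULL);
""")

# NOT EXISTS is always true or false, never NULL.
not_exists_rows = [row[0] for row in conn.execute("""
    SELECT f.foo
    FROM "first" AS f
    WHERE NOT EXISTS (SELECT * FROM "second" AS s WHERE s.foo = f.foo)
""")]

# The anti-join with SELECT * also carries the (all-NULL) columns of "second".
left_join_cursor = conn.execute("""
    SELECT *
    FROM "first" AS f
    LEFT JOIN "second" AS s ON f.foo = s.foo
    WHERE s.foo IS NULL
""")
left_join_columns = [column[0] for column in left_join_cursor.description]
left_join_rows = left_join_cursor.fetchall()

print("NOT EXISTS:", not_exists_rows)
print("LEFT JOIN columns:", left_join_columns)
print("LEFT JOIN rows:   ", left_join_rows)
```

Both queries keep 2, 4, 5 and the NULL row of first; the join result simply has an extra column from second that is NULL in every returned row.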



Source: https://stackoverflow.com/questions/19578784/is-it-possible-that-left-join-fails-while-subquery-with-not-in-clause-suceeds
