Understanding the number of matched rows in LEFT JOIN

北战南征 提交于 2019-12-01 14:50:22

LEFT JOIN means to grab all the rows from the left table, and only if there is a match from the right table, to return those. If there is no match, NULL is returned.

Let's look at what the dataset looks like when joined to itself with no condition. (Note, the asterisks and pluses are referred to below.)

 +-------+------------+-------+------------+
 | t1.id | t1.related | t2.id | t2.related |
 +-------+------------+-------+------------+
 |     1 | NULL       |     1 | NULL       |
+|     1 | NULL       |     2 | 1          |
+|     1 | NULL       |     4 | 1          |
*|     2 | 1          |     1 | NULL       |
 |     2 | 1          |     2 | 1          |
 |     2 | 1          |     4 | 1          |
*|     4 | 1          |     1 | NULL       |
 |     4 | 1          |     2 | 1          |
 |     4 | 1          |     4 | 1          |
 +-------+------------+-------+------------+

The left table is t1 in both cases.

In the first query, we look for matches of t1.related = t2.id (the middle two columns). That corresponds to the rows marked with an asterisk. However, there was no match for t.id = 1, so we need to include this, but with NULL because that's what LEFT JOIN means (no match still returns NULL).

+-------+-------+
| t1.id | t2.id |
+-------+-------+
|     1 | NULL  | added because LEFT JOIN
|     2 |    1  | marked with * in table above
|     4 |    1  | marked with * in table above
+-------+-------+

In the second query, we look for matches of t1.id = t2.related (the outer two columns). That corresponds to the rows marked with a plus. However, there was no match for t.id = 2 and t.id = 4, so we need to include these, but with NULL because that's what LEFT JOIN means (no match still returns NULL).

+-------+-------+
| t1.id | t2.id |
+-------+-------+
|     1 |    2  | marked with + in table above
|     1 |    4  | marked with + in table above
|     2 | NULL  | added because LEFT JOIN
|     4 | NULL  | added because LEFT JOIN
+-------+-------+

First query: t1.related = t2.id

t1              joined t2
id   related  | id   related
--------------+-------------
1    NULL     | --   --
2    1        | 1    NULL
3    1        | 1    NULL

An inner join would result in only two rows, but the outer join also preserves the first row that has no match.

Second query: t1.id = t2.related

t1              joined t2
id   related  | id   related
--------------+-------------
1    NULL     | 2    1
1    NULL     | 3    1
2    1        | --   --
3    1        | --   --

Here too, an inner join would result in only two rows, but the outer join also preserves the two rows that have no match.

In first case, we need to match values 1, 2, 3 with NULL, 1 and 1. Since it is left join, NULL will stay with no match and 1's will be matched with 1 from other table, thus 3 records.

In the second case, we have values 1, 2, 3. 2 and 3 have no match and will result in two rows, but 1 has 2 matches and will result in 2 additional rows, which is 4 rows.

Generally, having:

... LeftTable [LT] left join RightTable [RT] on [LT].[joinCol] = [RT].pjoinCol] ...

will work like this:

take all values from LT.joinCol, try to match with values in RT.joinCol. If some value has n matches in RT.joinCol, then it will result in n rows. If the row has no match, it will still result in one, un-matched record.

In your 1st case, 2 values have 1 match => 1 + 1 = 2 records. One value has no match => 1 record, 2 + 1 = 3.

In your 2nd case, 2 values have no match => thus 2 records, one value has 2 matches => 2 records, 2 + 2 = 4 :)

LEFT JOIN return all tuple from left table even if no matches in right table plus the matched value of right table.

    SELECT t1.id as t1_id, t1.related as t1_r, t2.id as t2_id, t2.related as t2_r
    FROM mytable as t1 
    LEFT JOIN mytable as t2 ON t1.related = t2.id;

returns

 t1_id  t1_r    t2_id   t2_r
 ----------------------------
   1    null    null    null
   2     1        1     null
   4     1        1     null

(1, null) tuple of t1 matches no tuple from t2, (2, 1) tuple of t1 matches one tuple of t2 (1, null), so as (4, 1) hence 3 rows in result

Where as

   SELECT t1.id as t1_id, t1.related as t1_r, t2.id as t2_id, t2.related as t2_r
   FROM test1 as t1 
   LEFT JOIN test1 as t2 ON t1.id  = t2.related ;

returns

   t1_id    t1_r    t2_id   t2_r
   -----------------------------
       1    null       2    1
       1    null       4    1
       2      1     null    null
       4      1     null    null

here (1, null) of t1 matches two tuple of t2 (2, 1) and (4, 1) and (2, 1) and (4, 1) matches no tuple hence 4 rows

Ankit Mongia

The best way to view self join is create two tables and then view the joining conditions.

Table t1

    Id   Related  
    1       null  
    2       1  
    4       1  

Table t2

    Id   Related  
    1    null  
    2    1  
    4    1  

Note: Left Join means every thing from left table will come even if joining condition does not match. From the right table it will come as null.

First Query: t1.related = t2.id; (Columns selected "t1.id, t2.id")

1.) Lets take first row from t1 table and related column has null value. null has not match in id column of t2 table. As it is a left join, row will come from t1 table.

First Row:

   t1_id t2_id  
    1     null  

2.) Lets take second row from t1 table and related column has 1. 1 has one match in id column of t2 table. So one row comes in join condition.

Second Row:

    t1_id t2_id  
    2     1   

3.) Lets take third row from t1 table and related column has 1. 1 has one match in id column of t2 table. So one row comes in join condition.

Third Row:

    t1_id t2_id  
     4    1  

Second Query t1.id = t2.related (Columns selected "t1.id, t2.id")

1.) Lets take first row from t1 table and id column has 1. 1 has 2 rows in related column of t2 table. so two rows are selected.

   t1.id   t2.id  
    1        2  
    1        4

2.) Lets take second row from t1 table and id column has 2. 2 has 0 row in related column of t2 table. But it a left join row will come fro t1 table.

    t1.id   t2.id  
     1       2  
     1       4  
     2      null  

2.) Lets take third row from t1 table and id column has 4. 4 has 0 row in related column of t2 table. But it a left join row will come fro t1 table.

    t1.id   t2.id  
     1        2  
     1        4  
     2       null   
     4       null

Hope this will make you understand.

Thanks Ankit.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!