Join condition in “ON” vs in “WHERE”

橙三吉。 Submitted on 2019-12-06 05:00:17

Well, what you call "equivalent" is not equivalent for outer joins. Take the left join as an example.

Condition in JOIN:

SELECT * FROM Customers c
LEFT JOIN CustomerAccounts ca ON ca.CustomerID = c.CustomerID AND c.State = 'NY'
LEFT JOIN Accounts a ON ca.AccountID = a.AccountID AND a.Status = 1

vs WHERE:

SELECT * FROM Customers c
LEFT JOIN CustomerAccounts ca ON ca.CustomerID = c.CustomerID
LEFT JOIN Accounts a ON ca.AccountID = a.AccountID
WHERE c.State = 'NY'
AND a.Status = 1

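Putting the conditions into the WHERE clause effectively turns the joins into INNER joins, because the WHERE clause is a row filter that is applied after the joins have been made: a row that the outer join padded with NULLs can never satisfy a.Status = 1, so it is discarded. Likewise, c.State = 'NY' in the WHERE clause removes every non-NY customer, whereas in the ON clause it only stops those customers from being matched.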
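A small data set makes the difference visible. The table definitions and rows below are hypothetical (the question did not supply any); they are a minimal sketch chosen only so that the two queries above return different results.

```sql
-- Hypothetical schema and rows, invented only to exercise the two queries above.
create table Customers        (CustomerID number primary key, State varchar2(2));
create table Accounts         (AccountID  number primary key, Status number);
create table CustomerAccounts (CustomerID number, AccountID number);

insert into Customers values (1, 'NY');  -- NY customer with two accounts
insert into Customers values (2, 'CA');  -- non-NY customer with one account
insert into Customers values (3, 'NY');  -- NY customer with no accounts
insert into Accounts values (10, 1);
insert into Accounts values (20, 0);
insert into CustomerAccounts values (1, 10);
insert into CustomerAccounts values (1, 20);
insert into CustomerAccounts values (2, 10);

-- Conditions in the ON clauses: every customer is kept.
select c.CustomerID, c.State, ca.AccountID, a.Status
from Customers c
left join CustomerAccounts ca on ca.CustomerID = c.CustomerID and c.State = 'NY'
left join Accounts a on ca.AccountID = a.AccountID and a.Status = 1;
-- Returns four rows (in no guaranteed order):
--   1  NY  10      1
--   1  NY  20      (null)
--   2  CA  (null)  (null)
--   3  NY  (null)  (null)

-- Conditions in the WHERE clause: NULL-padded rows are filtered out.
select c.CustomerID, c.State, ca.AccountID, a.Status
from Customers c
left join CustomerAccounts ca on ca.CustomerID = c.CustomerID
left join Accounts a on ca.AccountID = a.AccountID
where c.State = 'NY'
and a.Status = 1;
-- Returns one row:
--   1  NY  10      1
```

Customer 3 is the telling case: it is in NY, but it has no account row, so a.Status is NULL and the WHERE version drops it.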

For an inner join, Oracle will choose which conditions to use to join and which to filter based on the cost-based optimiser's analysis. You are likely to see the same execution plan from the first two queries. It won't necessarily join using the on clause and then filter using the where clause. (It rewrites the query to its internal format, the pre-ANSI syntax, under the hood anyway, which you can see if you trace the query, and there is no distinction between the two in that format.)

You can demonstrate that by looking at the explain plan. One interesting demonstration is if you have a foreign key relationship on two columns, and join the parent to the child with one of those related columns in the on and the other in the where.

create table parent (pid1 number, pid2 number,
  constraint parent_pk primary key (pid1, pid2));
create table child (cid number, pid1 number not null, pid2 number not null,
  constraint child_pk primary key (cid),
  constraint child_fk_parent foreign key (pid1, pid2)
    references parent (pid1, pid2));
create index child_fk_index on child (pid1, pid2);

set autotrace on explain
select *
from parent p
join child c on c.pid2 = p.pid2
where c.pid1 = p.pid1;

-----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                |     1 |    65 |     2   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                |                |       |       |            |          |
|   2 |   NESTED LOOPS               |                |     1 |    65 |     2   (0)| 00:00:01 |
|   3 |    TABLE ACCESS FULL         | PARENT         |     1 |    26 |     2   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN          | CHILD_FK_INDEX |     1 |       |     0   (0)| 00:00:01 |
|   5 |   TABLE ACCESS BY INDEX ROWID| CHILD          |     1 |    39 |     0   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("C"."PID1"="P"."PID1" AND "C"."PID2"="P"."PID2")

The plan shows both columns being used for access, and the index being used.
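If SQL*Plus autotrace is not available in your client, a similar plan can usually be obtained with EXPLAIN PLAN and DBMS_XPLAN.DISPLAY. This is a minimal sketch, assuming the parent and child tables above exist and that your session has access to a plan table; the plan you get depends on your version and statistics, so it may not match the one shown here.

```sql
-- Store the optimizer's plan for the statement without running it.
explain plan for
select *
from parent p
join child c on c.pid2 = p.pid2
where c.pid1 = p.pid1;

-- Format the most recently explained plan, including the predicate section.
select * from table(dbms_xplan.display);
```

The "Predicate Information" section is the part to look at: it shows whether both join columns ended up as access predicates regardless of where they were written.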

Oracle doesn't necessarily join in the order you expect - the order of the tables in the from doesn't restrict Oracle's decision on the best plan:

select *
from parent p
join child c on c.pid2 = p.pid2
where c.pid1 = p.pid1
and c.cid = 1;

------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |     1 |    65 |     1   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                |           |     1 |    65 |     1   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| CHILD     |     1 |    39 |     1   (0)| 00:00:01 |
|*  3 |    INDEX UNIQUE SCAN         | CHILD_PK  |     1 |       |     1   (0)| 00:00:01 |
|*  4 |   INDEX UNIQUE SCAN          | PARENT_PK |    82 |  2132 |     0   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("C"."CID"=1)
   4 - access("C"."PID1"="P"."PID1" AND "C"."PID2"="P"."PID2")

So for inner joins they are equivalent, but it can be useful to put the columns that define the relationship in the on clause (e.g. the columns in the keys/indexes you expect it to use) and anything that is just filtering in the where clause. Oracle might still not do what you expect, but it shows your intent and is somewhat self-documenting. For example:

select *
from child c
join parent p on p.pid1 = c.pid1 and p.pid2 = c.pid2
where c.cid = 1;

... which gets the same execution plan as the previous one, despite appearing quite different:

------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |     1 |    65 |     1   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                |           |     1 |    65 |     1   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| CHILD     |     1 |    39 |     1   (0)| 00:00:01 |
|*  3 |    INDEX UNIQUE SCAN         | CHILD_PK  |     1 |       |     1   (0)| 00:00:01 |
|*  4 |   INDEX UNIQUE SCAN          | PARENT_PK |    82 |  2132 |     0   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("C"."CID"=1)
   4 - access("P"."PID1"="C"."PID1" AND "P"."PID2"="C"."PID2")

From tracing that and looking in the trace file you can see it's transformed into:

Final query after transformations:******* UNPARSED QUERY IS *******
SELECT "C"."CID" "CID","C"."PID1" "PID1","C"."PID2" "PID2","P"."PID1" "PID1",
"P"."PID2" "PID2" FROM "STACKOVERFLOW"."CHILD" "C","STACKOVERFLOW"."PARENT" "P" 
WHERE "C"."CID"=1 AND "P"."PID1"="C"."PID1" AND "P"."PID2"="C"."PID2"

... so internally there is no distinction - all the conditions are in the where clause.
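The answer does not say how the trace was taken. The "Final query after transformations" line is what an optimizer (event 10053) trace prints, so the sketch below assumes that is the trace meant. The tracefile identifier and the comment in the query are arbitrary names chosen for illustration, and the session needs the privileges to set events and to read v$diag_info (11g or later).

```sql
-- Tag the trace file name so it is easy to find (the identifier is arbitrary).
alter session set tracefile_identifier = 'join_demo';

-- Switch on the optimizer trace for this session.
alter session set events '10053 trace name context forever, level 1';

-- The trace is only written on a hard parse, so the statement text should be
-- new to the shared pool; the comment is there to make it unique.
select /* join_demo */ *
from child c
join parent p on p.pid1 = c.pid1 and p.pid2 = c.pid2
where c.cid = 1;

-- Switch the trace off again.
alter session set events '10053 trace name context off';

-- Find the trace file for this session.
select value from v$diag_info where name = 'Default Trace File';
```

Searching the resulting file for "Final query after transformations" leads to the rewritten statement.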

Others have already covered why this doesn't apply for outer joins, but since I mentioned the old format, moving an outer-join condition to the where is roughly the same as omitting the (+) from that condition in the old syntax.
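In the old syntax that comparison looks roughly like this. It is a sketch written against the parent and child tables above, not output taken from a trace:

```sql
-- Both join conditions carry (+): every parent row is kept,
-- with NULLs in the child columns when there is no matching child.
select *
from parent p, child c
where c.pid1(+) = p.pid1
and c.pid2(+) = p.pid2;

-- (+) omitted from one condition: a NULL-padded child row can never satisfy
-- c.pid2 = p.pid2, so the result is the same as for an inner join.
select *
from parent p, child c
where c.pid1(+) = p.pid1
and c.pid2 = p.pid2;
```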

Compare the transformation of these queries; an outer join where both conditions are in the on clause:

select *
from parent p
left outer join child c on c.pid1 = p.pid1 and c.pid2 = p.pid2;

Final query after transformations:******* UNPARSED QUERY IS *******
SELECT "P"."PID1" "PID1","P"."PID2" "PID2","C"."CID" "CID","C"."PID1" "PID1",
"C"."PID2" "PID2" FROM "STACKOVERFLOW"."PARENT" "P","STACKOVERFLOW"."CHILD" "C"
WHERE "C"."PID2"(+)="P"."PID2" AND "C"."PID1"(+)="P"."PID1"

... and the 'same' query where one of the conditions has been moved to the where clause:

select *
from parent p
left outer join child c on c.pid1 = p.pid1
where c.pid2 = p.pid2;

Final query after transformations:******* UNPARSED QUERY IS *******
SELECT "P"."PID1" "PID1","P"."PID2" "PID2","C"."CID" "CID","C"."PID1" "PID1",
"C"."PID2" "PID2" FROM "STACKOVERFLOW"."PARENT" "P","STACKOVERFLOW"."CHILD" "C"
WHERE "C"."PID2"="P"."PID2" AND "C"."PID1"="P"."PID1"

Notice that the first query has both conditions marked with (+), while the second has neither. The details in the trace show the optimiser's decisions about (outer) join elimination:

OJE: Begin: find best directive for query block SEL$58A6D7F6 (#0)
OJE: Considering outer-join elimination on query block SEL$58A6D7F6 (#0)
OJE: considering predicate"C"."PID1"(+)="P"."PID1"

rejected
OJE:   outer-join not eliminated
OJE: End: finding best directive for query block SEL$58A6D7F6 (#0)
...
OJE: Begin: find best directive for query block SEL$9E43CB6E (#0)
OJE: Considering outer-join elimination on query block SEL$9E43CB6E (#0)
OJE: considering predicate"C"."PID2"="P"."PID2"

OJE:      Converting outer join of CHILD and PARENT to inner-join.
considered
OJE: considering predicate"C"."PID1"="P"."PID1"

rejected
Registered qb: SEL$AE545566 0x2d07c338 (OUTER-JOIN REMOVED FROM QUERY BLOCK
SEL$9E43CB6E; SEL$9E43CB6E; "C"@"SEL$1")

The outer-join query has become the same as this inner-join:

select *
from parent p
inner join child c on c.pid1 = p.pid1
where c.pid2 = p.pid2;

Final query after transformations:******* UNPARSED QUERY IS *******
SELECT "P"."PID1" "PID1","P"."PID2" "PID2","C"."CID" "CID","C"."PID1" "PID1",
"C"."PID2" "PID2" FROM "STACKOVERFLOW"."PARENT" "P","STACKOVERFLOW"."CHILD" "C"
WHERE "C"."PID2"="P"."PID2" AND "C"."PID1"="P"."PID1"
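The effect on the result set is easy to check with a parent row that has no child. The rows below are illustrative only and assume the parent and child tables are otherwise empty:

```sql
-- Illustrative rows; they are not part of the original test case.
insert into parent values (1, 1);
insert into parent values (2, 2);       -- parent with no child
insert into child values (100, 1, 1);   -- child of parent (1, 1)

-- Both conditions in the on clause: two rows are returned,
-- and parent (2, 2) appears with NULLs in the child columns.
select *
from parent p
left outer join child c on c.pid1 = p.pid1 and c.pid2 = p.pid2;

-- One condition moved to the where clause: one row is returned,
-- because the NULL-padded row for parent (2, 2) fails c.pid2 = p.pid2.
select *
from parent p
left outer join child c on c.pid1 = p.pid1
where c.pid2 = p.pid2;
```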

Conditions on the preserved table (the first table in a left join, the second in a right join) normally belong in the where clause, where they filter the result. Be aware that moving such a condition into the on clause changes the result: the rows that fail it are still returned, just without a match from the other table. For an inner join, every condition can be placed in either the on clause or the where clause.

Conditions on the optional, NULL-supplying table (the second table in a left join, the first in a right join) have to be put in the on clause. If you put such a condition in the where clause, you are effectively turning the outer join into an inner join, because the NULL-padded rows fail it.

Your examples for left join are not equivalent. In the second one you have a condition on the optional table in the where clause (a.Status = 1), so that will work as an inner join.

Your examples for right join are not equivalent. In the second one you have a condition on the optional table in the where clause (c.State = 'NY'), so that will work as an inner join.
