Join condition in "ON" vs. in "WHERE"

 SELECT *
 FROM Customers c
 INNER JOIN CustomerAccounts ca
 ON ca.CustomerID = c.CustomerID
 AND c.State = 'NY'
 INNER JOIN Accounts a
 ON ca.AccountID = a.AccountID
 AND a.Status = 1

Equivalent:

 SELECT *
 FROM Customers c
 INNER JOIN CustomerAccounts ca
 ON ca.CustomerID = c.CustomerID
 INNER JOIN Accounts a
 ON ca.AccountID = a.AccountID
 WHERE c.State = 'NY'
 AND a.Status = 1

Left Join:

 SELECT *
 FROM Customers c
 LEFT JOIN CustomerAccounts ca
 ON ca.CustomerID = c.CustomerID
 AND c.State = 'NY'
 LEFT JOIN Accounts a
 ON ca.AccountID = a.AccountID
 AND a.Status = 1

Equivalent:

 SELECT *
 FROM Customers c
 LEFT JOIN CustomerAccounts ca
 ON ca.CustomerID = c.CustomerID
 LEFT JOIN Accounts a
 ON ca.AccountID = a.AccountID
 WHERE c.State = 'NY'
 AND a.Status = 1

Right Join:

 SELECT *
 FROM Customers c
 RIGHT JOIN CustomerAccounts ca
 ON ca.CustomerID = c.CustomerID
 AND c.State = 'NY'
 RIGHT JOIN Accounts a
 ON ca.AccountID = a.AccountID
 AND a.Status = 1

Equivalent:

 SELECT *
 FROM Customers c
 RIGHT JOIN CustomerAccounts ca
 ON ca.CustomerID = c.CustomerID
 RIGHT JOIN Accounts a
 ON ca.AccountID = a.AccountID
 WHERE c.State = 'NY'
 AND a.Status = 1

What difference does it make whether we specify a condition in the "WHERE" clause or in the "ON" join condition?

Do we get the same results for inner, left outer, and right outer joins when the conditions are specified in the "ON" clause versus the "WHERE" clause? Please advise.

Well, what you call "equivalent" is not equivalent for outer joins. Take the left join, for example.

Condition in JOIN:

SELECT * FROM Customers c
LEFT JOIN CustomerAccounts ca ON ca.CustomerID = c.CustomerID AND c.State = 'NY'
LEFT JOIN Accounts a ON ca.AccountID = a.AccountID AND a.Status = 1

vs WHERE:

SELECT * FROM Customers c
LEFT JOIN CustomerAccounts ca ON ca.CustomerID = c.CustomerID
LEFT JOIN Accounts a ON ca.AccountID = a.AccountID
WHERE c.State = 'NY'
AND a.Status = 1

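Putting a condition on an outer-joined table (here a.Status = 1) into the WHERE clause effectively turns the outer joins into INNER joins, because the WHERE clause is a row filter that is applied after the joins have been made: the NULL-extended rows produced for customers without a matching account have a NULL a.Status, which fails the comparison, so those rows are removed. The c.State = 'NY' condition behaves differently too. In the ON clause it only decides which customers get matched rows (non-NY customers are still returned, with NULLs for the joined columns), whereas in the WHERE clause it removes non-NY customers entirely.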
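A minimal sketch of that difference, using two hypothetical tables (demo_cust and demo_acct are invented here and are much simpler than the schema in the question):

```sql
-- Hypothetical demo tables and rows, for illustration only
create table demo_cust (cust_id number primary key, state varchar2(2));
create table demo_acct (acct_id number primary key, cust_id number, status number);

insert into demo_cust values (1, 'NY');
insert into demo_cust values (2, 'NY');
insert into demo_cust values (3, 'NY');
insert into demo_acct values (10, 1, 1);  -- customer 1: active account
insert into demo_acct values (11, 2, 0);  -- customer 2: inactive account only
                                          -- customer 3: no account at all

-- Filter in ON: every customer is kept; customers 2 and 3 get a NULL acct_id
select c.cust_id, a.acct_id
from demo_cust c
left join demo_acct a on a.cust_id = c.cust_id and a.status = 1
order by c.cust_id;
-- rows: (1, 10), (2, NULL), (3, NULL)

-- Filter in WHERE: applied after the join, so customer 2 (status 0) and
-- customer 3 (NULL status) are both removed
select c.cust_id, a.acct_id
from demo_cust c
left join demo_acct a on a.cust_id = c.cust_id
where a.status = 1
order by c.cust_id;
-- rows: (1, 10)
```

The second query returns exactly what an inner join would return on the same data.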

For an inner join, Oracle will choose which conditions to use to join and which to filter based on the cost-based optimiser's analysis. You are likely to see the same execution plan from the first two queries. It won't necessarily join using the on clause and then filter using the where clauses. (It rewrites it to its internal format, the pre-ANSI version, under the hood anyway - which you can see if you trace the query - and there is no distinction in that format).

You can demonstrate that by looking at the explain plan. One interesting demonstration is if you have a foreign key relationship on two columns, and join the parent to the child with one of those related columns in the on and the other in the where.

create table parent (pid1 number, pid2 number,
  constraint parent_pk primary key (pid1, pid2));
create table child (cid number, pid1 number not null, pid2 number not null,
  constraint child_pk primary key (cid),
  constraint child_fk_parent foreign key (pid1, pid2)
    references parent (pid1, pid2));
create index child_fk_index on child (pid1, pid2);

set autotrace on explain
select *
from parent p
join child c on c.pid2 = p.pid2
where c.pid1 = p.pid1;

-----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                |     1 |    65 |     2   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                |                |       |       |            |          |
|   2 |   NESTED LOOPS               |                |     1 |    65 |     2   (0)| 00:00:01 |
|   3 |    TABLE ACCESS FULL         | PARENT         |     1 |    26 |     2   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN          | CHILD_FK_INDEX |     1 |       |     0   (0)| 00:00:01 |
|   5 |   TABLE ACCESS BY INDEX ROWID| CHILD          |     1 |    39 |     0   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("C"."PID1"="P"."PID1" AND "C"."PID2"="P"."PID2")

The plan shows both columns being used for access, and the index being used.
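set autotrace is a SQL*Plus setting. In a client that does not support it, a similar plan can usually be obtained with EXPLAIN PLAN and DBMS_XPLAN. A short sketch, assuming the parent and child tables created above (the plan you see may differ with other data or statistics):

```sql
-- Store the optimizer's plan for the statement in the plan table
explain plan for
select *
from parent p
join child c on c.pid2 = p.pid2
where c.pid1 = p.pid1;

-- Show the most recently explained plan, including the predicate section
select * from table(dbms_xplan.display);
```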

Oracle doesn't necessarily join in the order you expect - the order of the tables in the from doesn't restrict Oracle's decision on the best plan:

select *
from parent p
join child c on c.pid2 = p.pid2
where c.pid1 = p.pid1
and c.cid = 1;

------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |     1 |    65 |     1   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                |           |     1 |    65 |     1   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| CHILD     |     1 |    39 |     1   (0)| 00:00:01 |
|*  3 |    INDEX UNIQUE SCAN         | CHILD_PK  |     1 |       |     1   (0)| 00:00:01 |
|*  4 |   INDEX UNIQUE SCAN          | PARENT_PK |    82 |  2132 |     0   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("C"."CID"=1)
   4 - access("C"."PID1"="P"."PID1" AND "C"."PID2"="P"."PID2")

So for inner joins they are equivalent, but it can be useful to separate out the columns that define relationships in the on clauses, e.g. the columns in the keys/indexes you expect it to use; and anything that is just filtering in the where. Oracle might still not do what you expect, but it shows your intent and is somewhat self-documenting.

select *
from child c
join parent p on p.pid1 = c.pid1 and p.pid2 = c.pid2
where c.cid = 1;

... which gets the same execution plan as the previous one, despite appearing quite different:

------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |     1 |    65 |     1   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                |           |     1 |    65 |     1   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| CHILD     |     1 |    39 |     1   (0)| 00:00:01 |
|*  3 |    INDEX UNIQUE SCAN         | CHILD_PK  |     1 |       |     1   (0)| 00:00:01 |
|*  4 |   INDEX UNIQUE SCAN          | PARENT_PK |    82 |  2132 |     0   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("C"."CID"=1)
   4 - access("P"."PID1"="C"."PID1" AND "P"."PID2"="C"."PID2")

From tracing that and looking in the trace file you can see it's transformed into:

Final query after transformations:******* UNPARSED QUERY IS *******
SELECT "C"."CID" "CID","C"."PID1" "PID1","C"."PID2" "PID2","P"."PID1" "PID1",
"P"."PID2" "PID2" FROM "STACKOVERFLOW"."CHILD" "C","STACKOVERFLOW"."PARENT" "P" 
WHERE "C"."CID"=1 AND "P"."PID1"="C"."PID1" AND "P"."PID2"="C"."PID2"

... so internally there is no distinction - all the conditions are in the where clause.
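The answer does not say how the trace was produced. One common approach, assumed here, is the optimizer trace event 10053, which writes a "Final query after transformations" section to the session's trace file. A sketch, with a hypothetical trace file identifier:

```sql
-- Hypothetical tag, used only to make the trace file easier to find
alter session set tracefile_identifier = 'join_demo';

-- Switch on optimizer tracing for this session
alter session set events '10053 trace name context forever, level 1';

-- The statement must be hard parsed for the optimizer to write its trace;
-- a statement already in the cursor cache produces nothing new
select *
from child c
join parent p on p.pid1 = c.pid1 and p.pid2 = c.pid2
where c.cid = 1;

-- Switch tracing off again
alter session set events '10053 trace name context off';
```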

Others have already covered why this doesn't apply for outer joins, but since I mentioned the old format, moving an outer-join condition to the where is roughly the same as omitting the (+) from that condition in the old syntax.
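As a rough illustration of that point in the old syntax, assuming the parent and child tables defined earlier:

```sql
-- Outer join preserved: every predicate that references child carries (+),
-- so parents without a matching child are still returned
select *
from parent p, child c
where c.pid1(+) = p.pid1
and c.pid2(+) = p.pid2;

-- (+) omitted from one predicate: a NULL-extended child row cannot satisfy
-- c.pid2 = p.pid2, so only matching pairs survive, as with an inner join
select *
from parent p, child c
where c.pid1(+) = p.pid1
and c.pid2 = p.pid2;
```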

Compare the transformation of these queries; an outer join where both conditions are in the on clause:

select *
from parent p
left outer join child c on c.pid1 = p.pid1 and c.pid2 = p.pid2;

Final query after transformations:******* UNPARSED QUERY IS *******
SELECT "P"."PID1" "PID1","P"."PID2" "PID2","C"."CID" "CID","C"."PID1" "PID1",
"C"."PID2" "PID2" FROM "STACKOVERFLOW"."PARENT" "P","STACKOVERFLOW"."CHILD" "C"
WHERE "C"."PID2"(+)="P"."PID2" AND "C"."PID1"(+)="P"."PID1"

... and the 'same' query where one of the conditions has been moved to the where clause:

select *
from parent p
left outer join child c on c.pid1 = p.pid1
where c.pid2 = p.pid2;

Final query after transformations:******* UNPARSED QUERY IS *******
SELECT "P"."PID1" "PID1","P"."PID2" "PID2","C"."CID" "CID","C"."PID1" "PID1",
"C"."PID2" "PID2" FROM "STACKOVERFLOW"."PARENT" "P","STACKOVERFLOW"."CHILD" "C"
WHERE "C"."PID2"="P"."PID2" AND "C"."PID1"="P"."PID1"

Notice that the first query has both conditions marked with (+), while the second has neither. The details in the trace show the optimizer's decisions about outer-join elimination (OJE):

OJE: Begin: find best directive for query block SEL$58A6D7F6 (#0)
OJE: Considering outer-join elimination on query block SEL$58A6D7F6 (#0)
OJE: considering predicate"C"."PID1"(+)="P"."PID1"

rejected
OJE:   outer-join not eliminated
OJE: End: finding best directive for query block SEL$58A6D7F6 (#0)
...
OJE: Begin: find best directive for query block SEL$9E43CB6E (#0)
OJE: Considering outer-join elimination on query block SEL$9E43CB6E (#0)
OJE: considering predicate"C"."PID2"="P"."PID2"

OJE:      Converting outer join of CHILD and PARENT to inner-join.
considered
OJE: considering predicate"C"."PID1"="P"."PID1"

rejected
Registered qb: SEL$AE545566 0x2d07c338 (OUTER-JOIN REMOVED FROM QUERY BLOCK
SEL$9E43CB6E; SEL$9E43CB6E; "C"@"SEL$1")

The outer-join query has become the same as this inner-join:

select *
from parent p
inner join child c on c.pid1 = p.pid1
where c.pid2 = p.pid2;

Final query after transformations:******* UNPARSED QUERY IS *******
SELECT "P"."PID1" "PID1","P"."PID2" "PID2","C"."CID" "CID","C"."PID1" "PID1",
"C"."PID2" "PID2" FROM "STACKOVERFLOW"."PARENT" "P","STACKOVERFLOW"."CHILD" "C"
WHERE "C"."PID2"="P"."PID2" AND "C"."PID1"="P"."PID1"

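Any condition in an inner join can be placed either in the on clause or in the where clause; the result is the same.

For an outer join the placement matters. A condition on the optional (null-supplied) table, which is the second table in a left join and the first in a right join, has to be put in the on clause. If you put it in the where clause, the NULL-extended rows fail an ordinary comparison such as a.Status = 1, and you are effectively turning the outer join into an inner join. A condition on the preserved table (first in a left join, second in a right join) filters rows only when it is in the where clause; in the on clause it merely controls which preserved rows get a match, and the unmatched ones are still returned with NULLs.

Your examples for left join are not equivalent. In the second one you have a condition for the right side table in the where clause (a.Status = 1), so that will work as an inner join. The c.State = 'NY' condition differs as well: in the where clause it removes non-NY customers, while in the on clause they are still returned, just without account data.

Your examples for right join are not equivalent. In the second one you have a condition for the left side table in the where clause (c.State = 'NY'), so that will work as an inner join.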
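If the intent behind the left-join example was "every NY customer, together with their active accounts where they exist" (an assumption, since the question does not state the intent), one way to write it is to filter the preserved table in the where clause and keep the condition on the optional table in the on clause. This sketch uses the tables from the question:

```sql
select *
from Customers c
left join CustomerAccounts ca
  on ca.CustomerID = c.CustomerID
left join Accounts a
  on a.AccountID = ca.AccountID
  and a.Status = 1          -- optional table: keep in ON to stay an outer join
where c.State = 'NY';       -- preserved table: WHERE actually removes rows
```

A customer linked only to inactive accounts still appears here, once per link, with NULLs in the Accounts columns.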

Source: https://stackoverflow.com/questions/22044616/join-condition-on-vs-in-where

Tags: sql, join, oracle11g