SQL Optimization and Subqueries: Which Is Faster, IN or EXISTS?

Anonymous (unverified), submitted 2019-12-02 23:43:01

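Anyone with some grasp of SQL tuning knows that its core is reducing the number of physical I/Os. Put plainly, we want to scan tables, large tables above all, as few times as possible.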

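By subqueries I mean statements that contain IN, NOT IN, EXISTS, or NOT EXISTS. People often ask how to choose between IN and EXISTS. Some say EXISTS always performs better; others say to choose based on the sizes of the inner and outer tables. In my view these rules are one-sided. To know which is better, look at the execution plan and the execution time.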

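When a statement contains IN, NOT IN, EXISTS, or NOT EXISTS, the optimizer tries to rewrite it. Why? Because these constructs can otherwise lead to an operation called FILTER. It should look familiar: I covered it in an earlier article, so look back through my previous posts if you have not met it.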

Let's look at an example:

SQL> explain plan for
select EMPNO,ENAME
from emp
where exists
(select DEPTNO from dept
 where
 emp.DEPTNO = dept.DEPTNO
 and dept.loc = 'BOSTON'
 union all
 select DEPTNO from dept
 where
 emp.DEPTNO = dept.DEPTNO
 and dept.loc = 'DALLAS'
);

Explained.

SQL> set linesize 400
SQL> SELECT * FROM TABLE(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------
Plan hash value: 1230022885

-----------------------------------------------------------------------------------------
| Id  | Operation                     | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |         |     5 |    65 |     9   (0)| 00:00:01 |
|*  1 |  FILTER                       |         |       |       |            |          |
|   2 |   TABLE ACCESS FULL           | EMP     |    14 |   182 |     3   (0)| 00:00:01 |
|   3 |   UNION-ALL                   |         |       |       |            |          |
|*  4 |    TABLE ACCESS BY INDEX ROWID| DEPT    |     1 |    11 |     1   (0)| 00:00:01 |
|*  5 |     INDEX UNIQUE SCAN         | PK_DEPT |     1 |       |     0   (0)| 00:00:01 |
|*  6 |    TABLE ACCESS BY INDEX ROWID| DEPT    |     1 |    11 |     1   (0)| 00:00:01 |
|*  7 |     INDEX UNIQUE SCAN         | PK_DEPT |     1 |       |     0   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter( EXISTS ( (SELECT "DEPTNO" FROM "DEPT" "DEPT" WHERE
              "DEPT"."DEPTNO"=:B1 AND "DEPT"."LOC"='BOSTON') UNION ALL (SELECT "DEPTNO" FROM
              "DEPT" "DEPT" WHERE "DEPT"."DEPTNO"=:B2 AND "DEPT"."LOC"='DALLAS')))
   4 - filter("DEPT"."LOC"='BOSTON')
   5 - access("DEPT"."DEPTNO"=:B1)
   6 - filter("DEPT"."LOC"='DALLAS')
   7 - access("DEPT"."DEPTNO"=:B1)

SQL> explain plan for
select EMPNO,ENAME
from emp
where DEPTNO in
(select DEPTNO from dept
 where
 dept.loc = 'BOSTON'
 union all
 select DEPTNO from dept
 where
 dept.loc = 'DALLAS'
);

Explained.

SQL> set linesize 400
SQL> SELECT * FROM TABLE(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------
Plan hash value: 1230022885

-----------------------------------------------------------------------------------------
| Id  | Operation                     | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |         |     9 |   117 |     9   (0)| 00:00:01 |
|*  1 |  FILTER                       |         |       |       |            |          |
|   2 |   TABLE ACCESS FULL           | EMP     |    14 |   182 |     3   (0)| 00:00:01 |
|   3 |   UNION-ALL                   |         |       |       |            |          |
|*  4 |    TABLE ACCESS BY INDEX ROWID| DEPT    |     1 |    11 |     1   (0)| 00:00:01 |
|*  5 |     INDEX UNIQUE SCAN         | PK_DEPT |     1 |       |     0   (0)| 00:00:01 |
|*  6 |    TABLE ACCESS BY INDEX ROWID| DEPT    |     1 |    11 |     1   (0)| 00:00:01 |
|*  7 |     INDEX UNIQUE SCAN         | PK_DEPT |     1 |       |     0   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter( EXISTS ( (SELECT "DEPTNO" FROM "DEPT" "DEPT" WHERE "DEPTNO"=:B1
              AND "DEPT"."LOC"='BOSTON') UNION ALL (SELECT "DEPTNO" FROM "DEPT" "DEPT" WHERE
              "DEPTNO"=:B2 AND "DEPT"."LOC"='DALLAS')))
   4 - filter("DEPT"."LOC"='BOSTON')
   5 - access("DEPTNO"=:B1)
   6 - filter("DEPT"."LOC"='DALLAS')
   7 - access("DEPTNO"=:B1)

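The exists/in (xxx union all yyyy) pattern should look familiar. Plenty of people have written SQL like this, and I have seen it more than once, but I suspect few of them ever looked at the execution plan. With small data volumes nothing matters and even bad SQL gets by. But we have to think about the future: once the data grows, the statement grinds to a halt.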

Now let's analyze the execution plan. See the FILTER at Id=1 and its :B1 bind? Watch it, watch it, watch it. Important things get said three times.

   1 - filter( EXISTS ( (SELECT "DEPTNO" FROM "DEPT" "DEPT" WHERE
              "DEPT"."DEPTNO"=:B1 AND "DEPT"."LOC"='BOSTON') UNION ALL (SELECT "DEPTNO" FROM
              "DEPT" "DEPT" WHERE "DEPT"."DEPTNO"=:B2 AND "DEPT"."LOC"='DALLAS')))

What does this predicate tell us?

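It tells us that emp passes each value of the join key deptno down to the dept subquery below it, and the subquery is executed for that value. Suppose deptno in emp is unique, with no repeats, and there are 10 million rows. How many times does the subquery run, and how many times is dept scanned? 10 million times. If dept is 10 GB and has to be scanned in full each time, that is 10 million scans of 10 GB. Will your SQL ever finish?

Don't bother waiting. Cancel it.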
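The cost argument can be made concrete with a toy model. The Python sketch below is not Oracle code. It only counts how often the inner query would run under a FILTER-style evaluation (once per distinct outer key, assuming results for repeated keys are cached) and under a hash-semi-join-style evaluation (inner side read once). The function names and the sample data are hypothetical.

```python
def filter_style(outer_keys, inner_rows, predicate):
    """Evaluate EXISTS per outer row; run the inner lookup once per distinct key."""
    cache = {}       # key -> bool; assumption: results for repeated keys are cached
    inner_runs = 0   # how many times the inner query is executed
    result = []
    for key in outer_keys:
        if key not in cache:
            inner_runs += 1
            cache[key] = any(k == key and predicate(loc) for k, loc in inner_rows)
        if cache[key]:
            result.append(key)
    return result, inner_runs


def hash_semi_style(outer_keys, inner_rows, predicate):
    """Read the inner side once into a hash set, then probe it per outer row."""
    matching = {k for k, loc in inner_rows if predicate(loc)}  # single pass
    inner_runs = 1
    result = [key for key in outer_keys if key in matching]
    return result, inner_runs


# Hypothetical data: dept as (deptno, loc); emp reduced to its deptno column.
dept = [(10, "NEW YORK"), (20, "DALLAS"), (30, "CHICAGO"), (40, "BOSTON")]
emp_deptnos = [20, 30, 30, 20, 30, 30, 10, 20, 10, 30, 20, 30, 20, 10]


def wanted(loc):
    return loc in ("BOSTON", "DALLAS")


rows_f, runs_f = filter_style(emp_deptnos, dept, wanted)
rows_h, runs_h = hash_semi_style(emp_deptnos, dept, wanted)
print(runs_f, runs_h, rows_f == rows_h)
```

With only three distinct keys the FILTER-style loop runs the inner lookup three times. If every outer key were unique, `inner_runs` would equal the number of outer rows, which is the 10-million-scan case described above.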

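I recommend rewriting this pattern. The first approach is to split up the union all and write the statement as two parts. I will leave the details as an exercise: try it yourself and see what effect it has.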
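One possible reading of "split it into two parts" is sketched below, together with a second variant that merges the two branches into a single IN subquery. Both rewrites are my assumptions, not the author's exact statements. The script uses Python's built-in sqlite3 only to check that the rewrites return the same rows on a small hypothetical subset of emp/dept. It says nothing about Oracle plans, which still have to be checked with explain plan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (deptno INTEGER PRIMARY KEY, dname TEXT, loc TEXT);
    CREATE TABLE emp  (empno INTEGER PRIMARY KEY, ename TEXT, deptno INTEGER);
    INSERT INTO dept VALUES (10,'ACCOUNTING','NEW YORK'), (20,'RESEARCH','DALLAS'),
                            (30,'SALES','CHICAGO'), (40,'OPERATIONS','BOSTON');
    INSERT INTO emp VALUES (7369,'SMITH',20), (7499,'ALLEN',30), (7566,'JONES',20),
                           (7782,'CLARK',10), (7788,'SCOTT',20);
""")

queries = {
    # The pattern from the article: correlated EXISTS over a UNION ALL.
    "original": """
        SELECT empno, ename FROM emp
        WHERE EXISTS (SELECT dept.deptno FROM dept
                      WHERE emp.deptno = dept.deptno AND dept.loc = 'BOSTON'
                      UNION ALL
                      SELECT dept.deptno FROM dept
                      WHERE emp.deptno = dept.deptno AND dept.loc = 'DALLAS')
    """,
    # Assumed rewrite 1: one statement per branch, combined with UNION.
    "split": """
        SELECT empno, ename FROM emp
        WHERE EXISTS (SELECT 1 FROM dept
                      WHERE emp.deptno = dept.deptno AND dept.loc = 'BOSTON')
        UNION
        SELECT empno, ename FROM emp
        WHERE EXISTS (SELECT 1 FROM dept
                      WHERE emp.deptno = dept.deptno AND dept.loc = 'DALLAS')
    """,
    # Assumed rewrite 2: a single uncorrelated IN subquery.
    "merged": """
        SELECT empno, ename FROM emp
        WHERE deptno IN (SELECT deptno FROM dept
                         WHERE loc IN ('BOSTON', 'DALLAS'))
    """,
}

# Sort each result so the comparison does not depend on row order.
results = {name: sorted(conn.execute(sql).fetchall()) for name, sql in queries.items()}
for name, rows in results.items():
    print(name, rows)
```

The UNION in the split form removes duplicates, which is safe here because empno is the primary key. On a table without such a key the split form could collapse rows that the original would return separately.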

Now let's try IN. Don't many people say IN performs badly? Let's test it and see.

SQL> explain plan for
select EMPNO,ENAME
from emp
where exists
(select DEPTNO from dept
 where
 emp.DEPTNO = dept.DEPTNO
 and dept.loc = 'BOSTON'
 and rownum<=1
);

Explained.

SQL> set linesize 400
SQL> SELECT * FROM TABLE(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------
Plan hash value: 3414630506

-----------------------------------------------------------------------------------------
| Id  | Operation                     | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |         |     5 |    65 |     6   (0)| 00:00:01 |
|*  1 |  FILTER                       |         |       |       |            |          |
|   2 |   TABLE ACCESS FULL           | EMP     |    14 |   182 |     3   (0)| 00:00:01 |
|*  3 |   COUNT STOPKEY               |         |       |       |            |          |
|*  4 |    TABLE ACCESS BY INDEX ROWID| DEPT    |     1 |    11 |     1   (0)| 00:00:01 |
|*  5 |     INDEX UNIQUE SCAN         | PK_DEPT |     1 |       |     0   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter( EXISTS (SELECT 0 FROM "DEPT" "DEPT" WHERE ROWNUM<=1 AND
              "DEPT"."DEPTNO"=:B1 AND "DEPT"."LOC"='BOSTON'))
   3 - filter(ROWNUM<=1)
   4 - filter("DEPT"."LOC"='BOSTON')
   5 - access("DEPT"."DEPTNO"=:B1)

21 rows selected.

SQL> explain plan for
select EMPNO,ENAME
from emp
where DEPTNO in
(select DEPTNO from dept
 where
 dept.loc = 'BOSTON'
 and rownum<=1
);

Explained.

SQL> set linesize 400
SQL> SELECT * FROM TABLE(dbms_xplan.display);

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------
Plan hash value: 3841060209

---------------------------------------------------------------------------------
| Id  | Operation            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |          |     5 |   130 |     7  (15)| 00:00:01 |
|*  1 |  HASH JOIN SEMI      |          |     5 |   130 |     7  (15)| 00:00:01 |
|   2 |   TABLE ACCESS FULL  | EMP      |    14 |   182 |     3   (0)| 00:00:01 |
|   3 |   VIEW               | VW_NSO_1 |     1 |    13 |     3   (0)| 00:00:01 |
|*  4 |    COUNT STOPKEY     |          |       |       |            |          |
|*  5 |     TABLE ACCESS FULL| DEPT     |     1 |    11 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("DEPTNO"="DEPTNO")
   4 - filter(ROWNUM<=1)
   5 - filter("DEPT"."LOC"='BOSTON')

19 rows selected.

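The two statements above use EXISTS and IN respectively. Oddly, the EXISTS version gets a FILTER while the IN version gets a HASH JOIN. If both tables were large, the IN version would undoubtedly perform better.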

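That is why I say: don't jump to conclusions about whether EXISTS or IN is better. Let the facts (the execution plan and the execution time) speak.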

There is also a pair of hints worth introducing, unnest and no_unnest, which exist to control FILTER manually.

SQL> explain plan for
select EMPNO,ename
from emp
where exists
(select /*+ unnest */ DEPTNO from dept
 where
 emp.DEPTNO = dept.DEPTNO
 and dept.loc = 'BOSTON'
 union
 select DEPTNO from dept
 where
 emp.DEPTNO = dept.DEPTNO
 and dept.loc = 'DALLAS'
);

Explained.

SQL> set linesize 400
SQL> SELECT * FROM TABLE(dbms_xplan.display);

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------
Plan hash value: 3446838818

---------------------------------------------------------------------------------
| Id  | Operation             | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |         |     5 |   130 |    12  (25)| 00:00:01 |
|*  1 |  HASH JOIN SEMI       |         |     5 |   130 |    12  (25)| 00:00:01 |
|   2 |   TABLE ACCESS FULL   | EMP     |    14 |   182 |     3   (0)| 00:00:01 |
|   3 |   VIEW                | VW_SQ_1 |     2 |    26 |     8  (25)| 00:00:01 |
|   4 |    SORT UNIQUE        |         |     1 |    22 |     8  (63)| 00:00:01 |
|   5 |     UNION-ALL         |         |       |       |            |          |
|*  6 |      TABLE ACCESS FULL| DEPT    |     1 |    11 |     3   (0)| 00:00:01 |
|*  7 |      TABLE ACCESS FULL| DEPT    |     1 |    11 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("EMP"."DEPTNO"="VW_COL_1")
   6 - filter("DEPT"."LOC"='BOSTON')
   7 - filter("DEPT"."LOC"='DALLAS')

21 rows selected.

SQL> explain plan for
select EMPNO,ENAME
from emp
where DEPTNO in
(select /*+ unnest */ DEPTNO from dept
 where
 dept.loc = 'BOSTON'
 and rownum<=1
);

Explained.

SQL> set linesize 400
SQL> SELECT * FROM TABLE(dbms_xplan.display);

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------
Plan hash value: 3841060209

---------------------------------------------------------------------------------
| Id  | Operation            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |          |     5 |   130 |     7  (15)| 00:00:01 |
|*  1 |  HASH JOIN SEMI      |          |     5 |   130 |     7  (15)| 00:00:01 |
|   2 |   TABLE ACCESS FULL  | EMP      |    14 |   182 |     3   (0)| 00:00:01 |
|   3 |   VIEW               | VW_NSO_1 |     1 |    13 |     3   (0)| 00:00:01 |
|*  4 |    COUNT STOPKEY     |          |       |       |            |          |
|*  5 |     TABLE ACCESS FULL| DEPT     |     1 |    11 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("DEPTNO"="DEPTNO")
   4 - filter(ROWNUM<=1)
   5 - filter("DEPT"."LOC"='BOSTON')

19 rows selected.

That is the second way to optimize the SQL from the start of this article: a single hint takes care of it.

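By now you probably see FILTER as a monster. It isn't. Everything exists for a reason: if FILTER were useless, why would Oracle keep it? The point is only that in most cases FILTER deserves extra attention, because it is where problems tend to arise. For the situations where FILTER is appropriate, see my earlier article on scalar subqueries.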
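Keeping an eye on FILTER can be partly automated. The Python sketch below scans dbms_xplan text for a FILTER operation whose predicate starts with EXISTS or NOT EXISTS, the shape seen in the plans in this article. The function name and the regular expressions are my own assumptions about that text layout, not an Oracle API, and the two sample plans are shortened from the output above.

```python
import re

ROW_RE = re.compile(r"\|\*?\s*(\d+)\s*\|\s*(.+?)\s*\|")          # |*  1 |  FILTER  |
PRED_RE = re.compile(r"\s*(\d+) - filter\(\s*(?:NOT )?EXISTS")    #    1 - filter( EXISTS


def find_filter_subqueries(plan_text):
    """Return the Ids of FILTER operations that evaluate an EXISTS subquery."""
    filter_ids = set()
    exists_ids = set()
    for line in plan_text.splitlines():
        row = ROW_RE.match(line.strip())
        if row and row.group(2) == "FILTER":
            filter_ids.add(int(row.group(1)))
        pred = PRED_RE.match(line)
        if pred:
            exists_ids.add(int(pred.group(1)))
    # Only report operations that are both a FILTER row and an EXISTS predicate.
    return sorted(filter_ids & exists_ids)


# Shortened sample plans taken from the output shown in this article.
plan_with_filter = '''
|   0 | SELECT STATEMENT   |      |
|*  1 |  FILTER            |      |
|   2 |   TABLE ACCESS FULL| EMP  |
   1 - filter( EXISTS ( (SELECT "DEPTNO" FROM "DEPT" "DEPT" WHERE
'''
plan_with_hash_join = '''
|   0 | SELECT STATEMENT   |      |
|*  1 |  HASH JOIN SEMI    |      |
   1 - access("DEPTNO"="DEPTNO")
'''

flagged = find_filter_subqueries(plan_with_filter)
not_flagged = find_filter_subqueries(plan_with_hash_join)
print(flagged, not_flagged)
```

A hit is only a prompt to look closer. As noted above, FILTER is not always a problem, so the check cannot replace reading the plan and timing the statement.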

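Now my personal opinion. In most cases I prefer IN (in Oracle), because IN leaves more room for adjusting the execution plan. When the subquery contains union all, rownum, start with ... connect by, or cube, a FILTER is more likely, because the subquery is frozen in place and EXISTS passes values through the join key into the inner table as a filter. Earlier articles covered this as well.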

SQL> explain plan for
select EMPNO,ename
from emp
where exists
(select DEPTNO from dept
 where
 emp.DEPTNO = dept.DEPTNO
 and dept.loc = 'BOSTON'
 union
 select DEPTNO from dept
 where
 emp.DEPTNO = dept.DEPTNO
 and dept.loc = 'DALLAS'
);

Explained.

SQL> set linesize 400
SQL> SELECT * FROM TABLE(dbms_xplan.display);

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------
Plan hash value: 3446838818

---------------------------------------------------------------------------------
| Id  | Operation             | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |         |     5 |   130 |    12  (25)| 00:00:01 |
|*  1 |  HASH JOIN SEMI       |         |     5 |   130 |    12  (25)| 00:00:01 |
|   2 |   TABLE ACCESS FULL   | EMP     |    14 |   182 |     3   (0)| 00:00:01 |
|   3 |   VIEW                | VW_SQ_1 |     2 |    26 |     8  (25)| 00:00:01 |
|   4 |    SORT UNIQUE        |         |     1 |    22 |     8  (63)| 00:00:01 |
|   5 |     UNION-ALL         |         |       |       |            |          |
|*  6 |      TABLE ACCESS FULL| DEPT    |     1 |    11 |     3   (0)| 00:00:01 |
|*  7 |      TABLE ACCESS FULL| DEPT    |     1 |    11 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("EMP"."DEPTNO"="VW_COL_1")
   6 - filter("DEPT"."LOC"='BOSTON')
   7 - filter("DEPT"."LOC"='DALLAS')

This is my test case with union. Here EXISTS does not seem to go through FILTER, unlike the union all case.

That's all for today.

These are my personal views; corrections are welcome.


Source: https://blog.csdn.net/qq_42894764/article/details/92639090