Selecting both MIN and MAX From the Table is slower than expected

Submitted by 为君一笑 on 2019-12-31 11:33:06

Question


I have a table MYTABLE with a date column SDATE which is the primary key of the table and has a unique index on it.

When I run this query:

SELECT MIN(SDATE) FROM MYTABLE

it returns the answer instantly. The same happens for:

SELECT MAX(SDATE) FROM MYTABLE

But, if I query both together:

SELECT MIN(SDATE), MAX(SDATE) FROM MYTABLE

it takes much longer to execute. I analyzed the plans and found that when only one of MIN or MAX is queried, it uses INDEX FULL SCAN (MIN/MAX), but when both are queried at the same time, it does a FULL TABLE SCAN.

Why?

Test Data:

version 11g

create table MYTABLE
(
  SDATE  DATE not null,
  CELL   VARCHAR2(10),
  data NUMBER
)
tablespace CHIPS
  pctfree 10
  pctused 40
  initrans 1
  maxtrans 255
  storage
  (
    initial 64K
    minextents 1
    maxextents unlimited
  );

alter table MYTABLE
  add constraint PK_SDATE primary key (SDATE)
  using index 
  tablespace SYSTEM
  pctfree 10
  initrans 2
  maxtrans 255
  storage
  (
    initial 64K
    minextents 1
    maxextents unlimited
  );

Load table:

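-- The FOR loop declares its own index i, so no DECLARE section is needed.
begin
  for i in 0 .. 100000 loop
     -- one row per hour, going back from the current time
     insert into MYTABLE(sdate, cell, data)
     values(sysdate - i/24, 'T' || i, i);
  end loop;
  commit;  -- commit once after the loop instead of once per row
end;
/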

Gather stats:

begin
  dbms_stats.gather_table_stats(tabname => 'MYTABLE', ownname => 'SYS');
end;

Plan1 and Plan2: [execution plan screenshots not preserved]


Answer 1:


An INDEX FULL SCAN (MIN/MAX) can only visit one end of the index. When you are doing

SELECT MIN(SDATE), MAX(SDATE) FROM MYTABLE

you are asking it to visit both ends. Therefore, if you want both the minimum and the maximum column value, an INDEX FULL SCAN (MIN/MAX) is not viable.

A more detailed analysis can be found here.




Answer 2:


The explain plans are different: a single MIN or MAX will produce an INDEX FULL SCAN (MIN/MAX), whereas when the two are present you will get an INDEX FULL SCAN or an INDEX FAST FULL SCAN.

To understand the difference, we have to look for a description of a FULL INDEX SCAN:

In a full index scan, the database reads the entire index in order.

In other words, if the index is on a VARCHAR2 field, Oracle will fetch the first leaf block of the index, which would contain, for example, all entries that start with the letter "A", and will then read all entries block by block in alphabetical order until the last entry ("A" to "Z"). Oracle can proceed this way because the entries in a B-tree index are stored in sorted order.

When you see INDEX FULL SCAN (MIN/MAX) in an explain plan, it is the result of an optimization that exploits this ordering: since the entries are sorted, the scan can stop after reading the first one if you are only interested in the MIN. If you are interested in the MAX only, Oracle can use the same access path, but this time starting at the last entry and reading backwards from "Z" to "A".

As of now, a FULL INDEX SCAN runs in only one direction (either forward or backward) and cannot start from both ends simultaneously. This is why you get a less efficient access method when you ask for both the min and the max.

As suggested by other answers, if the query is performance-critical, you could apply your own optimization by searching for the min and the max in two distinct queries.
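One way to keep this in a single statement is to put each aggregate in its own scalar subquery, so that each one can be optimized separately. The following is a minimal sketch against the MYTABLE from the question; the aliases min_date and max_date are illustrative names only, and whether each subquery really gets an INDEX FULL SCAN (MIN/MAX) is an assumption that should be confirmed in the execution plan for your version and statistics.

```sql
-- Each scalar subquery contains a single aggregate, so the optimizer
-- can consider INDEX FULL SCAN (MIN/MAX) for each one on its own.
SELECT (SELECT MIN(sdate) FROM mytable) AS min_date,
       (SELECT MAX(sdate) FROM mytable) AS max_date
FROM   dual;
```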




Answer 3:


Try not to select both ends of the index in a single query block. Writing the query in a different way, like this:

select max_date, min_date
from (select max(sdate) max_date from mytable),
       (select min(sdate) min_date from mytable)

will cause the optimizer to access the index with INDEX FULL SCAN (MIN/MAX) under nested loops (in our case, twice).
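To check whether the rewritten query really gets this plan on your system, you can display its plan with the same EXPLAIN PLAN and DBMS_XPLAN.DISPLAY calls used elsewhere in this thread. This is only a sketch; the plan you see depends on your Oracle version and statistics.

```sql
-- Store the plan of the rewritten query in the plan table ...
EXPLAIN PLAN FOR
SELECT max_date, min_date
FROM   (SELECT MAX(sdate) max_date FROM mytable),
       (SELECT MIN(sdate) min_date FROM mytable);

-- ... then print it. Look for two INDEX FULL SCAN (MIN/MAX) operations,
-- one for each inline view.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```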




Answer 4:


I have to say that I do not see the same behaviour in 11.2.

I set up a test case as follows (updated from 10k to 1M rows in response to Vincent's comment):

set linesize 130
set pagesize 0
create table mytable ( sdate date );

Table created.

insert into mytable
 select sysdate - level
   from dual
connect by level <= 1000000;
commit;

1000000 rows created.


Commit complete.

alter table mytable add constraint pk_mytable primary key ( sdate ) using index;

Table altered.

begin
dbms_stats.gather_table_stats( user, 'MYTABLE' 
                             , estimate_percent => 100
                             , cascade => true
                               );
end;
/

PL/SQL procedure successfully completed.

Then, executing your queries, I get almost identical-looking explain plans (notice the different types of INDEX FULL SCAN):

explain plan for select min(sdate) from mytable;

Explained.

select * from table(dbms_xplan.display);
Plan hash value: 3877058912

-----------------------------------------------------------------------------------------
| Id  | Operation                  | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |            |     1 |     8 |     1   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE            |            |     1 |     8 |            |          |
|   2 |   INDEX FULL SCAN (MIN/MAX)| PK_MYTABLE |     1 |     8 |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

9 rows selected.

explain plan for select min(sdate), max(sdate) from mytable;

Explained.

select * from table(dbms_xplan.display);
Plan hash value: 3812733167

-------------------------------------------------------------------------------
| Id  | Operation        | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |            |     1 |     8 |   252   (0)| 00:00:04 |
|   1 |  SORT AGGREGATE  |            |     1 |     8 |            |          |
|   2 |   INDEX FULL SCAN| PK_MYTABLE |  1000K|  7812K|   252   (0)| 00:00:04 |
-------------------------------------------------------------------------------

9 rows selected.

To quote from a previous answer of mine:

The two most common reasons for a query not using indexes are:

  1. It's quicker to do a full table scan.
  2. Poor statistics.

Unless there's something you're not posting in the question, my immediate answer would be that you have not collected statistics on this table, you haven't collected them with a high enough estimate percent, or you've used ANALYZE, which, unlike dbms_stats.gather_table_stats, will not help the Cost Based Optimizer.
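A quick way to see what statistics the optimizer currently has is to query the data dictionary. The sketch below assumes you are connected as the owner of MYTABLE (the question created it under SYS); otherwise the ALL_ or DBA_ views with an owner filter would be needed instead.

```sql
-- When were table statistics last gathered, and how many rows
-- does the optimizer believe the table has?
SELECT table_name, num_rows, last_analyzed
FROM   user_tables
WHERE  table_name = 'MYTABLE';

-- The same check for the indexes on the table, including the primary key index.
SELECT index_name, num_rows, distinct_keys, last_analyzed
FROM   user_indexes
WHERE  table_name = 'MYTABLE';
```

A NULL last_analyzed, or a num_rows far from the real row count, would suggest that statistics are missing or stale.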

To quote from the documentation on analyze:

For the collection of most statistics, use the DBMS_STATS package, which lets you collect statistics in parallel, collect global statistics for partitioned objects, and fine tune your statistics collection in other ways. See Oracle Database PL/SQL Packages and Types Reference for more information on the DBMS_STATS package.

Use the ANALYZE statement (rather than DBMS_STATS) for statistics collection not related to the cost-based optimizer:



Source: https://stackoverflow.com/questions/12565790/selecting-both-min-and-max-from-the-table-is-slower-than-expected
