Optimizing row by row (cursor) processing in Oracle 11g

Question

I have to process a large table (2.5B records) row by row in order to keep track of two variables. As one can imagine, this is quite slow. I am looking for ideas on how to tune this procedure. Thank you.

declare
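    -- An optimizer hint must start with "/*+" (no space before the plus sign),
    -- otherwise Oracle treats it as an ordinary comment and ignores it.
    cursor c_data is select /*+ index(data data_pk) */ * from data order by data_id;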
    r_data c_data%ROWTYPE;
    lst_b_prc number(15,8);
    lst_a_prc number(15,8);
begin
    open c_data;
    loop
        fetch c_data into r_data;
        exit when c_data%NOTFOUND;

        if r_data.BATS = 'B' then
            lst_b_prc := r_data.PRC;
        end if;
        if r_data.BATS = 'A' then
            lst_a_prc := r_data.PRC;
        end if;
        if r_data.BATS = 'T' then
            insert into trans .... lst_a_prc , lst_b_prc
        end if;
    end loop;
    close c_data;
end;

The issue really comes down to finding efficient sql to track the latest PRC value when BATS='A' and BATS='B' for each BATS='T' record.

Answer 1:

If I understand your problem correctly, with a table of data like this:

create table data as
select 1 data_id, 'T' bats, 1 prc from dual union all
select 2 data_id, 'A' bats, 2 prc from dual union all
select 3 data_id, 'B' bats, 3 prc from dual union all
select 4 data_id, 'T' bats, 4 prc from dual union all
select 5 data_id, 'A' bats, 5 prc from dual union all
select 6 data_id, 'T' bats, 6 prc from dual union all
select 7 data_id, 'B' bats, 7 prc from dual union all
select 8 data_id, 'T' bats, 8 prc from dual union all
select 9 data_id, 'T' bats, 9 prc from dual;

You want to insert one row for each T, using the last PRC value for A and B, which would look something like this:

T data_id   Last A   Last B
---------   ------   ------
1           null     null
4           2        3
6           5        3
8           5        7
9           5        7

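This query should work. LAST_VALUE with IGNORE NULLS carries the most recent non-null A or B price forward to each row:

select data_id, last_A, last_B
from
(
    select data_id, bats, prc
        -- most recent PRC seen so far on an 'A' row
        ,last_value(case when bats = 'A' then prc end ignore nulls) over
            (order by data_id
             rows between unbounded preceding and current row) last_A
        -- most recent PRC seen so far on a 'B' row
        ,last_value(case when bats = 'B' then prc end ignore nulls) over
            (order by data_id
             rows between unbounded preceding and current row) last_B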
    from data
)
where bats = 'T';

With so much data, you'll probably want to use direct-path writes and parallelism. Performance will largely depend on whether the sort for the analytic functions can be done in memory or has to spill to disk. Optimizing memory can be very difficult; you'll probably need to work with a DBA to let your process use as much memory as possible without causing problems for other processes.
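
As a rough sketch of the direct-path and parallel approach described above, the analytic query can feed a single INSERT ... SELECT. The column names on TRANS (data_id, last_a_prc, last_b_prc) are assumptions, since the original post elides them, and the degree of parallelism (8) is an arbitrary placeholder to be agreed with your DBA. The sketch uses LAST_VALUE ... IGNORE NULLS to pick up the most recent A and B prices.

```sql
-- Parallel DML has to be enabled per session before the INSERT can run in parallel.
alter session enable parallel dml;

-- APPEND requests a direct-path insert; PARALLEL requests parallel execution.
-- Hypothetical target columns: data_id, last_a_prc, last_b_prc.
insert /*+ append parallel(t 8) */ into trans t (data_id, last_a_prc, last_b_prc)
select data_id, last_a, last_b
from
(
    select /*+ parallel(d 8) */ data_id, bats
        ,last_value(case when bats = 'A' then prc end ignore nulls) over
            (order by data_id
             rows between unbounded preceding and current row) last_a
        ,last_value(case when bats = 'B' then prc end ignore nulls) over
            (order by data_id
             rows between unbounded preceding and current row) last_b
    from data d
)
where bats = 'T';

-- After a direct-path insert, the session must commit before it can query TRANS again.
commit;
```

Because there is no PARTITION BY, the whole table is ordered as a single window, so the sort is likely to be the dominant cost; that is where the memory tuning mentioned above comes in.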

Answer 2:

There are several options. Most importantly, you're probably accumulating a huge amount of UNDO/REDO by doing all your inserts in a single transaction. You could commit your work periodically, say every 1000 inserts.
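
A minimal sketch of the periodic-commit idea, based on the loop from the question. The TRANS column names are hypothetical (the original post elides them), and the batch size of 1000 is just the figure suggested above. Note that fetching from a cursor across commits can, in some cases, lead to ORA-01555 (snapshot too old) on a long-running scan, so treat this as something to test rather than a guaranteed improvement.

```sql
declare
    cursor c_data is
        select data_id, bats, prc from data order by data_id;
    lst_a_prc data.prc%type;
    lst_b_prc data.prc%type;
    n_pending pls_integer := 0;   -- inserts made since the last commit
begin
    for r_data in c_data loop
        if r_data.bats = 'B' then
            lst_b_prc := r_data.prc;
        elsif r_data.bats = 'A' then
            lst_a_prc := r_data.prc;
        elsif r_data.bats = 'T' then
            -- hypothetical target columns; adjust to the real TRANS definition
            insert into trans (data_id, last_a_prc, last_b_prc)
            values (r_data.data_id, lst_a_prc, lst_b_prc);

            n_pending := n_pending + 1;
            if n_pending >= 1000 then
                commit;           -- end the transaction every 1000 inserts
                n_pending := 0;
            end if;
        end if;
    end loop;

    commit;                       -- commit the final partial batch
end;
/
```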

Another option is to use a SQL MERGE statement (or a simpler INSERT .. SELECT .. statement), which lets your Oracle instance operate on sets rather than on single records. The execution plan of the SELECT can then be tuned for INSERT performance.

Source: https://stackoverflow.com/questions/8714967/optimizing-row-by-row-cursor-processing-in-oracle-11g

Tags

Oracle

stored-procedures

query-optimization

oracle11g