updating 1 million records in oracle DB as batch of 10k;

问题

I have to update a table in oracle db as batch of 10k.

I tried this:

BEGIN
        WHILE (true) LOOP
        UPDATE TOP (10000) CUSTOMERS SET ACTIVE = 'N' WHERE ACTIVE='Y';
        IF sql%notfound THEN
                EXIT;
        END IF;
        COMMIT;
        END LOOP;
END;

It doesn't works as plsql doesn't support top.

Any suggestions?

回答1:

your pl/SQL block, using JVA's suggestion, should be completed like this (somce you are new to pl/sql I am adding some syntactical suggestions that you could be interested in):

BEGIN
      -- WHILE (TRUE) can be omitted: "loop ... end loop;" 
      --  already is an endless loop
      LOOP
         UPDATE CUSTOMERS 
             SET ACTIVE = 'N' 
         WHERE ACTIVE='Y'
           AND rownum <= 1000;  
         exit when sql%notfound; -- notice that exit accepts "when condition"
         --IF sql%notfound THEN  -- you can avoid a if/endif by using "exit when"
         --   EXIT;
         -- END IF;
         COMMIT;
   END LOOP;
   commit; -- you missed this commit for the last iteration
END;

Don't be tempted of placing "commit" before the "exit when sql%notfound": after a "commit" sql%notfound is always false and your loop would be really endless.

Let me point out that, in order to be efficient, this approach requires the the "ACTIVE" column to be indexed!

if you don't have an index on the "active" column, each "update" will be forced to restart a full table scan from the beginning just to find the next 1000 records that still need to be updated.

This other approach I am proposing uses some advanced PL/SQL features you, as a learner, mighy be interested in (rowid, "table of",cursor bulk fetches and "forall") and does only one scan of the table to be updated so (in case of absence of indexes) it performs better than the previous approach. keep in mind that if you have indexes, this is slower (but using foralls, bulk collects and rowid accesses, it is not that slower) , but it can get handy in cases where things are more complex (for example: when the where condition needs to access data from other tables using complex joins that can't be made faster). There are cases when the "where" is so complex and slow that you really don't want to re-execute it over and over using a "where rownum<=1000" approach.

    declare
      type rowid_array is table of rowid;

      ids rowid_array;

      cursor cur is 
         select rowid as id 
         from CUSTOMERS 
         where ACTIVE='Y';

    begin
      open cur;
      loop
         fetch cur bulk collect into ids limit 1000;
         exit when ids.count = 0;

         forall c in ids.first .. ids.last 
            update CUSTOMERS set  ACTIVE='N';

         commit;     
      end loop;
    end;

回答2:

UPDATE CUSTOMERS 
SET ACTIVE = 'N' 
WHERE ACTIVE='Y'
  AND ROWNUM <= 10000; -- first 10k rows

回答3:

If you have auto incremented synthetic ID column on CUSTOMERS table then you can simply use a WHERE condition saying

WHERE ID <= 10000

回答4:

Adding on top of jva's answer to process all rows without changing the query for every batch:

Before update:

select count(1) from cust ; --1000000

select distinct active from cust ; -- Y

Update:

Begin
for i in 1..100
loop
update cust  set Active = 'N' where ACTIVE = 'Y' 
and rownum <= 10000;
-- dbms_output.put_line ('i value :  ' || i );
commit;
end loop;
dbms_output.put_line ('All rows updated ' );
end;

Output:

All rows updated 

Statement processed.

3.77 seconds

After :

select distinct active from cust ; -- N

来源：https://stackoverflow.com/questions/32160236/updating-1-million-records-in-oracle-db-as-batch-of-10k

标签

sql

database

Oracle

plsql

sqlplus