Strategy to improve Oracle DELETE performance

Question

We've got an Oracle 11g installation that is starting to get big. This database is the backend to a parallel optimization system running on a cluster. Input to the process is contained in the database along with output from the optimization steps. The input includes rote configuration data and some binary files (using 11g's SecureFiles). The output includes 1D, 2D, 3D, and 4D data currently stored in the DB.

DB Structure:

/* Metadata tables */
Case(CaseId, DeleteFlag, ...) On Delete Cascade CaseId
OptimizationRun(OptId, CaseId, ...) On Delete Cascade OptId
OptimizationStep(StepId, OptId, ...) On Delete Cascade StepId

/* Data tables */
Files(FileId, CaseId, Blob) /* deletes are near instantaneous here */

/* Data per run */
OnedDataX(OptId, ...)
TwoDDataY1(OptId, ...) /* packed representation of a 1D slice */

/* Data not only per run, but per step */
TwoDDataY2(StepId, ...)  /* packed representation of a 1D slice */
ThreeDDataZ(StepId, ...) /* packed representation of a 2D slice */
FourDDataZ(StepId, ...)  /* packed representation of a 3D slice */
/* ... About 10 or so of these tables exist */

A reaper script comes around daily, looks for cases with DeleteFlag = 1, and runs DELETE FROM Case WHERE DeleteFlag = 1, letting the cascades do the rest.

This strategy works great for read/write, but is now outstripping our capabilities when we want to purge data! The rub is deleting a Case takes ~20-40 minutes depending on the size and often overloads our archiver space. The next major version of the product will take a "from the ground up" approach to solving the problem. The next minor release needs to stay within the confines of data stored in the database.

So, for the minor release we need an approach that can improve delete performance and at most require moderate changes to the database.

REF Partitioning, but the question is HOW? I would love to do INTERVAL on Case and REF on the rest, but that isn't supported. Is there some way to manually partition OptimizationRun by CaseId through a trigger?
Disable archiving/redo logs for deletes? Couldn't find a HINT to go with this one. Not sure it is even feasible.
~~Truncate? This likely would need some sorta complicated table setup. But maybe I'm not considering all of my options.~~ (per answer, stricken)

To help illustrate the issue, the data in question per case ranges from 15MiB to 1.5GiB with anywhere from 20k to 2M rows.

Update: Current size of the DB is ~1.5TB.

Answer 1:

Deleting data is a hell of a job for the database. It has to create before-images, update indexes, write redo logs, and remove the data. This is a slow process. If you can get a maintenance window for this task, the easiest and fastest approach is to build new tables containing only the data you want to keep, drop the old tables, and rename the new ones. This obviously requires some setup work, but it is quite feasible. A less drastic step is to drop the indexes before the delete takes place. My vote would go to CTAS (CREATE TABLE AS SELECT) to build the new tables. A nice partitioning scheme would certainly be helpful; maybe in a future release Oracle will allow combining interval and reference partitioning. It would be very nice to have.

Logging cannot be disabled for deletes, but CTAS can use NOLOGGING. Take a backup when you are done, and make sure to transfer the datafiles to the standby database if you have one.
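
The rebuild-and-swap idea can be sketched as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for Oracle, and the table and column names (cases, optimization_run, one_d_data_x, delete_flag) are hypothetical lowercase versions of the question's schema. On Oracle the same three steps would be a CTAS (optionally NOLOGGING), a DROP TABLE and a rename, and indexes, foreign keys and grants would have to be recreated on the new table.

```python
import sqlite3

def rebuild_without_flagged(conn):
    """Rebuild one_d_data_x, keeping only rows whose case is not flagged."""
    conn.executescript("""
        -- 1. Copy the rows to keep into a new table (the CTAS step).
        CREATE TABLE one_d_data_x_new AS
            SELECT d.opt_id, d.value
            FROM one_d_data_x d
            JOIN optimization_run r ON r.opt_id = d.opt_id
            JOIN cases c ON c.case_id = r.case_id
            WHERE c.delete_flag = 0;
        -- 2. Drop the old table instead of deleting from it.
        DROP TABLE one_d_data_x;
        -- 3. Swap the new table into place.
        ALTER TABLE one_d_data_x_new RENAME TO one_d_data_x;
    """)

# Tiny sample data: case 2 is flagged for deletion.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cases (case_id INTEGER PRIMARY KEY, delete_flag INTEGER NOT NULL);
    CREATE TABLE optimization_run (opt_id INTEGER PRIMARY KEY, case_id INTEGER NOT NULL);
    CREATE TABLE one_d_data_x (opt_id INTEGER NOT NULL, value REAL);
    INSERT INTO cases VALUES (1, 0), (2, 1);
    INSERT INTO optimization_run VALUES (10, 1), (20, 2);
    INSERT INTO one_d_data_x VALUES (10, 1.5), (10, 2.5), (20, 9.0);
""")

rebuild_without_flagged(conn)
remaining = conn.execute(
    "SELECT opt_id, value FROM one_d_data_x ORDER BY value"
).fetchall()
print(remaining)
```

With the sample rows, only the two rows belonging to run 10 survive the swap. The metadata rows in cases and optimization_run are untouched here and would still need their own cleanup.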

Answer 2:

Just some thoughts:

I assume you have indexes on all foreign keys. ON DELETE CASCADE will hold row-level locks until the Case delete is complete; without those indexes it will, I believe, take table-level locks and of course be very slow.
Do you have any deferred constraints? These would most likely slow Oracle down as it cascades through the various table deletes.
Have you tried doing the deletes separately for each affected table (instead of relying on ON DELETE CASCADE)? It's not as easy, but you may be surprised.
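
To illustrate the first point, the sketch below shows how the lookup of child rows for one parent changes once the foreign key column is indexed. It is a stand-in only: it uses Python's sqlite3 module and SQLite's EXPLAIN QUERY PLAN, not Oracle, and the table name optimization_run and index name idx_run_case are hypothetical. Oracle's locking and optimizer behave differently in detail; the point is just that every cascaded delete has to find the children of each parent row, and without an index that is a scan of the child table.

```python
import sqlite3

def child_lookup_plan(conn):
    """Return SQLite's plan for finding the runs that belong to one case."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT opt_id FROM optimization_run WHERE case_id = ?",
        (1,),
    ).fetchall()
    # The human-readable plan text is the last column of each row.
    return " | ".join(row[-1] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE optimization_run ("
    "opt_id INTEGER PRIMARY KEY, case_id INTEGER NOT NULL)"
)

# No index on the foreign key column yet: the lookup scans the table.
plan_without_index = child_lookup_plan(conn)

# Hypothetical index name; it covers the foreign key column.
conn.execute("CREATE INDEX idx_run_case ON optimization_run (case_id)")
plan_with_index = child_lookup_plan(conn)

print(plan_without_index)
print(plan_with_index)
```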

EDIT:

One more thought. You may consider doing a SOFT delete on the Case table, meaning you have a status field that tells your app whether that Case should be considered. This flag could have many different values, but maybe 'A' for active and 'I' for inactive. Assuming you always use Case as the driving/primary table in joins to other tables, you can avoid the HARD deletes altogether (and occasionally do a cleanup off hours on whatever schedule you like). Apps would need to be aware of this flag, of course, and you'd be tied to joining back to the Case table. May or may not fit your situation...
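
A minimal sketch of the soft delete, again using Python's sqlite3 module as a stand-in for Oracle. The status column with 'A' and 'I' follows the suggestion above; the active_run view and the lowercase table names are assumptions added for illustration, as one possible way to keep the join back to the case table in a single place.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cases (
        case_id INTEGER PRIMARY KEY,
        status  TEXT NOT NULL DEFAULT 'A'   -- 'A' = active, 'I' = inactive
    );
    CREATE TABLE optimization_run (
        opt_id  INTEGER PRIMARY KEY,
        case_id INTEGER NOT NULL
    );
    -- Hypothetical view: readers query this instead of optimization_run.
    CREATE VIEW active_run AS
        SELECT r.opt_id, r.case_id
        FROM optimization_run r
        JOIN cases c ON c.case_id = r.case_id
        WHERE c.status = 'A';
    INSERT INTO cases (case_id) VALUES (1), (2);
    INSERT INTO optimization_run VALUES (10, 1), (20, 2);
""")

def soft_delete(conn, case_id):
    """Hide a case by flipping its status; no child rows are touched."""
    conn.execute("UPDATE cases SET status = 'I' WHERE case_id = ?", (case_id,))
    conn.commit()

soft_delete(conn, 2)

# Runs of the inactive case disappear from the view but stay on disk.
visible = [row[0] for row in conn.execute("SELECT opt_id FROM active_run ORDER BY opt_id")]
stored = conn.execute("SELECT COUNT(*) FROM optimization_run").fetchone()[0]
print(visible, stored)
```

The soft delete is a single-row update, so it is cheap; the cost is that the data still occupies space until a later off-hours cleanup removes it.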

Answer 3:

CASCADE DELETE runs internally slow-by-slow, er, row-by-row.

Some options:

Have your purge job snapshot all the cases to be purged into a scratch table with a CTAS. Then have your purge job loop over that table, deleting each case (and its children) individually. This can be unpleasant, especially if you run into millions of descendant rows. We had to change one of the processes recently at [business redacted] which did that to determine which ultimate parents had child counts that would be problematic, and then use a rownum limiter on a delete against the problematic child table(s). It's not fast, but at least it's safer from an undo/redo management perspective by placing an upper bound on how big any transaction can be.
If you're using CASCADE DELETE as a convenience, you could always not do so. You'd have to write a more sophisticated purge routine that deletes from your dependency tree "bottom up".
If you can afford the undo/redo generation on the soft delete, you could range-partition the ultimate parent on DeleteFlag, then partition the children BY REFERENCE, all tables using ENABLE ROW MOVEMENT. You'd incur undo/redo costs for moving the rows when soft-deleted, but when it came time to finally purge, it would be truncating partitions where DeleteFlag = 1, nothing more.
Adding storage is relatively cheap. If there's a date-based retention option, use it, and just have the soft delete option hide the data from the application front end. It's inelegant, but then, so is CASCADE DELETE.
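
The "bottom up" purge with a bounded transaction size can be sketched like this. It is a stand-in written against Python's sqlite3 module, not Oracle: the lowercase table names, the BATCH_ROWS constant and both helper functions are hypothetical, and only one data table (three_d_data_z) is shown where the real schema has about ten. SQLite's LIMIT plays the role that a ROWNUM predicate would play in the Oracle delete.

```python
import sqlite3

BATCH_ROWS = 2  # hypothetical cap on rows deleted per transaction

def delete_in_batches(conn, table, where_sql, params):
    """Delete rows matching where_sql in chunks, committing after each chunk."""
    deleted = 0
    while True:
        cur = conn.execute(
            f"DELETE FROM {table} WHERE rowid IN "
            f"(SELECT rowid FROM {table} WHERE {where_sql} LIMIT ?)",
            (*params, BATCH_ROWS),
        )
        conn.commit()  # keeps each transaction (and its undo/redo) small
        if cur.rowcount == 0:
            return deleted
        deleted += cur.rowcount

def purge_case(conn, case_id):
    """Purge one case bottom-up: step data, then steps, runs and the case."""
    runs = "SELECT opt_id FROM optimization_run WHERE case_id = ?"
    steps = f"SELECT step_id FROM optimization_step WHERE opt_id IN ({runs})"
    # Order matters: each level is located through its still-existing parents.
    plan = [
        ("three_d_data_z", f"step_id IN ({steps})"),
        ("optimization_step", f"opt_id IN ({runs})"),
        ("optimization_run", "case_id = ?"),
        ("cases", "case_id = ?"),
    ]
    return {t: delete_in_batches(conn, t, w, (case_id,)) for t, w in plan}

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cases (case_id INTEGER PRIMARY KEY, delete_flag INTEGER NOT NULL);
    CREATE TABLE optimization_run (opt_id INTEGER PRIMARY KEY, case_id INTEGER NOT NULL);
    CREATE TABLE optimization_step (step_id INTEGER PRIMARY KEY, opt_id INTEGER NOT NULL);
    CREATE TABLE three_d_data_z (step_id INTEGER NOT NULL, payload TEXT);
    INSERT INTO cases VALUES (1, 1), (2, 0);
    INSERT INTO optimization_run VALUES (10, 1), (20, 2);
    INSERT INTO optimization_step VALUES (100, 10), (101, 10), (200, 20);
    INSERT INTO three_d_data_z VALUES
        (100, 'a'), (100, 'b'), (101, 'c'), (101, 'd'), (101, 'e'), (200, 'f');
""")

counts = purge_case(conn, 1)
left = conn.execute("SELECT COUNT(*) FROM three_d_data_z").fetchone()[0]
print(counts, left)
```

Because every chunk is committed on its own, an interrupted purge leaves a partially deleted case behind. Deleting children first means the job can simply be rerun, and the case row itself only disappears once all of its descendants are gone.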

Answer 4:

Use Enterprise Manager to create an AWR report and run it through Statspack Analyzer, which will give you detailed instructions about the bottlenecks in your system. An AWR report is a text file containing all kinds of data about what the database has done during a certain time and how long it took. Statspack Analyzer is sort of an automatic DBA telling you what to do.

Forget partitions until Statspack Analyzer tells you that they could be useful and you've got a few idle disks that you can use to distribute the I/O.

Don't think about truncate. It forces a commit...

BTW, I'm not affiliated with Statspack Analyzer, but I think it's a very viable general tuning approach for Oracle, especially if there's no DBA around.

Answer 5:

Not advised for a live database.

I disabled the foreign key constraints referencing the table that is slow to delete from.
I executed the delete.
I enabled the foreign keys again.

Source: https://stackoverflow.com/questions/5792425/strategy-to-improve-oracle-delete-performance

Tags

Oracle

oracle11g