We want to speed up the parallel INSERT statement below. We expect to insert around 80M records, and it currently takes around 2 hours to finish.
I can see two big problems:
1 - The PARALLEL hint in the SELECT does not work, because it should be written like this: /*+ PARALLEL(T1,16) */
2 - The SELECT is not optimal; it would be better to avoid the NOT IN expression.
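One possible way to avoid NOT IN is to rewrite the filter as a NOT EXISTS anti-join. This is only a sketch: the names target_tbl, source_tbl, excluded_tbl and the columns id and payload are hypothetical placeholders, not names from the real statement. The two forms are equivalent only when the compared columns can never be NULL, so check that first and compare the execution plans before and after.

```sql
-- Hypothetical tables and columns; adapt to the real statement.
-- Before: NOT IN returns no rows at all if the subquery yields a NULL.
INSERT INTO target_tbl (id, payload)
SELECT s.id, s.payload
FROM   source_tbl s
WHERE  s.id NOT IN (SELECT e.id FROM excluded_tbl e);

-- After: NOT EXISTS expresses the same filter as an anti-join
-- (equivalent only when s.id and e.id are never NULL).
INSERT INTO target_tbl (id, payload)
SELECT s.id, s.payload
FROM   source_tbl s
WHERE  NOT EXISTS (SELECT 1 FROM excluded_tbl e WHERE e.id = s.id);
```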
Improve statistics. The estimated number of rows is 1, but the actual number of rows is over 7 million and counting. This causes the execution plan to use a nested loop instead of a hash join. A nested loop works better for small amounts of data and a hash join works better for large amounts of data. Fixing that may be as easy as ensuring the relevant tables have accurate, current statistics. This can usually be done by gathering statistics with the default settings, for example: exec dbms_stats.gather_table_stats('SIRS_UATC1', 'TBL_RECON_PM');
If that doesn't improve the cardinality estimate, try using a dynamic sampling hint, such as /*+ dynamic_sampling(5) */. For such a long-running query it is worth spending a little extra time up front sampling data if it leads to a better plan.
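The statistics advice can be sketched roughly as follows: first check how old the statistics are, then regather them, then try the sampling hint. The schema and table names come from the exec call in this answer. The CREATE_DT filter in the last query is an assumption, used only to give the optimizer a predicate to sample.

```sql
-- Check when statistics were last gathered and whether they are flagged as stale.
SELECT table_name, num_rows, last_analyzed, stale_stats
FROM   all_tab_statistics
WHERE  owner = 'SIRS_UATC1'
AND    table_name = 'TBL_RECON_PM';

-- Regather with default settings (PL/SQL block form of the exec call).
BEGIN
  dbms_stats.gather_table_stats(ownname => 'SIRS_UATC1', tabname => 'TBL_RECON_PM');
END;
/

-- Statement-level dynamic sampling hint.
-- Assumption: CREATE_DT is a column of this table.
SELECT /*+ dynamic_sampling(5) */ COUNT(*)
FROM   sirs_uatc1.tbl_recon_pm
WHERE  create_dt >= SYSDATE - 60;
```

If NUM_ROWS is far from the real row count, or STALE_STATS is YES, regathering is likely worth trying before reaching for hints.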
Use statement-level parallelism instead of object-level parallelism. This is probably the most common mistake with parallel SQL. If you use object-level parallelism, the hint must reference the alias of the object. Since 11gR2 there is no need to worry about specifying objects. This statement only needs a single hint: INSERT /*+ PARALLEL(16) APPEND */ ...
Note that NOLOGGING is not a real hint; it is an attribute of the table, set with ALTER TABLE ... NOLOGGING.
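A minimal sketch of the statement-level form, assuming hypothetical tables target_tbl and source_tbl in place of the real ones. Keep in mind that parallel DML has to be enabled in the session, otherwise only the SELECT part can run in parallel.

```sql
-- Parallel DML is disabled by default and must be enabled per session;
-- without it only the query part of the statement is parallelized.
ALTER SESSION ENABLE PARALLEL DML;

-- target_tbl and source_tbl are hypothetical placeholders.
INSERT /*+ PARALLEL(16) APPEND */ INTO target_tbl
SELECT *
FROM   source_tbl;

-- A direct-path insert must be committed before this session
-- can query the table again.
COMMIT;
```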
Try using more bind variables, especially where nested loops might happen. For example, you can write
CREATE_DT >= :YOUR_DATE instead of CREATE_DT >= sysdate - 60
I think this would explain why you have 180 million executions in the lowest part of your execution plan even though the rest of the query is still at 8 million out of your 79 million.
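One way to try the bind-variable suggestion is to run the statement from a PL/SQL block, where a local variable used in static SQL is passed as a bind. This is only a sketch with hypothetical names (target_tbl, source_tbl, id, l_cutoff). Whether it actually changes the plan should be verified against the real statement.

```sql
-- Hypothetical tables and columns. Inside PL/SQL, a local variable
-- referenced in static SQL is passed to the SQL engine as a bind variable.
DECLARE
  l_cutoff DATE := SYSDATE - 60;  -- computed once, before the statement runs
BEGIN
  INSERT INTO target_tbl (id, create_dt)
  SELECT s.id, s.create_dt
  FROM   source_tbl s
  WHERE  s.create_dt >= l_cutoff;
  COMMIT;
END;
/
```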