Deleting Duplicate Records in Oracle based on Maximum Date/Time

问题

i have the following sample data with duplicate information:

ID   Date                 Emp_ID    Name    Keep
---------------------------------------------------------
1    17/11/2010 13:45:22  101       AB      *
2    17/11/2010 13:44:10  101       AB
3    17/11/2010 12:45:22  102       SF      *
4    17/11/2010 12:44:10  102       SF
5    17/11/2010 11:45:22  103       RD      *
6    17/11/2010 11:44:10  103       RD

Based on the above data set, how can I remove the duplicate Emp IDs and only keep the Emp IDs that have the maximum date/time specified?

So based on the above, I would only see IDs: 1, 3 and 5.

Thanks.

回答1:

Something like:

DELETE FROM the_table_with_no_name 
WHERE date_column != (SELECT MAX(t2.date_column) 
                      FROM the_table_with_no_name t2 
                      WHERE t2.id = the_table_with_no_name.id);

回答2:

You can generate the ROWIDs of all rows other than the one with the maximum date (for a given EMPIds) and delete them. I have found this to be performant as it is a set-based approach and uses analytics, rowIDs.

--get list of all the rows to be deleted.

select row_id from (
select rowid row_id,
       row_number() over (partition by emp_id order by date desc) rn
  from <table_name>
) where rn <> 1

And then delete the rows.

delete from table_name where rowid in (
    select row_id from (
    select rowid row_id,
           row_number() over (partition by emp_id order by date desc) rn
      from <table_name>
    ) where rn <> 1
);

来源：https://stackoverflow.com/questions/4194908/deleting-duplicate-records-in-oracle-based-on-maximum-date-time

标签

sql

plsql

oracle10g