Deleting duplicates from a large table

Posted by 冷眼眸甩不掉的悲伤 on 2019-11-30 11:49:52
Vinodkumar SC

I think you can use this query to delete the duplicate records from the table:

ALTER IGNORE TABLE table_name ADD UNIQUE (location_id, datetime)

Before running it on the real table, test it on some sample data first.

Note: on MySQL 5.5 this works on MyISAM tables but not on InnoDB tables.
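
For what it's worth, ALTER IGNORE TABLE was removed in MySQL 5.7.4, so on newer servers (or on InnoDB under 5.5) the statement above is not an option. A rough sketch of an alternative that I'd expect to give a similar result is to copy the rows into an empty clone that already has the unique key and let INSERT IGNORE skip the duplicates. The names table_name_dedup, table_name_old and uq_location_datetime are made up for this example, and which of the duplicate rows survives is not defined:

```sql
-- Empty clone with the same columns and indexes as the original
-- (CREATE TABLE ... LIKE does not copy foreign key definitions)
CREATE TABLE table_name_dedup LIKE table_name;

-- Add the unique key while the clone is still empty
ALTER TABLE table_name_dedup
    ADD UNIQUE KEY uq_location_datetime (location_id, datetime);

-- Rows that would violate the unique key are skipped instead of raising an error
INSERT IGNORE INTO table_name_dedup
SELECT * FROM table_name;

-- Swap the tables in one statement; the original is kept as table_name_old
RENAME TABLE table_name TO table_name_old,
             table_name_dedup TO table_name;
```

Compare the row counts of the two tables before dropping table_name_old.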

Sjoerd
-- List each (location_id, datetime) pair that occurs more than once
SELECT location_id, datetime, COUNT(*) AS Count
FROM table
GROUP BY location_id, datetime
HAVING Count > 1

-- Mark every row that has an earlier row for the same location_id
UPDATE table SET datetime = NULL
WHERE location_id IN (
    SELECT location_id
    FROM table AS tableBis
    WHERE tableBis.location_id = table.location_id
    AND table.datetime > tableBis.datetime)

-- Copy the unmarked rows into a new table
SELECT * INTO tableCopyWithNoDuplicate FROM table WHERE datetime IS NOT NULL

DROP TABLE table

RENAME tableCopyWithNoDuplicate TO table

This keeps the row with the earliest datetime for each location_id. I'm not sure about performance; it depends on your table's columns, your server, and so on.
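
On MySQL the UPDATE above will most likely be rejected with error 1093, because its subquery reads the table being updated. Here is a sketch of the same idea (keep the earliest datetime per location_id) as a single self-join DELETE, which works in place and so avoids the copy, drop and rename. table_name is a placeholder and the aliases are made up; as in the original, rows that tie on both location_id and datetime are all kept:

```sql
-- Delete every row for which an earlier row with the same location_id exists
DELETE later_row
FROM table_name AS later_row
JOIN table_name AS earlier_row
  ON earlier_row.location_id = later_row.location_id
 AND earlier_row.datetime < later_row.datetime;
```

An index on (location_id, datetime) should help on a large table, but I have not measured it.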

This query worked in every case I tried; tested on the MyISAM engine with 2 million rows:

ALTER IGNORE TABLE table_name ADD UNIQUE (location_id, datetime)

You can delete duplicates using these steps:

1- Export the results of the following query to a text file:

select dup_col from table1 group by dup_col having count(dup_col) > 1

2- Prepend the following to the contents of that text file and run the resulting query:

delete from table1 where dup_col in (.....)

Here '.....' stands for the contents of the text file created in the first step, written as a comma-separated list. Note that this deletes every row whose dup_col value occurs more than once, including the copy you may want to keep.
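
The export step can probably be skipped on MySQL by wrapping the duplicate-finding query in a derived table, which is the usual workaround for error 1093 (selecting from the table you are deleting from). This is only a sketch using the same table1 and dup_col names; the alias dups is made up, and it behaves like the manual version with respect to which rows are removed:

```sql
DELETE FROM table1
WHERE dup_col IN (
    -- The extra SELECT level lets MySQL build the list of duplicated values first
    SELECT dup_col
    FROM (
        SELECT dup_col
        FROM table1
        GROUP BY dup_col
        HAVING COUNT(dup_col) > 1
    ) AS dups
);
```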
