I recently found and fixed a bug in a site I was working on that resulted in millions of duplicate rows of data in a table that will be quite large even without them (still
I think the slowness is due to MySQL's (InnoDB) "clustered index", where the actual records are stored within the primary key index, in primary key order. This means access to a record via the primary key is extremely fast: it requires only one disk fetch, because the record is on disk right where the matching primary key was found in the index.
In other databases without clustered indexes, the index itself does not hold the record, just an "offset" or "location" indicating where the record is located in the table file; a second fetch must then be made in that file to retrieve the actual data.
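To make the one-fetch versus two-fetch difference concrete, here is a toy Python model. It is only an illustration under simplifying assumptions: the class names are hypothetical, and real indexes are B-trees spread over many pages, so actual fetch counts are higher than the 1 and 2 shown here.

```python
# Toy model: count "fetches" for a primary-key lookup under each layout.
# ClusteredTable and HeapTable are hypothetical names, not a real API.


class ClusteredTable:
    """Records live inside the primary-key index itself."""

    def __init__(self, rows):
        self.index = dict(rows)  # key -> full record

    def lookup(self, key):
        # One fetch: the index entry already holds the record.
        return self.index[key], 1


class HeapTable:
    """The index holds only a location; records live in a separate file."""

    def __init__(self, rows):
        self.heap = []   # stands in for the table file
        self.index = {}  # key -> offset into the table file
        for key, record in rows:
            self.index[key] = len(self.heap)
            self.heap.append(record)

    def lookup(self, key):
        offset = self.index[key]    # fetch 1: read the index
        record = self.heap[offset]  # fetch 2: read the table file
        return record, 2


rows = [(1, "alice"), (2, "bob"), (3, "carol")]
clustered_result = ClusteredTable(rows).lookup(2)
heap_result = HeapTable(rows).lookup(2)
print(clustered_result, heap_result)
```

Both lookups return the same record; only the number of steps needed to reach it differs.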
You can imagine that when a record is deleted from a clustered index, all records above it in the table must be moved downwards to avoid massive holes being created in the index (well, that is what I recall from a few years ago at least - later versions may have changed this).
Knowing the above, what we found really sped up deletes in MySQL was performing them in reverse order. This produces the least record movement because you are deleting records from the end first, meaning that subsequent deletes have fewer records to relocate.
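A minimal sketch of the reverse-order idea, deleting in batches from the highest primary key downwards. Everything specific here is an assumption for illustration: the table name events, the is_duplicate flag, the batch size, and the use of Python's built-in sqlite3 module as a stand-in so the example runs anywhere. It shows only the ordering of the deletes, not the speed-up, which would have to be measured on a real MySQL table.

```python
import sqlite3


def delete_in_reverse(conn, batch_size=1000):
    """Delete flagged rows in batches, highest primary key first."""
    total = 0
    while True:
        # Pick the next batch of ids from the end of the table.
        cur = conn.execute(
            "DELETE FROM events WHERE id IN ("
            "  SELECT id FROM events WHERE is_duplicate = 1"
            "  ORDER BY id DESC LIMIT ?)",
            (batch_size,),
        )
        conn.commit()  # keep each batch in its own short transaction
        if cur.rowcount == 0:
            break
        total += cur.rowcount
    return total


# Hypothetical demo data: 10,000 rows, every odd id flagged as a duplicate.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, is_duplicate INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO events (id, is_duplicate) VALUES (?, ?)",
    [(i, i % 2) for i in range(1, 10001)],
)
conn.commit()

deleted = delete_in_reverse(conn, batch_size=1500)
remaining = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(deleted, remaining)
```

In MySQL itself the subquery should not be needed, since a single-table DELETE accepts ORDER BY and LIMIT directly, for example DELETE FROM events WHERE is_duplicate = 1 ORDER BY id DESC LIMIT 1500.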
According to the MySQL documentation, TRUNCATE TABLE is a fast alternative to DELETE FROM. Try this:
TRUNCATE TABLE table_name;
I tried this on 50M rows and it was done within two minutes.
Note: Truncate operations are not transaction-safe; an error occurs when attempting one in the course of an active transaction or active table lock.