How to remove duplicate entries from a mysql db?

南旧 2020-12-02 10:41

I have a table with some ids + titles. I want to make the title column unique, but it has over 600k records already, some of which are duplicates (sometimes several dozen ti

8 answers
  •  一个人的身影
    2020-12-02 10:57

    Since MySQL's ALTER IGNORE TABLE has been deprecated (and was removed in MySQL 5.7), you need to actually delete the duplicate data before adding a unique index.

    First write a query that finds all the duplicates. Here I'm assuming that email is the field that contains duplicates.

    SELECT
        s1.email,
        s1.id,
        s1.created,
        s2.id,
        s2.created
    FROM 
        student AS s1 
    INNER JOIN 
        student AS s2 
    WHERE 
        /* Emails are the same */
        s1.email = s2.email AND
        /* DON'T select both accounts,
           only select the one created later.
           The serial id could also be used here */
        s2.created > s1.created 
    ;
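
    Before looking at individual pairs, it can help to see how many rows share each value. A minimal sketch, assuming the same student table and email column as above (the copies alias is only illustrative):

    ```sql
    -- One row per email that occurs more than once, most-duplicated first.
    SELECT
        email,
        COUNT(*) AS copies
    FROM
        student
    GROUP BY
        email
    HAVING
        COUNT(*) > 1
    ORDER BY
        copies DESC
    ;
    ```

    Running the same query again after the duplicates have been deleted should return no rows if every duplicate is gone.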
    

    Next select only the unique duplicate ids:

    SELECT 
        DISTINCT s2.id
    FROM 
        student AS s1 
    INNER JOIN 
        student AS s2 
    WHERE 
        s1.email = s2.email AND
        s2.created > s1.created 
    ;
    

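    Once you are sure that the result contains only the duplicate ids you want to delete, run the delete. You have to wrap each reference to the table inside the subquery as (SELECT * FROM tblname) so that MySQL doesn't complain that the target table of the DELETE also appears in the FROM clause (error 1093).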

    DELETE FROM
        student 
    WHERE
        id
    IN (
        SELECT 
            DISTINCT s2.id
        FROM 
            (SELECT * FROM student) AS s1 
        INNER JOIN 
            (SELECT * FROM student) AS s2 
        WHERE 
            s1.email = s2.email AND
            s2.created > s1.created 
    );
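
    If the derived-table wrapping feels awkward, MySQL's multi-table DELETE syntax may be an alternative worth testing on a copy of the data first. This sketch is an assumption rather than part of the steps above: it assumes id is a unique, auto-incrementing key, and it compares ids instead of created so that rows sharing the same timestamp are also removed.

    ```sql
    -- Delete every row for which another row exists with the same email
    -- and a smaller id, which keeps only the lowest-id row per email.
    DELETE
        s2
    FROM
        student AS s1
    INNER JOIN
        student AS s2
    ON
        s1.email = s2.email AND
        s2.id > s1.id
    ;
    ```

    Note that this keeps the row with the lowest id, which is not necessarily the row with the earliest created value if ids were not assigned in creation order.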
    

    Then create the unique index:

    ALTER TABLE
        student
    ADD UNIQUE INDEX
        idx_student_unique_email(email)
    ;
    
