How Do I Deep Copy a Set of Data, and Change FK References to Point to All the Copies?

Asked by 抹茶落季 on 2020-12-17 17:07

Suppose I have Table A and Table B, where Table B references Table A. I want to deep copy a set of rows in Table A and Table B, and I want all of the new Table B rows to reference the new Table A rows rather than the originals.

2 Answers
  • 2020-12-17 17:50

    Here is an example with three tables that can probably get you started.

    DB schema

    CREATE TABLE users
        (user_id int auto_increment PRIMARY KEY, 
         user_name varchar(32));
    CREATE TABLE agenda
        (agenda_id int auto_increment PRIMARY KEY, 
         `user_id` int, `agenda_name` varchar(7));
    CREATE TABLE events
        (event_id int auto_increment PRIMARY KEY, 
         `agenda_id` int, 
         `event_name` varchar(8));
    

A stored procedure to clone a user together with their agenda and events rows:

    DELIMITER $$
    CREATE PROCEDURE clone_user(IN uid INT)
    BEGIN
        DECLARE last_user_id INT DEFAULT 0;
    
        -- 1. Copy the user row and remember the new user_id.
        INSERT INTO users (user_name)
        SELECT user_name
          FROM users
         WHERE user_id = uid;
    
        SET last_user_id = LAST_INSERT_ID();
    
        -- 2. Copy the user's agenda rows, pointing them at the new user.
        INSERT INTO agenda (user_id, agenda_name)
        SELECT last_user_id, agenda_name
          FROM agenda
         WHERE user_id = uid;
    
        -- 3. Copy the events: pair each old agenda row with its copy by
        --    rank within the user (the @n/@m counters emulate
        --    ROW_NUMBER(), available directly on MySQL 8+), then join
        --    events through that old-id/new-id map.
        INSERT INTO events (agenda_id, event_name)
        SELECT a3.agenda_id_new, e.event_name
          FROM events e JOIN
        (SELECT a1.agenda_id agenda_id_old, 
               a2.agenda_id agenda_id_new
          FROM
        (SELECT agenda_id, @n := @n + 1 n 
           FROM agenda, (SELECT @n := 0) n 
          WHERE user_id = uid 
          ORDER BY agenda_id) a1 JOIN
        (SELECT agenda_id, @m := @m + 1 m 
           FROM agenda, (SELECT @m := 0) m 
          WHERE user_id = last_user_id 
          ORDER BY agenda_id) a2 ON a1.n = a2.m) a3 
             ON e.agenda_id = a3.agenda_id_old;
    END$$
    DELIMITER ;
    

    To clone a user

    CALL clone_user(3);
    

    Here is a SQLFiddle demo.
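    The stored procedure pairs old and new agenda rows implicitly by insertion order. For comparison, here is a minimal Python/sqlite3 sketch of the same deep copy that builds an explicit old-id to new-id map instead; the table and column names follow the schema above, everything else is illustrative.

```python
import sqlite3

def clone_user(conn, uid):
    """Deep-copy one user, their agenda rows, and the agendas' events.

    Returns the new user_id. Unlike the stored procedure above, this
    records which new agenda_id replaced which old one in a dict, so
    no row-numbering trick is needed.
    """
    cur = conn.cursor()
    cur.execute("INSERT INTO users (user_name) "
                "SELECT user_name FROM users WHERE user_id = ?", (uid,))
    new_uid = cur.lastrowid

    # Copy each agenda row, remembering old agenda_id -> new agenda_id.
    id_map = {}
    for old_id, name in cur.execute(
            "SELECT agenda_id, agenda_name FROM agenda WHERE user_id = ?",
            (uid,)).fetchall():
        cur.execute("INSERT INTO agenda (user_id, agenda_name) VALUES (?, ?)",
                    (new_uid, name))
        id_map[old_id] = cur.lastrowid

    # Copy the events, re-pointing each copy at the new agenda row.
    for old_id, new_id in id_map.items():
        cur.execute("INSERT INTO events (agenda_id, event_name) "
                    "SELECT ?, event_name FROM events WHERE agenda_id = ?",
                    (new_id, old_id))
    conn.commit()
    return new_uid
```

    The dict-based mapping is what makes this easy to extend to more child tables: each level only needs the map produced by the level above it.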

  • 2020-12-17 18:09

    I recently found myself needing to solve a similar problem: I needed to copy a set of rows in a table (Table A) as well as all of the rows in related tables whose foreign keys point to Table A's primary key. I was using Postgres, so the exact queries may differ, but the overall approach is the same. The biggest benefit of this approach is that it can be applied recursively to go arbitrarily deep.

    TL;DR: the approach looks like this:

    1) find all the related table/columns of Table A
    2) copy the necessary data into temporary tables
    3) create a trigger and function to propagate primary key column 
       updates to related foreign key columns in the temporary tables
    4) update the primary key column in the temporary tables to the next 
       value in the auto increment sequence
    5) Re-insert the data back into the source tables, and drop the 
       temporary tables/triggers/function
    

    1) The first step is to query the information schema to find all of the tables and columns which are referencing Table A. In Postgres this might look like the following:

    SELECT tc.table_name, kcu.column_name
    FROM information_schema.table_constraints tc
    JOIN information_schema.key_column_usage kcu
    ON tc.constraint_name = kcu.constraint_name
    JOIN information_schema.constraint_column_usage ccu
    ON ccu.constraint_name = tc.constraint_name
    WHERE tc.constraint_type = 'FOREIGN KEY'
    AND ccu.table_name = '<Table A>'
    AND ccu.column_name = '<Primary Key>';
    

    2) Next we need to copy the data from Table A and from any other tables which reference it; let's say there is one called Table B. To start, let's create a temporary table for each of these tables and populate it with the data we need to copy. This might look like the following:

    CREATE TEMP TABLE temp_table_a AS (
        SELECT * FROM <Table A> WHERE ...
    );
    
    CREATE TEMP TABLE temp_table_b AS (
        SELECT * FROM <Table B> WHERE <Foreign Key> IN (
            SELECT <Primary Key> FROM temp_table_a
        )
    );
    

    3) We can now define a function that will cascade primary key updates out to the related foreign key columns, and a trigger that will execute it whenever the primary key column changes. For example:

    CREATE OR REPLACE FUNCTION cascade_temp_table_a_pk()
    RETURNS trigger AS
    $$
    BEGIN
       UPDATE <Temp Table B> SET <Foreign Key> = NEW.<Primary Key>
       WHERE <Foreign Key> = OLD.<Primary Key>;
    
       RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;
    
    CREATE TRIGGER trigger_temp_table_a
    AFTER UPDATE
    ON <Temp Table A>
    FOR EACH ROW
    WHEN (OLD.<Primary Key> != NEW.<Primary Key>)
    EXECUTE PROCEDURE cascade_temp_table_a_pk();
    

    4) Now we just update the primary key column in <Temp Table A> to the next value of the source table's sequence (<Table A>'s). This will activate the trigger, and the updates will be cascaded out to the foreign key columns in <Temp Table B>. In Postgres you can do the following:

    UPDATE <Temp Table A>
    SET <Primary Key> = nextval(pg_get_serial_sequence('<Table A>', '<Primary Key>'))
    

    5) Insert the data from the temporary tables back into the source tables, then drop the temporary tables, trigger, and function:

    INSERT INTO <Table A> SELECT * FROM <Temp Table A>;
    INSERT INTO <Table B> SELECT * FROM <Temp Table B>;
    DROP TRIGGER trigger_temp_table_a ON <Temp Table A>;
    DROP FUNCTION cascade_temp_table_a_pk();
    DROP TABLE <Temp Table A>, <Temp Table B>;
    

    It is possible to take this general approach and turn it into a script which can be called recursively in order to go arbitrarily deep. I ended up doing just that in Python (our application used Django, so I was able to use the Django ORM to make some of this easier).
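    As a sketch of that recursive idea, here is a hedged Python/sqlite3 version: SQLite's `PRAGMA foreign_key_list` plays the role of the `information_schema` query from step 1, and new keys come from SQLite's implicit rowid sequence rather than `nextval()`. All names are illustrative, identifiers are assumed trusted (they are interpolated into the SQL), and circular foreign keys are not handled.

```python
import sqlite3

def deep_copy(conn, table, pk, row_ids):
    """Recursively copy the given rows of `table` and every child row
    that references them, re-pointing the copies' foreign keys at the
    new primary keys. Returns {old_pk: new_pk} for `table`.
    """
    if not row_ids:
        return {}
    cur = conn.cursor()

    # Copy the requested rows, recording old pk -> new pk.
    cols = [r[1] for r in cur.execute(f"PRAGMA table_info({table})")]
    copy_cols = ", ".join(c for c in cols if c != pk)
    id_map = {}
    for old_id in row_ids:
        cur.execute(f"INSERT INTO {table} ({copy_cols}) "
                    f"SELECT {copy_cols} FROM {table} WHERE {pk} = ?",
                    (old_id,))
        id_map[old_id] = cur.lastrowid

    # Step 1 of the answer: find child tables whose FKs point at table.pk.
    tables = [r[0] for r in cur.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    for child in tables:
        for fk in cur.execute(f"PRAGMA foreign_key_list({child})").fetchall():
            parent, fk_col, parent_col = fk[2], fk[3], fk[4]
            if parent != table or (parent_col or pk) != pk:
                continue
            child_pk = next(r[1] for r in cur.execute(
                f"PRAGMA table_info({child})").fetchall() if r[5])
            marks = ", ".join("?" * len(row_ids))
            child_ids = [r[0] for r in cur.execute(
                f"SELECT {child_pk} FROM {child} WHERE {fk_col} IN ({marks})",
                list(row_ids))]
            # Recurse, then re-point each copied child at its new parent.
            for old_child, new_child in deep_copy(
                    conn, child, child_pk, child_ids).items():
                old_fk = cur.execute(
                    f"SELECT {fk_col} FROM {child} WHERE {child_pk} = ?",
                    (old_child,)).fetchone()[0]
                cur.execute(
                    f"UPDATE {child} SET {fk_col} = ? WHERE {child_pk} = ?",
                    (id_map[old_fk], new_child))
    conn.commit()
    return id_map
```

    In Postgres the same skeleton works with the `information_schema` query from step 1 in place of the PRAGMAs and `nextval()` in place of SQLite's rowid assignment.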
