Parse & Compare Data using ColdFusion & MySQL

情书的邮戳 2020-12-12 04:25

First, I'll explain what I need to do, then how I think I can achieve it. My current plan seems very inefficient in theory, so my question is whether there is a b

3 Answers
  •  自闭症患者
    2020-12-12 05:00

    Both responses have merit. Just to expand on your options a little:

    Option #1

    If MySQL supports some sort of per-row hashing, you could use a variation of comodoro's suggestion to avoid hard deletes.
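
    As a rough sketch of what per-row hashing could look like: MySQL's built-in MD5() and CONCAT_WS() functions can produce a hash over the compared columns. The TheRowHash column name is taken from the queries in this answer; its CHAR(32) type, the '|' separator, the '<null>' sentinel, and the choice to hash only ProductName and Stock are assumptions for illustration.

    ```sql
    -- Assumption: TheRowHash is a CHAR(32) column that exists on Products_Temp.
    -- COALESCE maps NULL to a sentinel because CONCAT_WS() skips NULL arguments,
    -- which would otherwise let different rows produce the same hash.
    UPDATE Products_Temp
    SET    TheRowHash = MD5(CONCAT_WS('|'
                          , COALESCE(ProductName, '<null>')
                          , COALESCE(CAST(Stock AS CHAR), '<null>')
                      ));
    ```

    The same expression would have to be applied to Products (or the hash copied across whenever a row is updated or inserted) for the comparison to stay meaningful on the next import.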

    Identify Changed

    To identify changes, do an inner join on the primary key and check the hash values. If they are different, the product was changed and should be updated:

        UPDATE Products p INNER JOIN Products_Temp tmp ON tmp.ProductID = p.ProductID
        SET    p.ProductName = tmp.ProductName
               , p.Stock = tmp.Stock
               , ...
               , p.DateLastChanged = now()
               , p.IsDiscontinued  = 0
        WHERE  tmp.TheRowHash <> p.TheRowHash
    

    Identify Deleted

    Use a simple outer join to identify records that do not exist in the temp table, and flag them as "deleted".

        UPDATE Products p LEFT JOIN Products_Temp tmp ON tmp.ProductID = p.ProductID
        SET    p.DateLastChanged = now()
               , p.IsDiscontinued = 1
        WHERE  tmp.ProductID IS NULL
    

    Identify New

    Finally, use a similar outer join to insert any "new" products.

        INSERT INTO Products ( ProductName, Stock, DateLastChanged, IsDiscontinued, .. )
        SELECT tmp.ProductName, tmp.Stock, now() AS DateLastChanged, 0 AS IsDiscontinued, ...
        FROM   Products_Temp tmp LEFT JOIN Products p ON tmp.ProductID = p.ProductID
        WHERE  p.ProductID IS NULL
    

    Option #2

    If per-row hashing is not feasible, an alternative approach is a variation of Sharondio's suggestion.

    Add a "status" column to the temp table and flag all imported records as "new", "changed" or "unchanged" through a series of joins. (The default should be "changed").
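
    Adding the column might look something like the following. The VARCHAR(10) type and NOT NULL constraint are assumptions; only the column's purpose and its default come from the description above.

    ```sql
    -- Assumption: Status is a short string column. 'Changed' is the default,
    -- so any row not matched by the later joins is treated as changed.
    ALTER TABLE Products_Temp
        ADD COLUMN Status VARCHAR(10) NOT NULL DEFAULT 'Changed';
    ```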

    Identify Unchanged

    First use an inner join, on all fields, to identify products that have NOT changed. (Note: if your table contains any nullable fields, remember to use something like COALESCE. Otherwise, the results may be skewed because NULL values are not equal to anything.)

        UPDATE  Products_Temp tmp INNER JOIN Products p ON tmp.ProductID = p.ProductID
        SET     tmp.Status = 'Unchanged'
        WHERE   p.ProductName = tmp.ProductName
        AND     p.Stock = tmp.Stock
        ... 
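
    For nullable columns, one alternative to wrapping each column in COALESCE is MySQL's NULL-safe equality operator, <=>. This is a substitution for the COALESCE approach mentioned above, not part of the original suggestion. A sketch, assuming ProductName and Stock are the only compared columns:

    ```sql
    -- <=> behaves like = except that NULL <=> NULL is true
    -- and NULL <=> 'x' is false (rather than NULL).
    UPDATE  Products_Temp tmp INNER JOIN Products p ON tmp.ProductID = p.ProductID
    SET     tmp.Status = 'Unchanged'
    WHERE   p.ProductName <=> tmp.ProductName
    AND     p.Stock <=> tmp.Stock;
    ```

    Unlike COALESCE, this needs no sentinel value that might collide with real data, though it is specific to MySQL.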
    

    Identify New

    Like before, use an outer join to identify "new" records.

        UPDATE  Products_Temp tmp LEFT JOIN Products p ON tmp.ProductID = p.ProductID
        SET     tmp.Status = 'New'
        WHERE   p.ProductID IS NULL
    

    By process of elimination, all other records in the temp table are "changed". Once you have calculated the statuses, you can update the Products table:

        /*  update changed products */
        UPDATE Products p INNER JOIN Products_Temp tmp ON tmp.ProductID = p.ProductID
        SET    p.ProductName = tmp.ProductName
               , p.Stock = tmp.Stock
               , ...
               , p.DateLastChanged = now()
               , p.IsDiscontinued = 0
        WHERE  tmp.Status = 'Changed'
    
        /*  insert new products */
        INSERT INTO Products ( ProductName, Stock, DateLastChanged, IsDiscontinued, .. )
        SELECT tmp.ProductName, tmp.Stock, now() AS DateLastChanged, 0 AS IsDiscontinued, ...
        FROM   Products_Temp tmp
        WHERE  tmp.Status = 'New'
    
        /* flag deleted records */
        UPDATE Products p LEFT JOIN Products_Temp tmp ON tmp.ProductID = p.ProductID
        SET    p.DateLastChanged = now()
               , p.IsDiscontinued = 1
        WHERE  tmp.ProductID IS NULL
    
