Question
I have a table which looks like this:
mysql> SELECT * FROM Colors;
╔════╦══════════╦════════╦════════╦════════╦════════╦════════╦════════╗
║ ID ║ USERNAME ║ RED ║ GREEN ║ YELLOW ║ BLUE ║ ORANGE ║ PURPLE ║
╠════╬══════════╬════════╬════════╬════════╬════════╬════════╬════════╣
║ 1 ║ joe ║ 1 ║ (null) ║ 1 ║ (null) ║ (null) ║ (null) ║
║ 2 ║ joe ║ 1 ║ (null) ║ (null) ║ (null) ║ 1 ║ (null) ║
║ 3 ║ bill ║ 1 ║ 1 ║ 1 ║ (null) ║ (null) ║ 1 ║
║ 4 ║ bill ║ (null) ║ 1 ║ (null) ║ 1 ║ (null) ║ (null) ║
║ 5 ║ bill ║ (null) ║ 1 ║ (null) ║ (null) ║ (null) ║ (null) ║
║ 6 ║ bob ║ (null) ║ (null) ║ (null) ║ 1 ║ (null) ║ (null) ║
║ 7 ║ bob ║ (null) ║ (null) ║ (null) ║ (null) ║ (null) ║ 1 ║
║ 8 ║ bob ║ 1 ║ (null) ║ (null) ║ (null) ║ (null) ║ (null) ║
╚════╩══════════╩════════╩════════╩════════╩════════╩════════╩════════╝
I would like to run an UPDATE and a DELETE that find the duplicates, consolidate them into one record per user, and remove the leftover rows, so that the table ends up like this:
mysql> SELECT * FROM Colors;
╔════╦══════════╦═════╦════════╦════════╦════════╦════════╦════════╗
║ ID ║ USERNAME ║ RED ║ GREEN ║ YELLOW ║ BLUE ║ ORANGE ║ PURPLE ║
╠════╬══════════╬═════╬════════╬════════╬════════╬════════╬════════╣
║ 1 ║ joe ║ 1 ║ (null) ║ 1 ║ (null) ║ 1 ║ (null) ║
║ 3 ║ bill ║ 1 ║ 1 ║ 1 ║ 1 ║ (null) ║ 1 ║
║ 6 ║ bob ║ 1 ║ (null) ║ (null) ║ 1 ║ (null) ║ 1 ║
╚════╩══════════╩═════╩════════╩════════╩════════╩════════╩════════╝
I know I could easily do this with a script, but in the interest of learning and understanding MySQL better, I would like to learn how to do it in pure SQL.
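For anyone who wants to reproduce the table, here is a minimal setup sketch. The column types and the primary key are assumptions, since only the data is shown above.

```sql
-- Assumed schema: the types and the primary key are guesses, not taken from the original table.
CREATE TABLE Colors (
    ID       INT NOT NULL PRIMARY KEY,
    Username VARCHAR(50) NOT NULL,
    Red      TINYINT NULL,
    Green    TINYINT NULL,
    Yellow   TINYINT NULL,
    Blue     TINYINT NULL,
    Orange   TINYINT NULL,
    Purple   TINYINT NULL
);

-- Sample rows copied from the first table shown above.
INSERT INTO Colors (ID, Username, Red, Green, Yellow, Blue, Orange, Purple) VALUES
    (1, 'joe',  1,    NULL, 1,    NULL, NULL, NULL),
    (2, 'joe',  1,    NULL, NULL, NULL, 1,    NULL),
    (3, 'bill', 1,    1,    1,    NULL, NULL, 1),
    (4, 'bill', NULL, 1,    NULL, 1,    NULL, NULL),
    (5, 'bill', NULL, 1,    NULL, NULL, NULL, NULL),
    (6, 'bob',  NULL, NULL, NULL, 1,    NULL, NULL),
    (7, 'bob',  NULL, NULL, NULL, NULL, NULL, 1),
    (8, 'bob',  1,    NULL, NULL, NULL, NULL, NULL);
```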
Answer 1:
This is only a projection: it neither updates the table nor deletes any data.
SELECT MIN(ID)     AS ID,
       Username,
       MAX(Red)    AS max_Red,
       MAX(Green)  AS max_Green,
       MAX(Yellow) AS max_Yellow,
       MAX(Blue)   AS max_Blue,
       MAX(Orange) AS max_Orange,
       MAX(Purple) AS max_Purple
FROM Colors
GROUP BY Username;
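The grouping merges the rows because aggregate functions such as MAX() skip NULL values and return NULL only when every value in the group is NULL. A small sketch of that behaviour, using an inline derived table (the aliases t, v and merged are only for illustration):

```sql
-- MAX() ignores NULLs: a group containing NULL and 1 yields 1.
SELECT MAX(v) AS merged
FROM (SELECT NULL AS v UNION ALL SELECT 1) AS t;

-- A group containing only NULLs yields NULL.
SELECT MAX(v) AS merged
FROM (SELECT NULL AS v UNION ALL SELECT NULL) AS t;
```

This is why a color that is set in any of a user's rows shows up as 1 in the grouped result, while a color that is never set stays NULL.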
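UPDATE
If you really want to delete those records, run an UPDATE statement first so that the row you keep for each user (the one with the lowest ID) holds the merged values. Otherwise the values stored in the rows you delete are lost.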
UPDATE Colors a
INNER JOIN
(
    SELECT MIN(ID)     AS min_ID,
           Username,
           MAX(Red)    AS max_Red,
           MAX(Green)  AS max_Green,
           MAX(Yellow) AS max_Yellow,
           MAX(Blue)   AS max_Blue,
           MAX(Orange) AS max_Orange,
           MAX(Purple) AS max_Purple
    FROM Colors
    GROUP BY Username
) b ON a.ID = b.min_ID
SET a.Red    = b.max_Red,
    a.Green  = b.max_Green,
    a.Yellow = b.max_Yellow,
    a.Blue   = b.max_Blue,
    a.Orange = b.max_Orange,
    a.Purple = b.max_Purple;
Then you can delete the remaining records:
DELETE a
FROM Colors a
LEFT JOIN
(
    SELECT MIN(ID) AS min_ID,
           Username
    FROM Colors
    GROUP BY Username
) b ON a.ID = b.min_ID
WHERE b.min_ID IS NULL;
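Because the DELETE is destructive, one possible safeguard is to run both statements inside a transaction and check the result before committing. This is only a sketch, and it assumes the table uses a transactional storage engine such as InnoDB, which the question does not state.

```sql
START TRANSACTION;

-- ... run the UPDATE and then the DELETE shown above here ...

-- Sanity check: this should return no rows,
-- because every Username should now have exactly one record.
SELECT Username, COUNT(*) AS row_count
FROM Colors
GROUP BY Username
HAVING COUNT(*) > 1;

-- Keep the changes; use ROLLBACK instead if the check returned rows.
COMMIT;
```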
Answer 2:
Do you really need to update the underlying table? If not (and you simply want the resultset as shown in your example), you could simply group the table:
SELECT MIN(ID)     AS ID,
       Username    AS Username,
       MAX(Red)    AS Red,
       MAX(Green)  AS Green,
       MAX(Yellow) AS Yellow,
       MAX(Blue)   AS Blue,
       MAX(Orange) AS Orange,
       MAX(Purple) AS Purple
FROM Colors
GROUP BY Username;
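If the grouped result is needed repeatedly while the underlying table stays untouched, one option is to wrap the query in a view. The view name ConsolidatedColors below is hypothetical, and this is a sketch rather than something taken from the answer.

```sql
-- Hypothetical view name; the underlying Colors table is not modified.
CREATE VIEW ConsolidatedColors AS
SELECT MIN(ID)     AS ID,
       Username,
       MAX(Red)    AS Red,
       MAX(Green)  AS Green,
       MAX(Yellow) AS Yellow,
       MAX(Blue)   AS Blue,
       MAX(Orange) AS Orange,
       MAX(Purple) AS Purple
FROM Colors
GROUP BY Username;

-- Query it like a table.
SELECT * FROM ConsolidatedColors;
```

Because it uses GROUP BY and aggregate functions, such a view is read-only: it cannot be the target of an UPDATE or DELETE.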
Answer 3:
DELETE FROM Colors c1
WHERE EXISTS (SELECT 1
              FROM Colors c2
              WHERE c1.Username = c2.Username
                AND ((c1.Red    IS NULL AND c2.Red    IS NULL) OR c1.Red    = c2.Red)
                AND ((c1.Green  IS NULL AND c2.Green  IS NULL) OR c1.Green  = c2.Green)
                AND ((c1.Yellow IS NULL AND c2.Yellow IS NULL) OR c1.Yellow = c2.Yellow)
                AND ((c1.Blue   IS NULL AND c2.Blue   IS NULL) OR c1.Blue   = c2.Blue)
                AND ((c1.Orange IS NULL AND c2.Orange IS NULL) OR c1.Orange = c2.Orange)
                AND ((c1.Purple IS NULL AND c2.Purple IS NULL) OR c1.Purple = c2.Purple)
                AND c2.ID < c1.ID
             )
The NULLs make this a bit more complex, because in SQL NULL = NULL evaluates to unknown rather than true. If the columns held 0 and 1 instead, the part before the OR in each color condition could be omitted.
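In MySQL specifically, two things are worth knowing. The null-safe equality operator <=> treats two NULLs as equal, which collapses each pair of conditions into one. MySQL also documents a restriction on deleting from a table that the same statement reads in a subquery, so the multi-table DELETE form with a self-join may be needed instead. The following is a sketch under those assumptions, not a tested replacement. Like the statement above, it removes only rows whose color columns all match an earlier row of the same user. It does not merge rows, and in the sample data from the question no two rows of the same user match on every color.

```sql
-- Delete c1 when an earlier row (lower ID) of the same user has identical colors.
-- <=> is MySQL's null-safe equality: NULL <=> NULL is true.
DELETE c1
FROM Colors AS c1
INNER JOIN Colors AS c2
        ON c2.Username = c1.Username
       AND c2.ID < c1.ID
       AND c2.Red    <=> c1.Red
       AND c2.Green  <=> c1.Green
       AND c2.Yellow <=> c1.Yellow
       AND c2.Blue   <=> c1.Blue
       AND c2.Orange <=> c1.Orange
       AND c2.Purple <=> c1.Purple;
```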
Source: https://stackoverflow.com/questions/14404259/mysql-consolidate-duplicate-data-records-via-update-delete