问题
I have a table like this, where, suppose for the sake of an example, NAME
is a unique identifier.
NAME AGE VALUE
Jack Under 65 3
Jack 66-74 5
John 66-74 7
John Over 75 9
Gill 25-35 11
Some NAME
s have more than one AGE
, which is undesirable, as this is due to dirtiness of the data.
My aim is to update the duplicates only to have one AGE
within each NAME
. The desired output is thus:
NAME AGE VALUE
Jack Under 65 3
Jack Under 65 5
John 66-74 7
John 66-74 9
Gill 25-35 11
Something like this UPDATE statement should work, but it doesn't.
UPDATE table t1
SET t1.age=MAX(t1.age)
WHERE EXISTS (SELECT COUNT(t2.AGE)
FROM table t2
WHERE t1.NAME=t2.NAME
GROUP BY t2.NAME
HAVING COUNT(t2.AGE) > 1)
SQL Error: ORA-00934: group function is not allowed here
Second issue
Even if I got the above statement to work, there is a second issue. The idea there is to use the MAX
(or MIN
) function on strings to set the same value for all repeats within a group.
But unfortunately, this too would not quite work as desired. For consistency, ideally an age would default to the lowest age group. But because MAX/MIN
compare alphabetic order on strings, this would give, e.g.:
- "66-74" and "Under 65" => MAX="Under 65" -- Lowest
- "66-74" and "Over 75" => MAX="Over 75" -- Highest
There are only four age groups, would it be possible to specify a custom order?
- NB1: I am using Oracle SQL.
- NB2: I do not mind if there is a way to achieve the result using a SELECT instead of an UPDATE statement.
Reproducible example
SELECT 'Jack' as NAME, 'Under 65' as AGE, 3 as VALUE from dual
UNION ALL
SELECT 'Jack' as NAME, '66-74' as AGE, 5 as VALUE from dual
UNION ALL
SELECT 'John' as NAME, '66-74' as AGE, 7 as VALUE from dual
UNION ALL
SELECT 'John' as NAME, 'Over 75' as AGE, 9 as VALUE from dual
UNION ALL
SELECT 'Gill' as NAME, '25-35' as AGE, 11 as VALUE from dual
回答1:
You can define custom order with case when
clause and then use analytic max()
. This worked for given examples:
update t1 set age = (
select max(age) keep (dense_rank last
order by case when age = 'Over 75' then 1
when age = '66-74' then 2
when age = 'Under 65' then 3
when age = '25-35' then 4
end)
from t1 tx where tx.name = t1.name )
来源:https://stackoverflow.com/questions/45980467/update-duplicate-rows-only-with-a-max-function-in-sql