Updating a table to create unique ids from a substring in PostgreSQL

Question

I have table1 with the following columns and example data:

id, condition1, condition2, condition3, target_id
1, Westminster, Abbey Road, NW1 1FS, null
2, Westminster, Abbey Road, NW1 1FG, null
3, Westminster, China Road, NW1 1FG, null
4, Wandsworth, China Road, SE5 3LG, null
5, Wandsworth, China Road, SE5 3LS, null

Intended result for the target_id would be:

id, condition1, condition2, condition3, target_id
1, Westminster, Abbey Road, NW1 1FS, 1
2, Westminster, Abbey Road, NW1 1FG, 1
3, Westminster, China Road, NW1 1FG, 2
4, Wandsworth, China Road, SE5 3LG, 3
5, Wandsworth, China Road, SE5 3LS, 3

I'm trying to update target_id with a unique identifier based on grouping by condition1, condition2 and the first 5 characters of condition3.

Essentially, I think what I'm trying to do looks something like:

update table1 set target_id = (select "unique id" from table1 group by condition1, condition2, left(condition3, 5));

The goal is for every id to have a target_id that matches its set of characteristics from the 3 condition columns. How can I achieve this?
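
For anyone who wants to try the queries against the sample data, here is a minimal setup sketch. The column types are assumptions, since the question does not state them:

```sql
-- Assumed types: integer ids, text condition columns.
create table table1 (
    id         integer primary key,
    condition1 text,
    condition2 text,
    condition3 text,
    target_id  integer
);

-- Sample rows from the question; target_id starts out null.
insert into table1 (id, condition1, condition2, condition3) values
    (1, 'Westminster', 'Abbey Road', 'NW1 1FS'),
    (2, 'Westminster', 'Abbey Road', 'NW1 1FG'),
    (3, 'Westminster', 'China Road', 'NW1 1FG'),
    (4, 'Wandsworth',  'China Road', 'SE5 3LG'),
    (5, 'Wandsworth',  'China Road', 'SE5 3LS');
```

With this data, left(condition3, 5) yields 'NW1 1' for ids 1 to 3 and 'SE5 3' for ids 4 and 5.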

Answer 1:

Your question suggests that you want a query like this:

update table1 t1
    set target_id = (select "unique id"
                     from table1 tt1
                     where tt1.condition1 = t1.condition1 and
                           tt1.condition2 = t1.condition2 and
                           left(tt1.condition3, 5) = left(t1.condition3, 5)
                    );

However, this will likely fail with an error such as "more than one row returned by a subquery used as an expression", because several rows can match the same group. To fix that, you need either limit 1 or an aggregate function. Something like:

update table1 t1
    set target_id = (select max("unique id")
                     from table1 tt1
                     where tt1.condition1 = t1.condition1 and
                           tt1.condition2 = t1.condition2 and
                           left(tt1.condition3, 5) = left(t1.condition3, 5)
                    );
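
The two queries above read from a column called "unique id", which the sample table in the question does not have. Here is a minimal sketch of the same correlated-subquery pattern, assuming the smallest id in each group is acceptable as the group's identifier and that target_id is an integer column (neither is stated in the question):

```sql
-- Assumption: min(id) within each group serves as the group identifier.
update table1 t1
    set target_id = (select min(tt1.id)
                     from table1 tt1
                     where tt1.condition1 = t1.condition1 and
                           tt1.condition2 = t1.condition2 and
                           left(tt1.condition3, 5) = left(t1.condition3, 5)
                    );
```

On the sample data this would give target_id values 1, 1, 3, 4, 4 for ids 1 to 5: unique per group, but not consecutive. Rows where any of the three condition columns is null match nothing under =, so they would be left with a null target_id.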

EDIT:

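If you just want to number the groups consecutively, you can use dense_rank(), which assigns the same number to every row that shares the same condition1, condition2 and first 5 characters of condition3: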

update table1 t1
    set target_id = tt1.seqnum
    from (select t.id,
                 dense_rank() over (order by t.condition1, t.condition2, left(t.condition3, 5)) as seqnum
          from table1 t
         ) tt1
    where tt1.id = t1.id;
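
Note that dense_rank() numbers the groups in the sort order of the key, so on the sample data the Wandsworth group would get 1, Westminster/Abbey Road 2 and Westminster/China Road 3. That is a valid set of unique ids, but not the numbering shown in the question. Below is a sketch of one way to number the groups by the lowest id each contains instead, assuming that is the ordering the question intends. The names first_id and g are introduced here for illustration only:

```sql
update table1 t1
    set target_id = tt1.seqnum
    from (select g.id,
                 -- rank groups by the smallest id they contain
                 dense_rank() over (order by g.first_id) as seqnum
          from (select t.id,
                       min(t.id) over (partition by t.condition1,
                                                    t.condition2,
                                                    left(t.condition3, 5)) as first_id
                from table1 t
               ) g
         ) tt1
    where tt1.id = t1.id;

-- Inspect the result.
select id, condition1, condition2, condition3, target_id
from table1
order by id;
```

The extra subquery level is there because PostgreSQL does not allow one window function to be nested directly inside another. On the sample data, first_id comes out as 1, 1, 3, 4, 4, so the resulting target_id values are 1, 1, 2, 3, 3, matching the intended result.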

Source: https://stackoverflow.com/questions/59984975/updating-a-table-to-create-unique-ids-in-from-a-substring-in-postgresql

Tags

sql

postgresql