UPDATE with ORDER BY

Submitted by 回眸只為那壹抹淺笑 on 2019-11-28 12:01:38
Erwin Brandstetter

UPDATE with ORDER BY

As to the question raised in the title: There is no ORDER BY in an SQL UPDATE command. Postgres updates rows in arbitrary order. But you have (limited) options to decide whether constraints are checked after each row, after each statement or at the end of the transaction. You can avoid duplicate key violations for intermediate states with a DEFERRABLE constraint.

I am quoting what we worked out under this question:
Constraint defined DEFERRABLE INITIALLY IMMEDIATE is still DEFERRED?

  • NOT DEFERRED constraints are checked after each row.

  • DEFERRABLE constraints set to IMMEDIATE (INITIALLY IMMEDIATE or via SET CONSTRAINTS) are checked after each statement.

There are limitations, though. Foreign key constraints require non-deferrable constraints on the target column(s).

The referenced columns must be the columns of a non-deferrable unique or primary key constraint in the referenced table.
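A minimal sketch of the DEFERRABLE approach, assuming a table shaped like the one in the question. The table definition, the tbl_id column and the constraint name tbl_line_seq_uni are hypothetical; only tbl, "CableLine" and "sequence" appear in this answer.

```sql
-- Hypothetical table; the constraint name is an assumption for illustration.
CREATE TABLE tbl (
    tbl_id      serial PRIMARY KEY,
    "CableLine" integer NOT NULL,
    "sequence"  integer NOT NULL,
    -- DEFERRABLE INITIALLY IMMEDIATE: uniqueness is checked after each
    -- statement instead of after each row.
    CONSTRAINT tbl_line_seq_uni UNIQUE ("CableLine", "sequence")
        DEFERRABLE INITIALLY IMMEDIATE
);

INSERT INTO tbl ("CableLine", "sequence") VALUES (2, 1), (2, 2), (2, 3);

-- Shifts every row of cable line 2 up by one. Intermediate duplicates
-- inside the statement are tolerated because the check runs at its end.
UPDATE tbl
SET    "sequence" = "sequence" + 1
WHERE  "CableLine" = 2;
```

With a plain (non-deferrable) UNIQUE constraint, the same UPDATE can fail with a duplicate key error, depending on the order in which rows happen to be processed. The trade-off, as noted above, is that a foreign key cannot reference the deferrable constraint.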

Workaround

Updated after question update.
Assuming "sequence" is never negative in normal operation, you can avoid unique errors like this:

UPDATE tbl SET "sequence" = ("sequence" + 1) * -1
WHERE  "CableLine" = 2;

UPDATE tbl SET "sequence" = "sequence" * -1
WHERE  "CableLine" = 2
AND    "sequence" < 0;

With a non-deferrable constraint (default), you have to run two separate commands to make this work. Run the commands in quick succession to avoid concurrency issues. The solution is obviously not fit for heavy concurrent load.

Aside:
It's OK to skip the key word AS for table aliases, but it's discouraged to do the same for column aliases.

I'd advise against using SQL key words as identifiers, even though that's allowed.

Avoid the problem

On a bigger scale or for databases with heavy concurrent load, it's wiser to use a serial column for relative ordering of rows. You can generate numbers starting with 1 and no gaps with the window function row_number() in a view or query. Consider this related answer:
Is it possible to use a PG sequence on a per record label?
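A rough sketch of that idea, assuming a hypothetical table cable and view cable_numbered (neither name is from the question). The serial column only fixes the relative order; the gapless numbers are computed when the view is read, so no UPDATE is needed to renumber rows.

```sql
-- Hypothetical table: cable_id only establishes relative order and
-- may contain gaps after deletes.
CREATE TABLE cable (
    cable_id    serial PRIMARY KEY,
    "CableLine" integer NOT NULL
);

-- Gapless numbers starting at 1 per cable line, derived on the fly.
CREATE VIEW cable_numbered AS
SELECT cable_id,
       "CableLine",
       row_number() OVER (PARTITION BY "CableLine"
                          ORDER BY cable_id) AS seq
FROM   cable;
```

Inserting or deleting a row never touches other rows, so there is no window for unique violations.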

alexkovelsky

UPDATE with ORDER BY:

UPDATE thetable 
  SET columntoupdate=yourvalue 
 FROM (SELECT rowid, 'thevalue' AS yourvalue 
         FROM thetable 
        ORDER BY rowid
      ) AS t1 
WHERE thetable.rowid=t1.rowid;

The order in which rows are updated is still arbitrary (I assume), but each row gets its value through the join condition thetable.rowid = t1.rowid. In effect, I first build the 'updated' table in memory as the derived table t1, and then make the physical table match t1. The update order no longer matters.
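To make the pattern concrete, here is a sketch that assigns consecutive numbers instead of a constant. It reuses the placeholder names thetable, rowid and columntoupdate from the statement above; the alias rn and the use of row_number() are my assumptions about what an "ordered" update typically needs.

```sql
-- The ordering happens inside the derived table, where each row's new
-- value is computed; the UPDATE itself only copies values over by key.
UPDATE thetable AS t
SET    columntoupdate = t1.rn
FROM  (
    SELECT rowid,
           row_number() OVER (ORDER BY rowid) AS rn  -- 1, 2, 3, ... in rowid order
    FROM   thetable
    ) AS t1
WHERE  t.rowid = t1.rowid;
```

Note that a bare ORDER BY in the derived table has no effect on the result when the supplied value is a constant; the order only matters once it feeds a computation such as the window function here.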

As for true ordered UPDATE, I don't think it could be useful to anyone.

Update with Order By
DECLARE
    v NUMBER := 0;  -- running counter
    CURSOR c1 IS
        SELECT col2 FROM table1 ORDER BY col2;
BEGIN
    FOR c IN c1 LOOP
        v := v + 1;  -- advance the counter for each row, in col2 order
        UPDATE table1
        SET    col1 = v
        WHERE  col2 = c.col2;
    END LOOP;
    COMMIT;
END;

Lazy Way, (aka not fastest or best way)

CREATE OR REPLACE FUNCTION row_number(table_name text, update_column text, start_value integer, offset_value integer, order_by_column text, order_by_descending boolean)
  RETURNS void AS
$BODY$
DECLARE
    total_value integer;
    my_id text;
    command text;
BEGIN
total_value = start_value;
    command = 'SELECT ' || order_by_column || ' FROM ' || table_name || ' ORDER BY '  || order_by_column;

    if (order_by_descending) THEN
        command = command || ' desc';
    END IF;

    FOR  my_id in  EXECUTE command LOOP
        -- quote_literal() keeps the statement valid for non-numeric key values
        command = 'UPDATE ' || table_name || ' SET  ' || update_column || ' = ' || total_value || ' WHERE ' || order_by_column || ' = ' || quote_literal(my_id) || ';';

        EXECUTE command;
        total_value = total_value + offset_value;
    END LOOP;
END;
$BODY$
  LANGUAGE plpgsql VOLATILE
  COST 100;

Example

SELECT row_number('regispro_spatial_2010.ags_states_spatial', 'order_id', 10,1, 'ogc_fid', true)

This worked for me:

[update statement here] OPTION (MAXDOP 1) -- prevent row size from causing use of an eager spool, which mutilates the order in which records are updated.

I use a clustered int index in sequential order (generating one if needed) and had no problems until recently, and even then only on small rowsets for which the query plan optimizer (counterintuitively) decided to use a lazy spool.

Theoretically I could use the new option to disallow spool use, but I find maxdop simpler.

I am in a unique situation because the calculations are isolated (single user). A different situation may require an alternative to using maxdop limit to avoid contention.
