postgresql - script that uses transaction blocks fails to create all records

Question

Background information

I am trying to understand PostgreSQL well enough to ensure that my web app will not fail when users simultaneously try to update the same record.

For testing purposes, I have created two scripts - one script creates / defines the transaction block and the second script simulates load (and attempts to create conflicts) by calling script 1.

Code

Here's what script 1 looks like:

http://pastebin.com/6BKyx1dW

And here's what script two looks like:

http://pastebin.com/cHUhCBL2

To simulate load on the database and to test for locking issues, I'm calling script 2 from two different command-line windows on my server. I pass in two different sets of parameters so that when I'm analyzing the results in the database I can see which session created which record.

Problem

When I query the database to count how many records each instance of the script created, I am not consistently getting 200 each. I've captured the output from each instance of the script to see whether any rollbacks were reported, but there are none. So I have two theories:

1. The server where these scripts are being executed is not robust enough, so the requests aren't making it to the database server.
2. The database server is silently aborting transactions.

To eliminate theory 1, I'm going to set up two different servers and run the script once from each server, instead of opening two command-line windows on one server. If the number of records created increases, I guess that will tell me that performance on the current server is an issue. (The "server" I'm currently running the scripts on is just a glorified desktop, so it could very well be the issue.)

Regarding theory number 2, I've been trying to read and understand http://www.postgresql.org/docs/current/static/explicit-locking.html but since I'm not a database expert, it's taking me a little while to digest everything. I do know that with MS SQL Server, if a record is locked by transaction A, transaction B will wait indefinitely until A is finished. With SQLite, out of the box, transaction B dies, but you can specify the number of milliseconds to wait before retrying.

The last paragraph of the PostgreSQL documentation linked above says that PostgreSQL will also wait indefinitely for conflicting locks to be released, but I'm not 100% sure that I'm not messing something up in my SQL code.

So my questions are as follows:

1. Am I doing anything obviously wrong in my SQL code?
2. How can I test the SQL side of things to see what locks are being used and what's happening under the hood?

EDIT 1

I ran the script again from 2 separate machines. Machine 1 successfully created 122 records and machine 2 created 183.

Answer 1:

Yes, you are doing something wrong.
Look at a simple example.

Session 1

postgres=# select * from user_reservation_table;
 id | usedyesno | userid | uservalue
----+-----------+--------+-----------
  1 | f         |      0 |         1
  2 | f         |      0 |         2
  3 | f         |      0 |         3
  4 | f         |      0 |         4
  5 | f         |      0 |         5
(5 wierszy)


postgres=# \set user 1
postgres=#
postgres=# begin;
BEGIN
postgres=# UPDATE user_reservation_table
postgres-# SET UsedYesNo = true, userid=:user
postgres-# WHERE uservalue IN(
postgres(#    SELECT uservalue FROM user_reservation_table
postgres(#    WHERE UsedYesNo=false Order By id ASC Limit 1)
postgres-# RETURNING uservalue;
 uservalue
-----------
         1
(1 wiersz)


UPDATE 1
postgres=#

Session 2 - almost at the same time, only 10 ms later

postgres=# \set user 2
postgres=# begin;
BEGIN
postgres=# UPDATE user_reservation_table
postgres-# SET UsedYesNo = true, userid=:user
postgres-# WHERE uservalue IN(
postgres(#    SELECT uservalue FROM user_reservation_table
postgres(#    WHERE UsedYesNo=false Order By id ASC Limit 1)
postgres-# RETURNING uservalue;

Session 2 hangs ....... and is waiting for something ....

back in Session 1

postgres=# commit;
COMMIT
postgres=#

and again Session 2

 uservalue
-----------
         1
(1 wiersz)


UPDATE 1
postgres=# commit;
COMMIT
postgres=#

Session 2 is not waiting anymore, and finishes its transaction.

And what is the final outcome?

postgres=# select * from user_reservation_table order by id;
 id | usedyesno | userid | uservalue
----+-----------+--------+-----------
  1 | t         |      2 |         1
  2 | f         |      0 |         2
  3 | f         |      0 |         3
  4 | f         |      0 |         4
  5 | f         |      0 |         5
(5 wierszy)

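Both users took the same value 1, but only user 2 is registered in the table. The subquery in session 2 had already picked uservalue 1 before session 1 committed, and because the outer condition on uservalue still matched the row after that commit, session 2 simply overwrote session 1's reservation.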

====================== EDIT ==================================

In this scenario we can use SELECT ... FOR UPDATE and take advantage of the way PostgreSQL re-evaluates the query at the Read Committed isolation level;
see the documentation: http://www.postgresql.org/docs/9.2/static/transaction-iso.html

UPDATE, DELETE, SELECT FOR UPDATE, and SELECT FOR SHARE commands behave the same as SELECT in terms of searching for target rows: they will only find target rows that were committed as of the command start time. However, such a target row might have already been updated (or deleted or locked) by another concurrent transaction by the time it is found. In this case, the would-be updater will wait for the first updating transaction to commit or roll back (if it is still in progress). If the first updater rolls back, then its effects are negated and the second updater can proceed with updating the originally found row. If the first updater commits, the second updater will ignore the row if the first updater deleted it, otherwise it will attempt to apply its operation to the updated version of the row. The search condition of the command (the WHERE clause) is re-evaluated to see if the updated version of the row still matches the search condition. If so, the second updater proceeds with its operation using the updated version of the row. In the case of SELECT FOR UPDATE and SELECT FOR SHARE, this means it is the updated version of the row that is locked and returned to the client.

In short:
If one session has locked a row and another session tries to lock the same row, the second session will "hang" and wait for the first session to commit or roll back. When the first session commits, the second session re-evaluates the WHERE search condition. If the condition no longer matches (because the first transaction changed some columns), the second session skips that row and processes the next row that matches the WHERE conditions.

Note: this behaviour is different at the Repeatable Read isolation level. In that case the second session will throw the error "could not serialize access due to concurrent update", and you must retry the whole transaction.
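
To make the Repeatable Read case concrete, here is a rough sketch of what a second session would see at that isolation level. It assumes the user_reservation_table from the example above and that session 1 has already updated the first free row but not yet committed; the userid value 2 is only illustrative.

```sql
-- Session 2, running while session 1 still holds the row
BEGIN ISOLATION LEVEL REPEATABLE READ;

UPDATE user_reservation_table
SET usedyesno = true, userid = 2
WHERE id = (
   SELECT id FROM user_reservation_table
   WHERE usedyesno = false
   ORDER BY id
   LIMIT 1
);
-- Blocks until session 1 finishes. If session 1 commits, the statement fails with:
-- ERROR:  could not serialize access due to concurrent update

ROLLBACK;  -- the transaction is aborted; the client has to start again from BEGIN
```

The retry itself has to be done by the client: issue a new BEGIN and run the statement again.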

Our query may look like:

select id from user_reservation_table
where usedyesno = false
order by id
limit 1
for update ;

and then:

  Update .... where id = (id returned by SELECT ... FOR UPDATE)

Personally, I prefer to test locking scenarios using plain old console clients (psql for PostgreSQL, mysql for MySQL, or SQL*Plus for Oracle).

So let's test our query in psql:

session1 #select * from user_reservation_table order by id;
 id | usedyesno | userid | uservalue
----+-----------+--------+-----------
  1 | t         |      2 |         1
  2 | f         |      0 |         2
  3 | f         |      0 |         3
  4 | f         |      0 |         4
  5 | f         |      0 |         5
(5 wierszy)


session1 #begin;
BEGIN
session1 #select id from user_reservation_table
postgres-# where usedyesno = false
postgres-# order by id
postgres-# limit 1
postgres-# for update ;
 id
----
  2
(1 wiersz)


session1 #update user_reservation_table set usedyesno = true
postgres-# where id = 2;
UPDATE 1
session1 #

Session 1 has locked and updated the row with id = 2.

And now session2

session2 #begin;
BEGIN
session2 #select id from user_reservation_table
postgres-# where usedyesno = false
postgres-# order by id
postgres-# limit 1
postgres-# for update ;

Session 2 hangs while trying to lock the row with id = 2.

OK, let's commit session 1

session1 #commit;
COMMIT
session1 #

and look what happens in session 2:

postgres-# for update ;
 id
----
  3
(1 wiersz)

Bingo - session 2 skipped the row id = 2 and selected (and locked) the row id = 3
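
While session 2 is hanging, a third psql session can show which locks are held and which are still being waited for. The query below is a minimal sketch using the pg_locks and pg_stat_activity system views; the column names pid and query assume PostgreSQL 9.2 or later (older releases call them procpid and current_query).

```sql
-- Run in a third session while session 2 is blocked
SELECT l.pid,
       l.locktype,
       l.mode,
       l.granted,                        -- false means the session is still waiting for this lock
       l.relation::regclass AS relation, -- table name for relation-level locks, NULL otherwise
       a.query
FROM pg_locks AS l
JOIN pg_stat_activity AS a ON a.pid = l.pid
WHERE l.pid <> pg_backend_pid()          -- hide the locks taken by this monitoring query itself
ORDER BY l.granted, l.pid;
```

A blocked session typically shows up with granted = f, usually waiting on a transactionid lock that belongs to the session which modified the row.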

So our final query might be:

update user_reservation_table
set usedyesno = true
where id = (
   select id from user_reservation_table
   where usedyesno = false
   order by id
   limit 1
   for update
) RETURNING uservalue;

One reservation: this example is for test purposes only; it is meant to help you understand how locking works in PostgreSQL.
In fact, this query serializes access to the table, is not scalable, and will probably perform badly (slowly) in a multiuser environment.
Imagine 10 sessions simultaneously trying to get the next row from this table: each session will hang and wait until the previous session commits.
So don't use this query in production code.
Do you really want to "find and reserve the next value from the table"? Why?
If so, you need some serialization device (like this query or, perhaps easier to implement, locking the whole table), but it will be a bottleneck.
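
If the requirement is only "give each session some free row" rather than strictly the next one in order, newer servers offer a way around the waiting. The sketch below assumes PostgreSQL 9.5 or later, where FOR UPDATE accepts SKIP LOCKED; that is newer than the 9.2 documentation linked above. :user is the psql variable set earlier in this answer.

```sql
UPDATE user_reservation_table
SET usedyesno = true, userid = :user
WHERE id = (
   SELECT id FROM user_reservation_table
   WHERE usedyesno = false
   ORDER BY id
   LIMIT 1
   FOR UPDATE SKIP LOCKED   -- rows locked by other sessions are skipped instead of waited for
)
RETURNING uservalue;
```

With this form a concurrent session does not queue behind the lock; it takes the next unlocked row, so values may be handed out slightly out of id order. If no free row is available, the statement updates zero rows.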

Source: https://stackoverflow.com/questions/18002211/postgresql-script-that-uses-transaction-blocks-fails-to-create-all-records

Tags

postgresql

lua

locking

database-deadlocks