Question
I'm experiencing a race condition in ActiveRecord with PostgreSQL where I'm reading a value then incrementing it and inserting a new record:
num = Foo.where(bar_id: 42).maximum(:number)
Foo.create!({
  bar_id: 42,
  number: num + 1
})
At scale, multiple threads will simultaneously read and then write the same value of number. Wrapping this in a transaction doesn't fix the race condition because the SELECT doesn't lock the table. I can't use an auto-increment because number is not unique on its own; it's only unique for a given bar_id. I see 3 possible fixes:
- Explicitly use a postgres lock (a row-level lock?)
- Use a unique constraint and retry on fails (yuck!)
- Override save to use a subselect, i.e.
INSERT INTO foo (bar_id, number) VALUES (42, (SELECT MAX(number) + 1 FROM foo WHERE bar_id = 42));
All these solutions seem like I'd be reimplementing large parts of ActiveRecord::Base#save! Is there an easier way?
UPDATE:
I thought I found the answer with Foo.lock(true).where(bar_id: 42).maximum(:number), but that uses SELECT FOR UPDATE, which isn't allowed on aggregate queries.
UPDATE 2:
I've just been informed by our DBA that even if we could do INSERT INTO foo (bar_id, number) VALUES (42, (SELECT MAX(number) + 1 FROM foo WHERE bar_id = 42)); it wouldn't fix anything, since the SELECT runs in a different lock than the INSERT.
Answer 1:
Your options are:
- Run in SERIALIZABLE isolation. Interdependent transactions will be aborted on commit as having a serialization failure. You'll get lots of error log spam, and you'll be doing lots of retries, but it'll work reliably.
- Define a UNIQUE constraint and retry on failure, as you noted. Same issues as above.
- If there is a parent object, you can SELECT ... FOR UPDATE the parent object before doing your max query. In this case you'd SELECT 1 FROM bar WHERE bar_id = $1 FOR UPDATE. You are using bar as a lock for all foos with that bar_id. You can then know that it's safe to proceed, so long as every query that's doing your counter increment does this reliably. This can work quite well.
  This still does an aggregate query for each call, which (per the next option) is unnecessary, but at least it doesn't spam the error log like the above options.
- Use a counter table. This is what I'd do. Either in bar, or in a side table like bar_foo_counter, acquire the next value using
  UPDATE bar_foo_counter SET counter = counter + 1 WHERE bar_id = $1 RETURNING counter
  or, if your framework can't handle RETURNING, the less efficient
  SELECT counter FROM bar_foo_counter WHERE bar_id = $1 FOR UPDATE; UPDATE bar_foo_counter SET counter = $2 WHERE bar_id = $1;
  Then, in the same transaction, use the generated counter value for number. When you commit, the counter table row for that bar_id gets unlocked for the next query to use. If you roll back, the change is discarded.
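The parent-row lock option could look roughly like this in ActiveRecord. This is only a sketch: it assumes a Bar model exists as the parent of Foo (the question does not show one), the method name create_foo_with_parent_lock is hypothetical, and the models are passed as arguments that default to Bar and Foo.

```ruby
# Sketch only. Bar is an assumed parent model of Foo; neither the method name
# nor the keyword arguments come from the original question.
def create_foo_with_parent_lock(bar_id, bar_model: Bar, foo_model: Foo)
  bar_model.transaction do
    # Issues SELECT ... FOR UPDATE on the parent row. Other transactions that
    # take the same lock for this bar wait here until this one finishes.
    bar_model.lock.find(bar_id)

    # The aggregate itself is not locked; it is safe only because every
    # writer for this bar_id is serialized by the lock above.
    current_max = foo_model.where(bar_id: bar_id).maximum(:number) || 0

    foo_model.create!(bar_id: bar_id, number: current_max + 1)
  end
end
```

Every code path that assigns number has to go through this method (or take the same lock first), otherwise the guarantee is lost.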
I recommend the counter approach, using a dedicated side table for the counter instead of adding a column to bar. That's cleaner to model, and means you create less update bloat in bar, which can slow down queries to bar.
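A sketch of the counter-table option, using the two-statement SELECT ... FOR UPDATE form because it maps directly onto ActiveRecord's lock API. BarFooCounter is a hypothetical model over the bar_foo_counter side table with columns bar_id and counter; the sketch assumes a counter row already exists for each bar_id and that both models use the same database connection, so one transaction covers both writes.

```ruby
# Sketch only. BarFooCounter is a hypothetical model for the side table
# bar_foo_counter(bar_id, counter); the method name is also an assumption.
def create_foo_with_counter(bar_id, counter_model: BarFooCounter, foo_model: Foo)
  counter_model.transaction do
    # SELECT ... FOR UPDATE on the counter row. Concurrent callers for the
    # same bar_id block here until this transaction commits or rolls back.
    # find_by! raises if no counter row exists for this bar_id.
    row = counter_model.lock.find_by!(bar_id: bar_id)

    next_number = row.counter + 1
    row.update!(counter: next_number)

    # Same transaction: if this insert fails, the counter bump rolls back too.
    foo_model.create!(bar_id: bar_id, number: next_number)
  end
end
```

Compared with locking the parent row, this avoids the MAX aggregate on every call and keeps the frequent updates out of bar.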
Source: https://stackoverflow.com/questions/32382512/postgresql-and-activerecord-subselect-for-race-condition