PostgreSQL and ActiveRecord subselect for race condition

Submitted by 白昼怎懂夜的黑 on 2020-01-05 01:10:34

Question


I'm experiencing a race condition in ActiveRecord with PostgreSQL where I'm reading a value then incrementing it and inserting a new record:

num = Foo.where(bar_id: 42).maximum(:number)
Foo.create!({
  bar_id: 42,
  number: num + 1
}) 

At scale, multiple threads will read and then write the same value of number simultaneously. Wrapping this in a transaction doesn't fix the race condition, because the SELECT doesn't lock the table. I can't use an auto-increment, because number isn't unique on its own; it's only unique for a given bar_id. I see three possible fixes:

  • Explicitly take a Postgres lock (a row-level lock?)
  • Use a unique constraint and retry on failures (yuck!)
  • Override save to use a subselect, i.e.:

    INSERT INTO foo (bar_id, number) VALUES (42, (SELECT MAX(number) + 1 FROM foo WHERE bar_id = 42));

All these solutions seem like I'd be reimplementing large parts of ActiveRecord::Base#save! Is there an easier way?

UPDATE: I thought I found the answer with Foo.lock(true).where(bar_id: 42).maximum(:number), but that generates SELECT ... FOR UPDATE, which isn't allowed on aggregate queries

UPDATE 2: I've just been informed by our DBA that even if we could do INSERT INTO foo (bar_id, number) VALUES (42, (SELECT MAX(number) + 1 FROM foo WHERE bar_id = 42)), it wouldn't fix anything, since the SELECT runs under a different lock than the INSERT


Answer 1:


Your options are:

  • Run in SERIALIZABLE isolation. Interdependent transactions will be aborted at commit time with a serialization failure. You'll get lots of error log spam and you'll be doing lots of retries, but it'll work reliably (see the first sketch after this list).

  • Define a UNIQUE constraint and retry on failure, as you noted. Same issues as above.

  • If there is a parent object, you can SELECT ... FOR UPDATE the parent object before doing your max query. In this case you'd SELECT 1 FROM bar WHERE bar_id = $1 FOR UPDATE. You are using bar as a lock for all foos with that bar_id. You can then know that it's safe to proceed, so long as every query that's doing your counter increment does this reliably. This can work quite well.

    This still does an aggregate query for each call, which (per the next option) is unnecessary, but at least it doesn't spam the error log like the options above (see the second sketch after this list).

  • Use a counter table. This is what I'd do. Either in bar, or in a side-table like bar_foo_counter, acquire the next value using

    UPDATE bar_foo_counter SET counter = counter + 1
    WHERE bar_id = $1 RETURNING counter
    

    or the less efficient option if your framework can't handle RETURNING:

    SELECT counter FROM bar_foo_counter
    WHERE bar_id = $1 FOR UPDATE;
    
    -- increment the selected value in the application, then write it back:
    UPDATE bar_foo_counter SET counter = $2
    WHERE bar_id = $1;
    

    Then, in the same transaction, use the returned counter value as the number. When you commit, the counter table row for that bar_id is unlocked for the next query to use. If you roll back, the change is discarded.
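
For the SERIALIZABLE option, a retry loop at the application level could look roughly like this. It's a minimal sketch, not from the original answer: it assumes ActiveRecord 4+, where transaction accepts an isolation: option and a serialization failure surfaces as ActiveRecord::SerializationFailure; the helper name and retry limit are made up for illustration.

MAX_RETRIES = 5  # arbitrary limit, tune for your workload

def create_next_foo(bar_id)
  attempts = 0
  begin
    Foo.transaction(isolation: :serializable) do
      num = Foo.where(bar_id: bar_id).maximum(:number) || 0
      Foo.create!(bar_id: bar_id, number: num + 1)
    end
  rescue ActiveRecord::SerializationFailure
    # another transaction won the race; start over and recompute MAX(number)
    attempts += 1
    retry if attempts < MAX_RETRIES
    raise
  end
end

The same wrapper works for the unique-constraint option: add a unique index on (bar_id, number) and rescue ActiveRecord::RecordNotUnique instead.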
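
The parent-lock option maps directly onto ActiveRecord's pessimistic locking. A sketch, assuming a Bar model exists for the table behind bar_id (the question only shows Foo):

def create_next_foo(bar_id)
  Foo.transaction do
    # emits SELECT ... FOR UPDATE on the parent row, so concurrent callers
    # for the same bar_id queue up here instead of racing on MAX(number)
    Bar.lock.find(bar_id)

    num = Foo.where(bar_id: bar_id).maximum(:number) || 0
    Foo.create!(bar_id: bar_id, number: num + 1)
  end
end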

I recommend the counter approach, using a dedicated side table for the counter instead of adding a column to bar. That's cleaner to model, and means you create less update bloat in bar, which can slow down queries to bar.
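
From ActiveRecord, the UPDATE ... RETURNING above has to go through the raw connection, since the query interface doesn't expose RETURNING. A rough sketch, reusing the answer's bar_foo_counter table and assuming it's pre-populated with one row per bar_id (the helper name is illustrative):

def create_next_foo(bar_id)
  Foo.transaction do
    # the UPDATE row-locks the counter row until commit, so concurrent
    # callers for the same bar_id are serialized here
    next_number = Foo.connection.select_value(
      "UPDATE bar_foo_counter SET counter = counter + 1 " \
      "WHERE bar_id = #{Foo.connection.quote(bar_id)} RETURNING counter"
    ).to_i

    Foo.create!(bar_id: bar_id, number: next_number)
  end
end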



Source: https://stackoverflow.com/questions/32382512/postgresql-and-activerecord-subselect-for-race-condition
