问题
Background: I am migrating from postgreSQL to Vertica and found, that there are some issues in IDENTITY
or AUTO_INCREMENT
columns. One of these issues is, that vertica cannot assign values to IDENTITY
columns or alter a column, that already has data into an IDENTITY
column. Therefore I created a sequence and set the default value of the column to be unique doing:
SELECT MAX(id_column) FROM MY_SCHEMA.my_table;
which is 12345
CREATE SEQUENCE MY_SCHEMA.seq_id_column MINVALUE 12346 CACHE 1;
ALTER TABLE MY_SCHEMA.my_table
ALTER COLUMN id_column SET DEFAULT(MY_SCHEMA.seq_id_column.nextval);
ALTER TABLE MY_SCHEMA.log ADD UNIQUE(id_column);
Which works as expected. In this case, I have the cache deactivated, as I am on a single node installation and I want my ID column to be contiguous. However, this is not an option on a cluster installation as the needed lock leads to a bottleneck.
Question: In a vertica cluster with several nodes, how can I access the ID of the last insert in a session (without an additional select)?
E.g. in postgreSQL I could do something like
INSERT INTO MY_SCHEMA.my_table RETURNING id_column;
which does not work in Vertica. Furthermore, the LAST_INSERT_ID()
function of Vertica does not work for named sequences. I also feel, that querying the current_value
of MY_SCHEMA.seq_id_column
could be giving wrong results due to caching, but I am unsure about this.
Why no additional SELECT?
To my knowledge, the select will only give correct values after a commit. I cannot do a commit after every single insert due to performance.
回答1:
The comments from LukStorms pointed me in the right direction.
The NEXTVAL()
function (as far as I have tested) gives contiguous values in the case, where one single session queries them. Furthermore, on concurrent access, if issued after an insert, CURRVAL
retrieves the cached value, which is guaranteed to be unique but not necessarily contiguous. As I never call NEXTVAL
anywhere else as in my default clause, this solves the problem for me, although there might be cases, where an additional call to NEXTVAL
between inserts increments the sequence counter.
One case I can think of (and that I will test in the future) is what happens if AUTO COMMIT
is set to OFF
, which is ON
by default for the vertica client drivers.
UPDATE:
This even seems to work with AUTOCOMMIT
being OFF
(shown using the vertica-python
client driver, where C
is the connection and cur
the cursor):
cur.execute("SELECT NEXTVAL('my_schema.my_sequence');")
cur.fetchall()
--> 1
cur.execute("SELECT CURRVAL('my_schema.my_sequence');")
cur.fetchall()
--> 1
cur.execute("SET SESSION AUTOCOMMIT TO OFF")
cur.execute("SELECT NEXTVAL('my_schema.my_sequence');")
cur.execute("SELECT NEXTVAL('my_schema.my_sequence');")
cur.execute("SELECT NEXTVAL('my_schema.my_sequence');")
cur.execute("SELECT CURRVAL('my_schema.my_sequence');")
cur.fetchall()
--> 4
However, this seems to be unchanged during a rollback of the connection. So the following happens:
C.rollback()
cur.execute("SELECT CURRVAL('my_schema.my_sequence');")
cur.fetchall()
--> 4
来源:https://stackoverflow.com/questions/51597088/how-can-i-get-the-last-issued-sequence-id-in-vertica