Neo4j - How do you do parallel insertion into a linked list via Cypher over REST?

你离开我真会死。 提交于 2019-12-08 02:51:33

问题


I want to create a linked list using Cypher over REST.

If I create the head of the list using the following query:

MERGE (headNode:HEAD {list:"mylist"})
WITH headNode
MERGE headNode-[:LINK]->(headNode)
RETURN headNode

And then do the insert using this query:

MERGE (headNode:HEAD {list:"mylist"})-[old:LINK]->after
DELETE old
CREATE headNode-[:LINK]->(newNode:LINKNODE { number : {nodeNumber} })-[:LINK]->after

Then everything is fine as long as I don't run multiple insert queries in parallel. But when I do I start to get consequences that are bad. I either get (depending on the timing) an error like:

Error: Relationship 391112 not found

Or I get multiple linked lists snaking out of the head node. I have set up a test node.js project that replicates the problem here.

How does one create a linked list in Neo4j that can handle parallel insertion?


回答1:


I recommend not trying to second guessing how the locking works. Let's instead build something with the better-documented tools and features at our disposal.

Say we have a unique constraint like this:

create constraint on (n:LINK) assert n.list_head_id is unique
;

Then we can concurrently insert at the head of a list (potentially creating it in the process) like this:

merge (current_head:LINK {list_head_id: {list_id}})
on create set current_head.is_sentinel = true
remove current_head.list_head_id
create (new_head:LINK {list_head_id: {list_id}})-[:NEXT]->current_head
;

This kind of list always ends with a sentinel link, so you need to ignore that when you query it.

Now, the above can produce deadlock exceptions that will fail the transaction. Potentially a lot of them. So you need to make sure to retry those transactions that fail with deadlocks.

I wrote my own test in Java to demonstrate this approach. Note that in this test, I introduce a small random delay between transactions to reduce the probability of deadlocks. This is because the test is using an embedded database, so there’s no network or anything to otherwise limit the amount and speed of the concurrency I can throw at it: https://gist.github.com/chrisvest/9033600




回答2:


In order to do parallel insert you need to take a lock on the node(s) in question:

MATCH (headNode:HEAD {list:"mylist"})-[old:LINK]->after
REMOVE headNode._lock_
REMOVE after._lock_
DELETE old
CREATE headNode-[:LINK]->(newNode:LINKNODE { number : {nodeNumber} })-[:LINK]->after
RETURN headNode, newNode

the lines

REMOVE headNode._lock_
REMOVE after._lock_

remove a non-existent property, when a property is removed a lock is taken. Now if the query can not insert due to the lock you will get a Neo.DatabaseError.Statement.ExecutionFailure which means the insert failed and you can re-run the query.



来源:https://stackoverflow.com/questions/21692829/neo4j-how-do-you-do-parallel-insertion-into-a-linked-list-via-cypher-over-rest

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!