Optimizing high volume batch inserts into Neo4j using REST

Submitted 2019-11-30 05:28:17

With Neo4j 2.0 I would use the transactional endpoint, submitting your statements in batches, e.g. 100 or 1000 per HTTP request and about 30k-50k statements per transaction (i.e. until you commit).

See this for the format of the new streaming, transactional endpoint:

http://docs.neo4j.org/chunked/milestone/rest-api-transactional.html
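The batching above can be sketched as plain Java. This is a minimal illustration, not a Neo4j client API: the class and method names (`BatchPayload`, `chunk`, `toJson`) are hypothetical helpers that only split a list of Cypher statements into request-sized chunks and render each chunk as the `{"statements":[...]}` JSON body the transactional endpoint expects.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: group Cypher statements into request-sized batches for the
// transactional endpoint. Helper names are hypothetical, not part of
// any Neo4j library; only JDK classes are used.
public class BatchPayload {

    // Split the statements into chunks of at most batchSize each
    // (e.g. 100 or 1000 statements per HTTP request).
    public static List<List<String>> chunk(List<String> statements, int batchSize) {
        List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < statements.size(); i += batchSize) {
            batches.add(statements.subList(i, Math.min(i + batchSize, statements.size())));
        }
        return batches;
    }

    // Render one chunk as the JSON body the transactional endpoint
    // expects: {"statements":[{"statement":"..."}, ...]}
    public static String toJson(List<String> batch) {
        StringBuilder sb = new StringBuilder("{\"statements\":[");
        for (int i = 0; i < batch.size(); i++) {
            if (i > 0) sb.append(',');
            sb.append("{\"statement\":\"")
              .append(batch.get(i).replace("\"", "\\\""))
              .append("\"}");
        }
        return sb.append("]}").toString();
    }
}
```

Each rendered body would then be POSTed to the open transaction's URL, with a final POST to its `/commit` URL once the 30k-50k statement budget is reached.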

Also, for such a high-performance, continuous insertion workload I heartily recommend writing a server extension. It runs against the embedded API and can easily insert 10k or more nodes and relationships per second; see the documentation here:

http://docs.neo4j.org/chunked/milestone/server-unmanaged-extensions.html

For pure inserts you don't need Cypher. And for concurrency, just take a lock on a well-known node (one per subgraph that you are inserting into) so that concurrent inserts are no issue. You can do that with tx.acquireWriteLock() on that node, or by removing a non-existent property from it (REMOVE n.__lock__).
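To illustrate the "one lock per subgraph" idea outside of Neo4j, here is the same pattern expressed with plain JDK locks. This is an analogy only: inside a server extension you would call tx.acquireWriteLock(anchorNode) and let the transaction release the lock on commit, rather than managing ReentrantLocks yourself.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Illustration of per-subgraph locking with plain JDK classes. In a
// real Neo4j extension the anchor node's write lock plays this role:
// inserts into the SAME subgraph serialize, inserts into DIFFERENT
// subgraphs run in parallel.
public class SubgraphLocks {
    private final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    // Run an insert while holding the lock for its subgraph.
    public void withSubgraphLock(String subgraphId, Runnable insert) {
        ReentrantLock lock = locks.computeIfAbsent(subgraphId, k -> new ReentrantLock());
        lock.lock();
        try {
            insert.run();
        } finally {
            lock.unlock();
        }
    }
}
```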

For another example of writing an unmanaged extension (but one that uses Cypher), check out this project. It even has a mode that might help you (POSTing CSV files to the server endpoint to be executed with one Cypher statement per row):

https://github.com/jexp/cypher-rs
