How to improve performance for massive MERGE insert?


Question


I'm trying to insert data from my SQL database into Neo4j. I have a CSV file where every row generates 4-5 entities and some relationships between them. Entities may be duplicated across rows, and I want to enforce uniqueness.

What I currently do is (a Cypher sketch follows the list):

  • create constraints for each label to force uniqueness.
  • iterate the CSV:
    • start transaction
    • create merge statements for the entities
    • create merge statements for the relations
    • commit transaction
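
To make that concrete, here is roughly what the statements look like. The labels (Person, Company), the id properties, and the WORKS_AT relationship type are placeholders for illustration, not my actual schema:

```cypher
// Placeholder schema for illustration -- my real labels/properties differ.
// Constraint syntax is the Neo4j 2.x form current at the time;
// newer versions use CREATE CONSTRAINT ... FOR ... REQUIRE.
CREATE CONSTRAINT ON (p:Person) ASSERT p.personId IS UNIQUE;
CREATE CONSTRAINT ON (c:Company) ASSERT c.companyId IS UNIQUE;

// Per CSV row, inside the open transaction
// ($-parameters; older versions use the {param} syntax)
MERGE (p:Person  {personId:  $personId})
MERGE (c:Company {companyId: $companyId})
MERGE (p)-[:WORKS_AT]->(c);
```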

The results were poor. I then tried committing the transaction every X rows (X was 100, 500, 1000, and 5000). That is better, but I still have two problems (one way to batch is sketched after the list):

  • It's slow: on average around 1-1.5 seconds per 100 rows (each row produces 4-5 entities and 4-5 relationships).
  • It gets worse as I keep adding data. I usually start at 400-500 ms per 100 rows, and after ~5000 rows I'm at ~4-5 seconds per 100 rows.
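
For reference, by "committing every X rows" I mean batching roughly like this: the client collects X rows into a list parameter and sends them in a single transaction (UNWIND needs Neo4j 2.1+; placeholder names as before):

```cypher
// One transaction per batch: $rows is a list of up to X maps,
// one per CSV row, built on the client side
UNWIND $rows AS row
MERGE (p:Person  {personId:  row.personId})
MERGE (c:Company {companyId: row.companyId})
MERGE (p)-[:WORKS_AT]->(c);
```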

From what I know, the uniqueness constraint also creates an index on that property, and that is the property I match on when I create the new node with MERGE. Is there any chance it isn't using the index?
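
Is there a reliable way to check? My understanding is that prefixing the statement with PROFILE should reveal whether the MERGE does an index seek or falls back to a label scan:

```cypher
// If the constraint's index is being used, the plan should contain a
// unique index seek on :Person(personId) rather than a NodeByLabelScan.
// (PROFILE actually executes the statement; EXPLAIN, where available,
// only shows the plan.)
PROFILE MERGE (p:Person {personId: $personId});
```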

What's the best practice for improving performance here? I saw BatchInserter, but I'm not sure whether it can be used with MERGE operations.

Thanks

Source: https://stackoverflow.com/questions/22129783/how-to-improve-performance-for-massive-merge-insert
