Cypher: Query that converts property from int to String is very slow and causes OutOfMemoryError in Neo4j server

怎甘沉沦 submitted on 2020-01-05 04:15:07

Question


I need to migrate a numerical property to type String. For that I wrote the following simple query:

MATCH (n:Entity) SET n.id=toString(n.id) RETURN n

It matches about 1.2 million entities (according to EXPLAIN), so I didn't expect it to be fast. However, it hadn't terminated after more than 5 hours. In the meantime, the Neo4j server (Community, 3.0.4) ran at close to 100% load.

I have this configured in the corresponding neo4j.conf:

dbms.memory.heap.initial_size=4g
dbms.memory.heap.max_size=4g
dbms.jvm.additional=-XX:+UseG1GC

After only a few minutes of runtime, I could see reports about garbage collection in the logs:

[o.n.k.i.c.MonitorGc] GC Monitor: Application threads blocked for 277ms.

Later it got worse:

[o.n.k.i.c.MonitorGc] GC Monitor: Application threads blocked for 53899ms.

Eventually the following appeared:

 [o.n.b.v.r.i.c.SessionWorker] Worker for session '10774fef-eed2-4593-9a20-732d9103e576' crashed: Java heap space Java heap space
java.lang.OutOfMemoryError: Java heap space
[o.n.b.v.r.i.c.SessionWorker] Fatal, worker for session '10774fef-eed2-4593-9a20-732d9103e576' crashed. Please contact your support representative if you are unable to resolve this. Java heap space java.lang.OutOfMemoryError: Java heap space

From my previous experience, the total available heap should suffice, as I have run far "heavier" queries before without problems. I therefore assume that the query itself is the cause of the poor performance, but I don't see how to improve it. The migration doesn't have to happen in one query or transaction. To my knowledge, though, it is not possible to "batch" it. Any ideas?


Answer 1:


For one, I don't think you need to return the entire set of 1.2 million nodes; you can leave off the RETURN.

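As a minimal sketch of that suggestion, this is the original query with the RETURN clause dropped. It still runs as a single transaction, so it avoids sending 1.2 million nodes back to the client but probably does not remove the heap pressure of holding every property change in one transaction:

```cypher
// Same update as in the question, without returning the nodes.
// Still one transaction over all :Entity nodes.
MATCH (n:Entity)
SET n.id = toString(n.id)
```
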
And yes, you can batch these using APOC Procedures. In particular you'll want to look at apoc.periodic.iterate() and apoc.periodic.commit().

Here's how you might batch this with apoc.periodic.iterate():

CALL apoc.periodic.iterate(
"MATCH (n:Entity) RETURN n",
"WITH {n} as n SET n.id=toString(n.id)", {batchSize:10000, parallel:true})

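The answer mentions apoc.periodic.commit() without showing it. A rough sketch of that variant follows. apoc.periodic.commit() re-runs the statement in separate transactions until it returns 0, so the statement needs a LIMIT and a predicate that stops matching nodes once they are converted. The predicate below is an assumption, not part of the original answer: it relies on Cypher treating an integer and a string as unequal and on toString() accepting a value that is already a string, which I have not verified on 3.0.4. The {limit} parameter uses the same old-style syntax as the {n} in the query above.

```cypher
// Sketch only: converts up to {limit} nodes per transaction and repeats
// until the statement reports 0 converted nodes.
CALL apoc.periodic.commit(
"MATCH (n:Entity)
 WHERE n.id <> toString(n.id)   // assumed filter: true only while n.id is still numeric
 WITH n LIMIT {limit}
 SET n.id = toString(n.id)
 RETURN count(*)",
{limit:10000})
```

Without an index that helps the filter, each round scans the :Entity nodes again from the start, so apoc.periodic.iterate() is likely the better fit when the full set only needs to be visited once.
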

Source: https://stackoverflow.com/questions/41728171/cypher-query-that-converts-property-from-int-to-string-is-very-slow-and-causes
