Question
I am building a graph database with Python and the neo4j library. The graph has 50k nodes and 100k relationships.
How I create nodes:
CREATE (user:user {task_id: %s, id: %s, root: 1, private: 0})
How I create relationships:
MATCH (root_user), (friend_user) WHERE root_user.id = %s
AND root_user.task_id = %s
AND friend_user.id = %s
AND friend_user.task_id = %s
CREATE (root_user)-[r: FRIEND_OF]->(friend_user) RETURN root_user, friend_user
How I search for all paths between two nodes:
MATCH (start_user:user {id: %s, task_id: %s}),
(end_user:user {id: %s, task_id: %s}),
path = allShortestPaths((start_user)-[*..3]-(end_user)) RETURN path
This is very slow: around 30-60 minutes on the 50k-node graph, and I can't understand why. I tried to create an index like this:
CREATE INDEX ON :user(id, task_id)
but it did not help. Can you help me? Thanks.
Answer 1:
You should never generate a long Cypher query that contains N slight variations of essentially the same Cypher code. That is very slow and takes up a lot of memory.
Instead, you should be passing parameters to a much simpler Cypher query.
For example, when creating your nodes, you could pass a data parameter to the following Cypher code:
UNWIND $data AS d
CREATE (user:user {task_id: d.taskId, id: d.id, root: 1, private: 0})
The data parameter value that you pass would be a list of maps, and each map would contain a taskId and an id. The UNWIND clause "unwinds" the data list into individual d maps. This would be much faster.
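From the Python side, passing that parameter could look roughly like the sketch below. This is only an illustration under assumptions: session is assumed to be an open session from the official neo4j Python driver, and the helper names (build_user_rows, chunked, create_users) and the batch size of 1000 are made up for this example, not taken from the question.

```python
# Sketch only. `session` is assumed to be an open neo4j driver session, e.g.
# GraphDatabase.driver(uri, auth=(user, password)).session().
# All helper names and the batch size below are hypothetical.

CREATE_USERS = """
UNWIND $data AS d
CREATE (user:user {task_id: d.taskId, id: d.id, root: 1, private: 0})
"""


def build_user_rows(task_id, user_ids):
    """Build the list of maps that the $data parameter expects."""
    return [{"taskId": task_id, "id": uid} for uid in user_ids]


def chunked(rows, size):
    """Yield successive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def create_users(session, rows, batch_size=1000):
    """Send one parameterized query per batch instead of one query per node."""
    for batch in chunked(rows, batch_size):
        # Keyword arguments to session.run() are passed as query parameters,
        # so `data=batch` becomes $data in the Cypher text.
        session.run(CREATE_USERS, data=batch)
```

The query text stays constant, so only the parameter values change between calls. Splitting the rows into batches keeps each transaction at a moderate size; whether that matters for 50k nodes depends on your setup.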
Something similar needs to be done with your relationship-creation code.
In addition, in order to use any of your :user indexes, your MATCH clause MUST specify the :user label in the relevant node patterns. Otherwise, you are asking Cypher to scan all nodes, regardless of label, and that kind of processing would not be able to take advantage of indexes. For example, the relevant query should start with:
MATCH (root_user:user), (friend_user:user)
...
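Putting both points together (labels in the MATCH plus an UNWIND over a parameter), the relationship-creation step might be sketched like this. Again, this is an assumption-laden example: session is assumed to be an open neo4j driver session, the helper names and the map keys rootId, friendId and taskId are invented here, and it assumes both users of a pair share the same task_id.

```python
# Sketch only. `session` is assumed to be an open neo4j driver session.
# The helper names and map keys are hypothetical, and both users of a pair
# are assumed to belong to the same task.

CREATE_FRIENDSHIPS = """
UNWIND $data AS d
MATCH (root_user:user {id: d.rootId, task_id: d.taskId})
MATCH (friend_user:user {id: d.friendId, task_id: d.taskId})
CREATE (root_user)-[:FRIEND_OF]->(friend_user)
"""


def build_friend_rows(task_id, root_id, friend_ids):
    """Build one map per (root user, friend) pair for the $data parameter."""
    return [
        {"taskId": task_id, "rootId": root_id, "friendId": fid}
        for fid in friend_ids
    ]


def create_friendships(session, rows):
    """Create all FRIEND_OF relationships in rows with a single query."""
    # Both MATCH patterns carry the :user label, so the planner is free to
    # use an index on :user(id, task_id) rather than scanning every node.
    session.run(CREATE_FRIENDSHIPS, data=rows)
```

Prefixing the query with PROFILE in the Neo4j browser should show whether the lookups are index seeks or label scans.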
Source: https://stackoverflow.com/questions/55493547/why-allshortestpath-so-slow