Why is allShortestPaths so slow?

Submitted by 南楼画角 on 2019-12-11 15:27:36

Question


I created a graph database with Python and the neo4j library. The graph has 50k nodes and 100k relationships.

How I create nodes:

CREATE (user:user {task_id: %s, id: %s, root: 1, private: 0})

How I create relationships:

MATCH (root_user), (friend_user)
WHERE root_user.id = %s
  AND root_user.task_id = %s
  AND friend_user.id = %s
  AND friend_user.task_id = %s
CREATE (root_user)-[r:FRIEND_OF]->(friend_user)
RETURN root_user, friend_user

How I search for all paths between two nodes:

MATCH (start_user:user {id: %s, task_id: %s}), 
      (end_user:user {id: %s, task_id: %s}), 
      path = allShortestPaths((start_user)-[*..3]-(end_user)) RETURN path

This is very slow: around 30-60 minutes on the 50k-node graph, and I can't understand why. I tried creating an index like this:

CREATE INDEX ON :user(id, task_id)

but it did not help. Can you help me? Thanks.


Answer 1:


You should never generate a long Cypher query that contains N slight variations of essentially the same Cypher code. That is very slow and takes up a lot of memory.

Instead, you should be passing parameters to a much simpler Cypher query.

For example, when creating your nodes, you could pass a data parameter to the following Cypher code:

UNWIND $data AS d
CREATE (user:user {task_id: d.taskId, id: d.id, root: 1, private: 0})

The data parameter value that you pass would be a list of maps, and each map would contain a taskId and id. The UNWIND clause "unwinds" the data list into individual d maps. This would be much faster.
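As a rough sketch of how the data parameter might be built and passed from Python: the helper names `build_user_rows` and `create_users` are hypothetical, and `session` is assumed to be any object with a `run(query, **parameters)` method, such as a session from the official neo4j Python driver.

```python
# Cypher from the answer above: one short query, reused for the whole list.
CREATE_USERS = """
UNWIND $data AS d
CREATE (user:user {task_id: d.taskId, id: d.id, root: 1, private: 0})
"""


def build_user_rows(task_id, user_ids):
    """Build the list of maps passed as $data (hypothetical helper)."""
    return [{"taskId": task_id, "id": user_id} for user_id in user_ids]


def create_users(session, task_id, user_ids):
    """Create all user nodes with a single parameterized query.

    Assumption: `session` exposes run(query, **parameters), as sessions
    from the official neo4j Python driver do.
    """
    rows = build_user_rows(task_id, user_ids)
    return session.run(CREATE_USERS, data=rows)
```

The query text stays constant no matter how many users are created; only the parameter value changes.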

Something similar needs to be done with your relationship-creation code.
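One possible shape for the relationship-creation code, again only a sketch: the parameter name `pairs`, the map keys, and both helper functions are assumptions made for illustration. The node patterns include the :user label so that an index on :user can be used for the lookups.

```python
# One parameterized query replaces N generated MATCH ... CREATE statements.
CREATE_FRIENDSHIPS = """
UNWIND $pairs AS p
MATCH (root_user:user {id: p.rootId, task_id: p.rootTaskId})
MATCH (friend_user:user {id: p.friendId, task_id: p.friendTaskId})
CREATE (root_user)-[:FRIEND_OF]->(friend_user)
"""


def build_friend_pairs(edges):
    """Convert (root_id, root_task_id, friend_id, friend_task_id) tuples
    into the list of maps passed as $pairs (hypothetical helper)."""
    return [
        {
            "rootId": root_id,
            "rootTaskId": root_task_id,
            "friendId": friend_id,
            "friendTaskId": friend_task_id,
        }
        for root_id, root_task_id, friend_id, friend_task_id in edges
    ]


def create_friendships(session, edges):
    """Create all FRIEND_OF relationships in one call.

    Assumption: `session` exposes run(query, **parameters).
    """
    return session.run(CREATE_FRIENDSHIPS, pairs=build_friend_pairs(edges))
```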

In addition, in order to use any of your :user indexes, your MATCH clause MUST specify the :user label in the relevant node patterns. Otherwise, you are asking Cypher to scan all nodes, regardless of label, and that kind of processing would not be able to take advantage of indexes. For example, the relevant query should start with:

MATCH (root_user:user), (friend_user:user)
...
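The path search from the question already specifies the :user label, and it can take real query parameters in the same way rather than having values substituted into the text. The following is a sketch under assumptions: the parameter names and the `find_paths` helper are hypothetical, and `session.run` is assumed to return an iterable of records that support lookup by the key "path".

```python
# The question's path query, with $parameters in place of %s placeholders.
FIND_PATHS = """
MATCH (start_user:user {id: $start_id, task_id: $start_task_id}),
      (end_user:user {id: $end_id, task_id: $end_task_id}),
      path = allShortestPaths((start_user)-[*..3]-(end_user))
RETURN path
"""


def find_paths(session, start_id, start_task_id, end_id, end_task_id):
    """Run the parameterized path query and collect the returned paths.

    Assumption: `session` exposes run(query, **parameters) and yields
    records indexable by column name.
    """
    result = session.run(
        FIND_PATHS,
        start_id=start_id,
        start_task_id=start_task_id,
        end_id=end_id,
        end_task_id=end_task_id,
    )
    return [record["path"] for record in result]
```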


Source: https://stackoverflow.com/questions/55493547/why-allshortestpath-so-slow
