Neo4j shortest path (BFS) distances query variants

Submitted by 不羁岁月 on 2019-12-07 19:31:01

Question


I do not know Neo4j, so bear with me. I have a big (1M nodes) undirected, unweighted graph. Assume I magically import this graph into Neo4j. Can the Neo4j query engine (Cypher) support the following types of queries?

  • Range queries: return all nodes (and their distances) that are within 3 hops of a specific node.
  • Return the shortest-path (BFS) distance (since the graph is undirected and unweighted) between a specific node and a set of nodes.
  • Return the shortest-path (BFS) distance between a specific node and all other graph nodes.

If those types of queries are actually possible straight from Cypher, without a recursive implementation, what kind of performance should I expect (a few seconds, many seconds, or minutes)?


Answer 1:


Yes, Cypher can do all of those things.

Range queries look something like this:

MATCH path=(a:MyNode {name: "Foo"})-[:myRelationshipType*1..20]->(b) RETURN b, length(path);

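That gives you all "b" nodes that are between 1 and 20 hops from the MyNode named "Foo", following relationships of the given type. The matched path is bound to a variable, and various Cypher functions can be applied to it: for each path, you can ask how long it is, what's in the middle, and so on. The Cypher refcard lists the functions that apply to paths, which gives a sense of what you can do with them. Note that a variable-length pattern matches every qualifying path, not only the shortest one, so the same node can come back several times with different path lengths; the smallest length per node is its hop distance.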
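For the 3-hop case asked about in the question, a minimal sketch could look like the following. The label `MyNode` and the `name` property are the placeholders used above; dropping the relationship type and the arrow (so direction is ignored, which suits an undirected graph) is an assumption about how the data was imported.

```cypher
// All nodes within 3 hops of "Foo", with their hop distance.
// -[*1..3]- has no arrow, so relationship direction is ignored.
MATCH path = (a:MyNode {name: "Foo"})-[*1..3]-(b)
WHERE a <> b                       // skip cycles that lead back to the start node
RETURN b, min(length(path)) AS distance
ORDER BY distance;
```

Grouping by `b` and taking `min(length(path))` collapses the multiple paths that can reach the same node into a single distance. On a dense graph the number of enumerated paths can grow quickly even at depth 3, so it is worth trying this on a sample first.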

Shortest-path searches are covered in the Neo4j documentation and use the shortestPath and allShortestPaths functions in Cypher.

If you wanted to get the shortest path from one node to basically everything else in the graph, you could do that in one query. A shortest-path query matches a "head" node and a "tail" node; here the head match would be your node of interest, and the tail match would be "any node in the graph". E.g. MATCH (a:MyNodeOfInterest), (b), p=shortestPath((a)-[*]->(b)) RETURN b, length(p). So it fits in one query, but finding shortest paths from one node to everything else in a one-million-node graph is going to take some time, no matter what graph database you use.
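The two shortest-path variants from the question could be sketched as follows. The label, property name, and target names ("Bar", "Baz") are placeholders, and the undirected pattern `-[*]-` is an assumption that matches the undirected graph described in the question.

```cypher
// Distance from one node to a chosen set of target nodes.
MATCH (a:MyNode {name: "Foo"}), (b:MyNode)
WHERE b.name IN ["Bar", "Baz"] AND a <> b
MATCH p = shortestPath((a)-[*]-(b))
RETURN b.name AS target, length(p) AS distance;

// Distance from one node to every other reachable node.
MATCH (a:MyNode {name: "Foo"}), (b)
WHERE a <> b                       // shortestPath may reject identical start and end nodes
MATCH p = shortestPath((a)-[*]-(b))
RETURN b, length(p) AS distance
ORDER BY distance;
```

Nodes that cannot be reached from the start node produce no row at all. Putting an upper bound on the pattern, for example `-[*..3]-`, turns the second query into another way of expressing the range query.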

In terms of performance, nobody can answer that question accurately in the abstract. It depends on many factors, such as:

  1. Total data volumes
  2. Indexing strategy
  3. Your use/abuse of node labels and relationship types
  4. Total path lengths
  5. JVM/memory/cache configurations.
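Since the numbers depend on your own data and setup, measuring is the practical way to find out. A rough sketch, again using the placeholder label and property from above: index the property used to look up the start node, then prefix a query with `PROFILE` to see the execution plan along with rows and database hits per operator.

```cypher
// Index the lookup property so the start node is found without scanning the label.
// This is the Neo4j 4.x+ syntax; older releases used: CREATE INDEX ON :MyNode(name)
CREATE INDEX FOR (n:MyNode) ON (n.name);

// PROFILE runs the query and reports the plan with per-operator rows and db hits.
PROFILE
MATCH (a:MyNode {name: "Foo"}), (b:MyNode {name: "Bar"})
MATCH p = shortestPath((a)-[*]-(b))
RETURN length(p) AS distance;
```

Running the same profiled query twice gives a feel for how much the page cache matters, since the first run typically has to load data from disk.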


Source: https://stackoverflow.com/questions/26162735/neo4j-shortest-path-bfs-distances-query-variants
