GraphX - Retrieving all nodes from a path

痴心易碎 submitted on 2019-11-26 22:10:34

Question


In GraphX, is there a way to retrieve all the nodes and edges that lie on paths of a given length?

More specifically, I would like to get all the 10-step paths from A to B. For each path, I would like to get the list of nodes and edges.

Thanks.


Answer 1:


Disclaimer: This is only intended to show GraphFrames' path-filtering capabilities.

Well, theoretically speaking, it is possible. You can use GraphFrames patterns to find paths. Let's assume your data looks as follows:

import org.graphframes.GraphFrame
// toDF needs spark.implicits._ in scope (spark-shell imports it automatically)

val nodes = "abcdefghij".map(c => Tuple1(c.toString)).toDF("id")

val edges = Seq(
  // Long path
  ("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "f"),
  // and some random nodes
  ("g", "h"), ("i", "j"), ("j", "i")
).toDF("src", "dst")

val gf = GraphFrame(nodes, edges)

and you want to find all paths that span 5 nodes (4 edges).

You can construct the following path pattern:

val path = (1 to 4).map(i => s"(n$i)-[e$i]->(n${i + 1})").mkString(";")
// (n1)-[e1]->(n2);(n2)-[e2]->(n3);(n3)-[e3]->(n4);(n4)-[e4]->(n5)

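and a filter expression to avoid cycles:

import org.apache.spark.sql.functions.col

// Every pair of nodes on the path must differ.
// =!= replaces the !== operator, which is deprecated since Spark 2.0.
val expr = (1 to 5).map(i => s"n$i").combinations(2).map {
  case Seq(i, j) => col(i) =!= col(j)
}.reduce(_ && _)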
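
To make the shape of that filter easier to see, here is a small sketch in plain Scala (no Spark needed) that renders the same pair enumeration as strings. The names `names` and `pairs` are only illustrative and are not part of the original answer:

```scala
// Same pair enumeration as the filter above, rendered as plain strings
// so the structure of the condition is visible without a Spark session.
val names = (1 to 5).map(i => s"n$i")
val pairs = names.combinations(2).map { case Seq(a, b) => s"$a != $b" }.toList

println(pairs.size)
// 10
println(pairs.take(4).mkString(" AND "))
// n1 != n2 AND n1 != n3 AND n1 != n4 AND n1 != n5
```

The number of comparisons grows quadratically with the path length: a 10-step path has 11 nodes and therefore 55 pairwise conditions.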

Finally, a quick check:

gf.find(path).where(expr).show
// +-----+---+---+-----+---+-----+---+-----+---+
// |   e1| n1| n2|   e2| n3|   e3| n4|   e4| n5|
// +-----+---+---+-----+---+-----+---+-----+---+
// |[a,b]|[a]|[b]|[b,c]|[c]|[c,d]|[d]|[d,e]|[e]|
// |[b,c]|[b]|[c]|[c,d]|[d]|[d,e]|[e]|[e,f]|[f]|
// +-----+---+---+-----+---+-----+---+-----+---+
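
The question asks for paths between two specific nodes. A minimal sketch of how the same approach might be restricted to fixed endpoints follows; `pathsBetween` is a hypothetical helper written for illustration, not a GraphFrames API, and it assumes vertex ids are strings as in the sample data:

```scala
import org.apache.spark.sql.{Column, DataFrame}
import org.apache.spark.sql.functions.col
import org.graphframes.GraphFrame

// Hypothetical helper (not part of GraphFrames): all cycle-free paths with
// exactly `steps` edges that start at vertex `from` and end at vertex `to`.
def pathsBetween(g: GraphFrame, from: String, to: String, steps: Int): DataFrame = {
  require(steps >= 1, "a path needs at least one edge")
  val names = (1 to steps + 1).map(i => s"n$i")
  // Motif: (n1)-[e1]->(n2);...;(n<steps>)-[e<steps>]->(n<steps+1>)
  val motif = (1 to steps).map(i => s"(n$i)-[e$i]->(n${i + 1})").mkString(";")
  // Pin both endpoints by vertex id.
  val endpoints: Column =
    col(s"${names.head}.id") === from && col(s"${names.last}.id") === to
  // Require all vertices on the path to be pairwise distinct.
  val acyclic: Column = names.combinations(2).map {
    case Seq(a, b) => col(a) =!= col(b)
  }.reduce(_ && _)
  g.find(motif).where(endpoints && acyclic)
}

// Usage with the sample graph: 4-step paths from "a" to "e".
// pathsBetween(gf, "a", "e", 4).show
```

For the 10-step case you would pass `steps = 10`. Each additional step adds another edge to the motif that has to be matched, so this is likely to be expensive on large graphs and is best treated as a demonstration rather than a production approach.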


Source: https://stackoverflow.com/questions/37417469/graphx-retrieving-all-nodes-from-a-path
