Efficient algorithm to find all the paths from A to Z?

前端 未结 4 1303
失恋的感觉
失恋的感觉 2020-12-23 15:01

With a set of random inputs like this (20k lines):

A B
U Z
B A
A C
Z A
K Z
A Q
D A
U K
P U
U P
B Y
Y R
Y U
C R
R Q
A D
Q Z

Find all the pat

4条回答
  •  天涯浪人
    2020-12-23 15:38

    I would proceed recursively where I would build a list of all possible paths between all pairs of nodes.

    I would start by building, for all pairs (X, Y), the list L_2(X, Y) which is the list of paths of length 2 that go from X to Y; that's trivial to build since that's the input list you are given.

    Then I would build the lists L_3(X, Y), recursively, using the known lists L_2(X, Z) and L_2(Z, Y), looping over Z. For example, for (C, Q), you have to try all Z in L_2(C, Z) and L_2(Z, Q) and in this case Z can only be R and you get L_3(C, Q) = {C -> R -> Q}. For other pairs, you might have an empty L_3(X, Y), or there could be many paths of length 3 from X to Y. However you have to be careful here when building the paths here since some of them must be rejected because they have cycles. If a path has twice the same node, it is rejected.

    Then you build L_4(X, Y) for all pairs by combining all paths L_2(X, Z) and L_3(Z, Y) while looping over all possible values for Z. You still remove paths with cycles.

    And so on... until you get to L_17576(X, Y).

    One worry with this method is that you might run out of memory to store those lists. Note however that after having computed the L_4's, you can get rid of the L_3's, etc. Of course you don't want to delete L_3(A, Z) since those paths are valid paths from A to Z.

    Implementation detail: you could put L_3(X, Y) in a 17576 x 17576 array, where the element at (X, Y) is is some structure that stores all paths between (X, Y). However if most elements are empty (no paths), you could use instead a HashMap>, where Pair is just some object that stores (X, Y). It's not clear to me if most elements of L_3(X, Y) are empty, and if so, if it is also the case for L_4334(X, Y).

    Thanks to @Lie Ryan for pointing out this identical question on mathoverflow. My solution is basically the one by MRA; Huang claims it's not valid, but by removing the paths with duplicate nodes, I think my solution is fine.

    I guess my solution needs less computations than the brute force approach, however it requires more memory. So much so that I'm not even sure it is possible on a computer with a reasonable amount of memory.

提交回复
热议问题