Why DFS and not BFS for finding cycle in graphs

Submitted by ∥☆過路亽.° on 2019-11-27 06:00:43
Mark Byers

Depth-first search is more memory efficient than breadth-first search because you can backtrack sooner. It is also easier to implement if you use the call stack, but this relies on the longest path not overflowing the stack.

Also, if your graph is directed, you have to remember not just whether you have visited a node, but also how you got there. Otherwise you might think you have found a cycle when in reality all you have is two separate paths A->B, which doesn't mean there is a path B->A. For example,

If you run a BFS starting from node 0, it will report that a cycle is present when in fact there is none.

With a depth first search you can mark nodes as visited as you descend and unmark them as you backtrack. See comments for a performance improvement on this algorithm.
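The mark-on-descent, unmark-on-backtrack idea can be sketched roughly as follows. This is only an illustration, assuming the graph is a dict mapping each node to a list of its successors; the function name `has_cycle_directed` and the extra `finished` set are assumptions added here, not something the answer specifies. The `finished` set is one common way to avoid re-exploring nodes, and it may or may not be the improvement the comments refer to.

```python
def has_cycle_directed(graph):
    """Return True if the directed graph contains a cycle.

    graph: dict mapping node -> iterable of successor nodes (assumed format).
    """
    on_path = set()    # nodes on the current DFS path (marked while descending)
    finished = set()   # nodes whose descendants were fully explored (assumed optimisation)

    def visit(node):
        if node in on_path:
            return True            # edge back to a node on the current path: cycle
        if node in finished:
            return False           # already explored, no cycle reachable from here
        on_path.add(node)          # mark on the way down
        for nxt in graph.get(node, ()):
            if visit(nxt):
                return True
        on_path.discard(node)      # unmark when backtracking
        finished.add(node)
        return False

    return any(visit(node) for node in list(graph))


# Two separate paths A->B (directly and via C) are not a cycle.
print(has_cycle_directed({"A": ["B", "C"], "C": ["B"], "B": []}))
print(has_cycle_directed({"A": ["B"], "B": ["C"], "C": ["A"]}))
```

Because this version recurses, it shares the caveat mentioned above: a very long path can overflow the call stack.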

For the best algorithm for detecting cycles in a directed graph you could look at Tarjan's strongly connected components algorithm.

  1. DFS is easier to implement
  2. Once DFS finds a cycle, the stack will contain the nodes forming the cycle. The same is not true for BFS, so you need to do extra work if you want to also print the found cycle. This makes DFS a lot more convenient.
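To illustrate the second point, here is a minimal sketch in which the DFS keeps its current path in a list, so that the cycle can be read straight off that path when a back edge is found. The name `find_cycle` and the dict-of-lists graph format are assumptions made for this example.

```python
def find_cycle(graph):
    """Return one cycle as a list of nodes (first node repeated at the end), or None.

    graph: dict mapping node -> iterable of successor nodes (assumed format).
    """
    path = []          # the current DFS path, in order (mirrors the recursion stack)
    on_path = set()    # same nodes as `path`, for O(1) membership tests
    finished = set()   # nodes that are fully explored

    def visit(node):
        if node in on_path:
            # Back edge: the cycle is the tail of the path starting at `node`.
            return path[path.index(node):] + [node]
        if node in finished:
            return None
        path.append(node)
        on_path.add(node)
        for nxt in graph.get(node, ()):
            cycle = visit(nxt)
            if cycle is not None:
                return cycle
        path.pop()
        on_path.discard(node)
        finished.add(node)
        return None

    for start in list(graph):
        cycle = visit(start)
        if cycle is not None:
            return cycle
    return None


print(find_cycle({"D": ["A"], "A": ["B"], "B": ["A"]}))  # ['A', 'B', 'A']
```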

A BFS could be reasonable if the graph is undirected (be my guest at showing an efficient algorithm using BFS that would report the cycles in a directed graph!), where each "cross edge" defines a cycle. If the cross edge is {v1, v2}, and the root (in the BFS tree) that contains those nodes is r, then the cycle is r ~ v1 - v2 ~ r (~ is a path, - a single edge), which can be reported almost as easily as in DFS.
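For the undirected case, the cross-edge idea might look something like the sketch below. It assumes a simple undirected graph stored as a symmetric dict of adjacency lists (no parallel edges); the function names are made up for this example. Instead of going all the way up to the root r, it stops at the lowest common ancestor of v1 and v2 in the BFS tree, which yields a simple cycle.

```python
from collections import deque


def bfs_find_cycle_undirected(graph):
    """Return one cycle in a simple undirected graph as a node list, or None."""
    parent = {}                          # BFS tree: node -> parent (None for roots)
    for root in graph:
        if root in parent:
            continue                     # already reached from an earlier root
        parent[root] = None
        queue = deque([root])
        while queue:
            node = queue.popleft()
            for nbr in graph[node]:
                if nbr not in parent:
                    parent[nbr] = node   # tree edge
                    queue.append(nbr)
                elif nbr != parent[node]:
                    # Non-tree edge {node, nbr}: it closes a cycle.
                    return _cycle_through(parent, node, nbr)
    return None


def _cycle_through(parent, v1, v2):
    """Join the tree paths from v1 and v2 up to their lowest common ancestor."""
    up1 = []
    x = v1
    while x is not None:                 # v1, parent(v1), ..., root
        up1.append(x)
        x = parent[x]
    index_in_up1 = {n: i for i, n in enumerate(up1)}
    up2 = []
    x = v2
    while x not in index_in_up1:         # climb from v2 until we meet v1's path
        up2.append(x)
        x = parent[x]
    return up1[:index_in_up1[x] + 1] + up2[::-1] + [v1]


triangle = {1: [2, 3], 2: [1, 3], 3: [1, 2]}
print(bfs_find_cycle_undirected(triangle))  # [2, 1, 3, 2]
```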

The only reason to use a BFS would be if you know your (undirected) graph is going to have long paths and small path cover (in other words, deep and narrow). In that case, BFS would require proportionally less memory for its queue than DFS' stack (both still linear of course).

In all other cases, DFS is clearly the winner. It works on both directed and undirected graphs, and it is trivial to report the cycles: just concatenate any back edge to the path from the ancestor to the descendant, and you get the cycle. All in all, it is much better and more practical than BFS for this problem.

If you place a cycle at a random spot in a tree, DFS will tend to hit the cycle when it's covered about half the tree, and half the time it will have already traversed where the cycle goes, and half the time it will not (and will find it on average in half the rest of the tree), so it will evaluate on average about 0.5*0.5 + 0.5*0.75 = 0.625 of the tree.

If you place a cycle at a random spot in a tree, BFS will tend to hit the cycle only when it has evaluated the layer of the tree at that depth. Thus, you usually end up having to evaluate the leaves of a balanced binary tree, which generally results in evaluating more of the tree. In particular, 3/4 of the time at least one of the two links appears in the leaves of the tree, and in those cases you have to evaluate on average 3/4 of the tree (if there is one link) or 7/8 of the tree (if there are two), so you're already up to an expectation of searching 1/2*3/4 + 1/4*7/8 = (12+7)/32 = 19/32 = 0.594... of the tree without even adding the cost of searching a tree with a cycle added away from the leaf nodes.

In addition, DFS is easier to implement than BFS. So it's the one to use unless you know something about your cycles (e.g. cycles are likely to be near the root from which you search, in which case BFS gives you an advantage).

A BFS that only marks visited nodes won't work for finding cycles in a directed graph. Consider A->B and A->C->B as two paths from A to B. After following one of the paths, BFS will have marked B as visited. When it continues along the other path, it will find the marked node B again and conclude that there is a cycle. Clearly there is no cycle here.
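The false positive is easy to reproduce. The sketch below deliberately implements the flawed check (treating any already-visited node as proof of a cycle) and runs it on the A->B, A->C->B graph; the function name and the dict-of-lists graph format are assumptions for this example.

```python
from collections import deque


def naive_bfs_reports_cycle(graph, start):
    """Flawed on purpose: treats 'reached a visited node' as 'found a cycle'."""
    visited = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt in visited:
                return True        # wrong conclusion for directed graphs
            visited.add(nxt)
            queue.append(nxt)
    return False


# A->B and A->C->B: two paths into B, but no cycle.
dag = {"A": ["B", "C"], "C": ["B"], "B": []}
print(naive_bfs_reports_cycle(dag, "A"))  # True, although the graph is acyclic
```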

Spiker

To prove that a graph is cyclic you just need to prove it has one cycle (a path that leads from a vertex back to itself, either directly or indirectly).

In DFS we take one vertex at a time and check whether it is part of a cycle. As soon as a cycle is found, we can skip checking the other vertices.

In BFS we need to keep track of the edges of many vertices simultaneously, and more often than not you only find out at the end whether there is a cycle. As the size of the graph grows, BFS requires more space, computation and time compared to DFS.

It sort of depends if you are talking about recursive or iterative implementations.

Recursive-DFS visits every node twice. Iterative-BFS visits every node once.

If you want to detect a cycle, you need to investigate the nodes both before and after you add their adjacencies -- both when you "start" on a node and when you "finish" with a node.

This requires more work in Iterative-BFS so most people choose Recursive-DFS.

Note that a simple implementation of Iterative-DFS with, say, std::stack has the same problem as Iterative-BFS. In that case, you need to place dummy elements into the stack to track when you "finish" working on a node.
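As a rough Python analogue of the dummy-element trick (using a plain list as the stack instead of std::stack), each node can be pushed together with a flag saying whether the entry is a "start" or a "finish" marker. The function name and the dict-of-lists graph format are assumptions for this sketch.

```python
def has_cycle_iterative(graph):
    """Iterative DFS cycle check for a directed graph using finish markers.

    graph: dict mapping node -> iterable of successor nodes (assumed format).
    """
    on_path = set()    # nodes that were started but not yet finished
    finished = set()   # nodes that are fully processed
    for root in list(graph):
        if root in finished:
            continue
        stack = [(root, False)]          # (node, is_finish_marker)
        while stack:
            node, is_finish = stack.pop()
            if is_finish:
                # Dummy entry: everything reachable from `node` is done.
                on_path.discard(node)
                finished.add(node)
                continue
            if node in on_path:
                return True              # reached a node that is still open: cycle
            if node in finished:
                continue                 # stale entry for an already processed node
            on_path.add(node)
            stack.append((node, True))   # dummy element marking the "finish" of node
            for nxt in graph.get(node, ()):
                stack.append((nxt, False))
    return False


print(has_cycle_iterative({"A": ["B", "C"], "C": ["B"], "B": []}))
print(has_cycle_iterative({"A": ["B"], "B": ["A"]}))
```

The finish marker sits below all entries pushed for the node's successors, so it is only popped once they have all been handled, which is what the recursive version gets for free when a call returns.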

See this answer for more details on how Iterative-DFS requires additional work to determine when you "finish" with a node (answered in the context of TopoSort):

Topological sort using DFS without recursion

Hopefully that explains why people favor Recursive-DFS for problems where you need to determine when you "finish" processing a node.
