# Best algorithm for detecting cycles in a directed graph

## Best algorithm for detecting cycles in a directed graph

### Question

What is the most efficient algorithm for detecting all cycles within a directed graph?

I have a directed graph representing a schedule of jobs that need to be executed, a job being a node and a dependency being an edge. I need to detect the error case of a cycle within this graph leading to cyclic dependencies.

### Accepted Answer

Tarjan's strongly connected components algorithm has `O(|E| + |V|)`

time complexity.

For other algorithms, see Strongly connected components on Wikipedia.

Read more... Read less...

Given that this is a schedule of jobs, I suspect that at some point you are going to *sort* them into a proposed order of execution.

If that's the case, then a *topological sort* implementation may in any case detect cycles. UNIX `tsort`

certainly does. I think it is likely that it is therefore more efficient to detect cycles at the same time as tsorting, rather than in a separate step.

So the question might become, "how do I most efficiently tsort", rather than "how do I most efficiently detect loops". To which the answer is probably "use a library", but failing that the following Wikipedia article:

has the pseudo-code for one algorithm, and a brief description of another from Tarjan. Both have `O(|V| + |E|)`

time complexity.

According to Lemma 22.11 of Cormen et al., *Introduction to Algorithms* (CLRS):

A directed graph G is acyclic if and only if a depth-first search of G yields no back edges.

This has been mentioned in several answers; here I'll also provide a code example based on chapter 22 of CLRS. The example graph is illustrated below.

CLRS' pseudo-code for depth-first search reads:

In the example in CLRS Figure 22.4, the graph consists of two DFS trees: one consisting of nodes *u*, *v*, *x*, and *y*, and the other of nodes *w* and *z*. Each tree contains one back edge: one from *x* to *v* and another from *z* to *z* (a self-loop).

The key realization is that a back edge is encountered when, in the `DFS-VISIT`

function, while iterating over the neighbors `v`

of `u`

, a node is encountered with the `GRAY`

color.

The following Python code is an adaptation of CLRS' pseudocode with an `if`

clause added which detects cycles:

```
import collections
class Graph(object):
def __init__(self, edges):
self.edges = edges
self.adj = Graph._build_adjacency_list(edges)
@staticmethod
def _build_adjacency_list(edges):
adj = collections.defaultdict(list)
for edge in edges:
adj[edge[0]].append(edge[1])
return adj
def dfs(G):
discovered = set()
finished = set()
for u in G.adj:
if u not in discovered and u not in finished:
discovered, finished = dfs_visit(G, u, discovered, finished)
def dfs_visit(G, u, discovered, finished):
discovered.add(u)
for v in G.adj[u]:
# Detect cycles
if v in discovered:
print(f"Cycle detected: found a back edge from {u} to {v}.")
# Recurse into DFS tree
if v not in finished:
dfs_visit(G, v, discovered, finished)
discovered.remove(u)
finished.add(u)
return discovered, finished
if __name__ == "__main__":
G = Graph([
('u', 'v'),
('u', 'x'),
('v', 'y'),
('w', 'y'),
('w', 'z'),
('x', 'v'),
('y', 'x'),
('z', 'z')])
dfs(G)
```

Note that in this example, the `time`

in CLRS' pseudocode is not captured because we're only interested in detecting cycles. There is also some boilerplate code for building the adjacency list representation of a graph from a list of edges.

When this script is executed, it prints the following output:

```
Cycle detected: found a back edge from x to v.
Cycle detected: found a back edge from z to z.
```

These are exactly the back edges in the example in CLRS Figure 22.4.

The simplest way to do it is to *do a depth first traversal (DFT) of the graph*.

If the graph has `n`

vertices, this is a `O(n)`

time complexity algorithm. Since you will possibly have to do a DFT starting from each vertex, the total complexity becomes `O(n^2)`

.

You have to maintain a *stack containing all vertices in the current depth first traversal*, with its first element being the root node. If you come across an element which is already in the stack during the DFT, then you have a cycle.

Start with a DFS: a cycle exists if and only if a *back-edge is discovered during DFS*. This is proved as a result of white-path theorum.

In my opinion, the most understandable algorithm for detecting cycle in a directed graph is the graph-coloring-algorithm.

Basically, the graph coloring algorithm walks the graph in a DFS manner (Depth First Search, which means that it explores a path completely before exploring another path). When it finds a back edge, it marks the graph as containing a loop.

For an in depth explanation of the graph coloring algorithm, please read this article: http://www.geeksforgeeks.org/detect-cycle-direct-graph-using-colors/

Also, I provide an implementation of graph coloring in JavaScript https://github.com/dexcodeinc/graph_algorithm.js/blob/master/graph_algorithm.js