R Igraph Find All Cycles

r igraph find all cycles

It is not directly a function in igraph, but of course you can code it up. To find a cycle, you start at some node, go to some neighboring node and then find a simple path back to the original node. Since you did not provide any sample data, I will illustrate with a simple example.

Sample data

## Sample graph
library(igraph)
set.seed(1234)
g = erdos.renyi.game(7, 0.29, directed=TRUE)
plot(g, edge.arrow.size=0.5)

Sample Graph

Finding Cycles

Let me start with just one node and one neighbor. Node 2 connects to Node 4. So some cycles may look like 2 -> 4 -> (Nodes other than 2 or 4) -> 2. Let's get all of the paths like that.

v1 = 2
v2 = 4
lapply(all_simple_paths(g, v2,v1, mode="out"), function(p) c(v1,p))
[[1]]
[1] 2 4 2
[[2]]
[1] 2 4 3 5 7 6 2
[[3]]
[1] 2 4 7 6 2

We see that there are three cycles starting at 2 with 4 as the second node. (I know that you said length greater than 3. I will come back to that.)

Now we just need to do that for every node v1 and every neighbor v2 of v1.

Cycles = NULL
for(v1 in V(g)) {
    for(v2 in neighbors(g, v1, mode="out")) {
        Cycles = c(Cycles, 
            lapply(all_simple_paths(g, v2,v1, mode="out"), function(p) c(v1,p)))
    }
}

This gives 17 cycles in the whole graph. There are two issues though that you may need to look at depending on how you want to use this. First, you said that you wanted cycles of length greater than 3, so I assume that you do not want the cycles that look like 2 -> 4 -> 2. These are easy to get rid of.

LongCycles = Cycles[which(sapply(Cycles, length) > 3)]

LongCycles has 13 cycles having eliminated the 4 short cycles

2 -> 4 -> 2
4 -> 2 -> 4
6 -> 7 -> 6
7 -> 6 -> 7

But that list points out the other problem. There still are some that you cycles that you might think of as duplicates. For example:

2 -> 7 -> 6 -> 2
7 -> 6 -> 2 -> 7
6 -> 2 -> 7 -> 6

You might want to weed these out. To get just one copy of each cycle, you can always choose the vertex sequence that starts with the smallest vertex number. Thus,

LongCycles[sapply(LongCycles, min) == sapply(LongCycles, `[`, 1)]
[[1]]
[1] 2 4 3 5 7 6 2
[[2]]
[1] 2 4 7 6 2
[[3]]
[1] 2 7 6 2

This gives just the distinct cycles.

Addition regarding efficiency and scalability

I am providing a much more efficient version of the code that I
originally provided. However, it is primarily for the purpose of
arguing that, except for very simple graphs, you will not be able
produce all cycles.

Here is some more efficient code. It eliminates checking many
cases that either cannot produce a cycle or will be eliminated
as a redundant cycle. In order to make it easy to run the tests
that I want, I made it into a function.

## More efficient version
FindCycles = function(g) {
    Cycles = NULL
    for(v1 in V(g)) {
        if(degree(g, v1, mode="in") == 0) { next }
        GoodNeighbors = neighbors(g, v1, mode="out")
        GoodNeighbors = GoodNeighbors[GoodNeighbors > v1]
        for(v2 in GoodNeighbors) {
            TempCyc = lapply(all_simple_paths(g, v2,v1, mode="out"), function(p) c(v1,p))
            TempCyc = TempCyc[which(sapply(TempCyc, length) > 3)]
          TempCyc = TempCyc[sapply(TempCyc, min) == sapply(TempCyc, `[`, 1)]
          Cycles  = c(Cycles, TempCyc)
        }
    }
    Cycles
}

However, except for very simple graphs, there is a combinatorial
explosion of possible paths and so finding all possible cycles is
completely impractical I will illustrate this with graphs much smaller
than the one that you mention in the comments.

First, I will start with some small graphs where the number of edges
is approximately twice the number of vertices. Code to generate my
examples is below but I want to focus on the number of cycles, so I
will just start with the results.

## ecount ~ 2 * vcount
Nodes   Edges   Cycles
10   21    15
20   41    18
30   65    34
40   87   424
50  108  3433
55  117 22956

But you report that your data has approximately 5 times as
many edges as vertices. Let's look at some examples like that.

## ecount ~ 5 * vcount
Nodes  Edges    Cycles
10      48        3511
12      61       10513
14      71      145745

With this as the growth of the number of cycles, using 10K nodes
with 50K edges seems to be out of the question. BTW, it took several
minutes to compute the example with 14 vertices and 71 edges.

For reproducibility, here is how I generated the above data.

set.seed(1234)
g10 = erdos.renyi.game(10, 0.2, directed=TRUE)
ecount(g10)
length(FindCycles(g10))

set.seed(1234)
g20 = erdos.renyi.game(20, 0.095 , directed=TRUE)
ecount(g20)
length(FindCycles(g20))

set.seed(1234)
g30 = erdos.renyi.game(30, 0.056 , directed=TRUE)
ecount(g30)
length(FindCycles(g30))

set.seed(1234)
g40 = erdos.renyi.game(40, 0.042 , directed=TRUE)
ecount(g40)
length(FindCycles(g40))

set.seed(1234)
g50 = erdos.renyi.game(50, 0.038 , directed=TRUE)
ecount(g50)
length(FindCycles(g50))

set.seed(1234)
g55 = erdos.renyi.game(55, 0.035 , directed=TRUE)
ecount(g55)
length(FindCycles(g55))

##########
set.seed(1234)
h10 = erdos.renyi.game(10, 0.55, directed=TRUE)
ecount(h10)
length(FindCycles(h10))

set.seed(1234)
h12 = erdos.renyi.game(12, 0.46, directed=TRUE)
ecount(h12)
length(FindCycles(h12))

set.seed(1234)
h14 = erdos.renyi.game(14, 0.39, directed=TRUE)
ecount(h14)
length(FindCycles(h14))

How to calculate the total from each cycles by using igraph in R?

There's probably a shorter way, but provided your data is as follows (where h is your table with amounts, and all_cycles list with cycles) -

h = data.frame(fr  = c('A','A','X','E','B','W','C','Y'),
               t   = c('B','E','Y','C','A','X','A','W'),
               Amt = c( 40, 30, 55, 10, 33, 78, 21, 90))

all_cycles <- list(
  c(A = 1, E = 3, C = 6, A = 1),
  c(A = 1, B = 4, A = 1),
  c(X = 2, Y = 7, W = 5, X = 2)
)

.. you could do:

library(dplyr)

data.frame(
  Nodes = unlist(lapply(all_cycles, names)),
  Path = unlist(lapply(seq_along(all_cycles), 
                       function(x) rep(paste(names(all_cycles[[x]]), collapse = " - "), 
                                       length(all_cycles[[x]]))))
  ) %>%
  group_by(Path) %>%
  mutate(fr = Nodes, t = lead(Nodes)) %>%
  left_join(h) %>%
  summarise(sumAmt = sum(Amt, na.rm = TRUE), numberOfEdges = sum(!is.na(t)))

To get:

# A tibble: 3 x 3
  Path          sumAmt numberOfEdges
  <fct>          <dbl>         <int>
1 A - B - A         73             2
2 A - E - C - A     61             3
3 X - Y - W - X    223             3

In case first value is always unnamed in the elements of your list, you could do:

data.frame(
  Nodes = unlist(lapply(all_cycles, names)),
  id = unlist(lapply(seq_along(all_cycles), 
                       function(x) rep(x, length(all_cycles[[x]])))), stringsAsFactors = FALSE
  ) %>%
  group_by(id) %>% mutate(Nodes = replace(Nodes, Nodes == "", last(Nodes)),
                          Path = paste(Nodes, collapse = " - ")) %>%
  mutate(fr = Nodes, t = lead(Nodes)) %>%
  group_by(Path, id) %>%
  left_join(h) %>%
  summarise(sumAmt = sum(Amt, na.rm = TRUE), numberOfEdges = sum(!is.na(t)))

Counting 4 and 6-cycles in bipartite R igraph

You can count cycles using igraph's subisomorphism functions. For example, counting 5-cycles:

set.seed(123)
g <- sample_gnm(10,30)

> count_subgraph_isomorphisms(make_ring(5), g)
[1] 3500

This overcounts 5-cycles by 10 as each 5-cycle has 10 automorphisms. Generally, n-cycles have 2n automorphisms. Thus the true count is 350.

> automorphisms(make_ring(5))$group_size
[1] "10"

It will also be quite slow.

With the newly released igraph 1.3.0, we can use the motif finder functionality, which has now been extended to work with 5 and 6 vertex undirected graphs.

Since motifs are induced subgraphs, but we are looking for all cycles, not just induced ones, we first need to check how many 5-cycles each 5-motif has. Note that there are 34 non-isomorphic graphics on 5 vertices, as you can look up e.g. in OEIS. Notice also the division by 10, as before, to avoid overcounting.

cycle5_per_motif <- sapply(0:33, function (c) count_subgraph_isomorphisms(make_ring(5), graph_from_isomorphism_class(5, c, directed=F)) / 10)

With this information, we can run the motif finder and compute the final cycle count:

> sum(motifs(g, 5) * cycle5_per_motif, na.rm=T)
[1] 350

This is much faster than using count_subgraph_isomorphisms, but it only works up to 6-cycles at this moment.

Number of chordless cycles of length four in a graph in R

TL;DR: the answer is 0, because the graph is cordal.

The graph itself looks like this:

Graph

From the look of this graph, I'm not very optimistic we will find a chordless cycle of length four. And this can be quickly confirmed by this command:

is.chordal(g)

It returns TRUE, which means that this graph is chordal. In other words, "each of its cycles of four or more nodes has a chord".

I attempted anyway to enumerate all the chordless cycles of lenght 4. Since I don't know any smart way of doing it, I will do it with a few simpler steps:

Find all the simple paths from one node of the graph to any other;
Keep only the paths of length 4;
Check if the last node of this path is connected to the starting node, if it is, then keep this path;
Extract all the subgraphs corresponding to the paths I kept;
Check if they are chordal.

Each of these steps can be performed with a function from the igraphpackage.

res <- NULL
for (vi in V(g)) {
  pi <- all_simple_paths(g, from=vi, to = V(g))
  pi_4 <- pi[sapply(pi, length)==4]
  last_v <- sapply(pi_4, "[", 4)
  pi_4_c <- pi_4[sapply(last_v, function(v) are.connected(g, 1, v))]
  subgi <- lapply(pi_4_c, function(v) induced.subgraph(g, v))
  ci <- sapply(subgi, function(g) is_chordal(g)$chordal)
  res[[vi]] <- subgi[!ci]
}
res_with_dupl <- data.frame(t(sapply(res, V)))
unique(res_with_dupl)

Again, the result is that there is no chordless cycle of length 4 in this graph (res is empty).

I really look forward to reading the other answers!

R Igraph Find All Cycles