How to Detect a Loop in a Linked List

How to detect a loop in a linked list?

You can make use of Floyd's cycle-finding algorithm, also known as tortoise and hare algorithm.

The idea is to have two references to the list and move them at different speeds. Move one forward by 1 node and the other by 2 nodes.

If the linked list has a loop they
will definitely meet.
Else either of
the two references(or their next)
will become null.

Java function implementing the algorithm:

boolean hasLoop(Node first) {

    if(first == null) // list does not exist..so no loop either
        return false;

    Node slow, fast; // create two references.

    slow = fast = first; // make both refer to the start of the list

    while(true) {

        slow = slow.next;          // 1 hop

        if(fast.next != null)
            fast = fast.next.next; // 2 hops
        else
            return false;          // next node null => no loop

        if(slow == null || fast == null) // if either hits null..no loop
            return false;

        if(slow == fast) // if the two ever meet...we must have a loop
            return true;
    }
}

Finding loop in a singly linked-list

You can detect it by simply running two pointers through the list, this process is known as the tortoise and hare algorithm after the fable of the same name:

First off, check if the list is empty (head is null). If so, no cycle exists, so stop now.
Otherwise, start the first pointer tortoise on the first node head, and the second pointer hare on the second node head.next.
Then loop continuously until hare is null (which may be already true in a one-element list), advancing tortoise by one and hare by two in each iteration. The hare is guaranteed to reach the end first (if there is an end) since it started ahead and runs faster.
If there is no end (i.e., if there is a cycle), they will eventually point to the same node and you can stop, knowing you have found a node somewhere within the cycle.

Consider the following loop which starts at 3:

head -> 1 -> 2 -> 3 -> 4 -> 5
                  ^         |
                  |         V
                  8 <- 7 <- 6

Starting tortoise at 1 and hare at 2, they take on the following values:

(tortoise,hare) = (1,2) (2,4) (3,6) (4,8) (5,4) (6,6)

Because they become equal at (6,6), and since hare should always be beyond tortoise in a non-looping list, it means you've discovered a cycle.

The pseudo-code will go something like this:

def hasLoop (head):
  return false if head = null           # Empty list has no loop.

  tortoise = head                       # tortoise initially first element.
  hare = tortoise.next                  # Set hare to second element.

  while hare != null:                   # Go until hare reaches end.
    return false if hare.next = null    # Check enough left for hare move.
    hare = hare.next.next               # Move hare forward two.

    tortoise = tortoise.next            # Move tortoise forward one.

    return true if hare = tortoise      # Same means loop found.
  endwhile

  return false                          # Loop exit means no loop.
enddef

The time complexity for this algorithm is O(n) since the number of nodes visited (by tortoise and hare) is proportional to the number of nodes.

Once you know a node within the loop, there's also an O(n) guaranteed method to find the start of the loop.

Let's return to the original position after you've found an element somewhere in the loop but you're not sure where the start of the loop is.

head -> 1 -> 2 -> 3 -> 4 -> 5
                  ^         |
                  |         V
                  8 <- 7 <- 6
                             \
                              x (where hare and tortoise met).

This is the process to follow:

Advance hare and set size to 1.
Then, as long as hare and tortoise are different, continue to advance hare, increasing size each time. This eventually gives the size of the cycle, six in this case.
At this point, if size is 1, that means you must already be at the start of the cycle (in a cycle of size one, there is only one possible node that can be in the cycle so it must be the first one). In this case, you simply return hare as the start, and skip the rest of the steps below.
Otherwise, set both hare and tortoise to the first element of the list and advance hare exactly size times (to the 7 in this case). This gives two pointers that are different by exactly the size of the cycle.
Then, as long as hare and tortoise are different, advance them both together (with the hare running at a more sedate pace, the same speed as the tortoise - I guess it's tired from its first run). Since they will remain exactly size elements apart from each other at all times, tortoise will reach the start of the cycle at exactly the same time as hare returns to the start of the cycle.

You can see that with the following walkthrough:

size  tortoise  hare  comment
----  --------  ----  -------
   6         1     1  initial state
                   7  advance hare by six
             2     8  1/7 different, so advance both together
             3     3  2/8 different, so advance both together
                      3/3 same, so exit loop

Hence 3 is the start point of the cycle and, since both those operations (the cycle detection and cycle start discovery) are O(n) and performed sequentially, the whole thing taken together is also O(n).

If you want a more formal proof that this works, you can examine the following resources:

a question on our sister site;
the Wikipedia cycle detection page; or
"The Tortoise and the Hare Algorithm" by Peter Gammie, April 17, 2016.

If you're simply after support for the method (not formal proof), you can run the following Python 3 program which evaluates its workability for a large number of sizes (how many elements in the cycle) and lead-ins (elements before the cycle start).

You'll find it always finds a point where the two pointers meet:

def nextp(p, ld, sz):
    if p == ld + sz:
        return ld
    return p + 1

for size in range(1,1001):
    for lead in range(1001):
        p1 = 0
        p2 = 0
        while True:
            p1 = nextp(p1, lead, size)
            p2 = nextp(nextp(p2, lead, size), lead, size)
            if p1 == p2:
                print("sz = %d, ld = %d, found = %d" % (size, lead, p1))
                break

How to find the node at the beginning of cycle in linked list?

Unless the book specifically assumes the existence of a positive such k, k can also be zero.

Here's an informal argument for the case k=0:

Assume that the entire list is a loop.

If the fast runner is twice as fast as the slow runner, they will collide when the slow runner has completed one loop, and the fast runner has completed two.

If you then start the slow runner from the beginning again, the two will collide immediately on the first node.

Finding loop in linked list complexity?

Since fast pointer is moving at double speed than slow, therefore distance between the two pointers will always increase by 1 (initially the distance between them was 2).

Now assume that loop exists and when slow pointer entered the loop, distance between slow and fast was say "x" and the length of the loop is "d". So now the next time slow and fast will move, distance between them will become x+1 and after that on next move it will be x+2, then x+3 and so on. Fast and slow will meet whenever the distance between them will be a multiple of d. So by increasing distance between them by 1 we are checking at every value.

Now consider the case when fast is moving three times faster, then at each step distance between them will increase by 2 i.e x, x+2, x+4 and so on. So the two pointers might not meet each other and cross each other and if that happens your program will never terminate.
Similar is the case if speed of fast pointer is four times, five times etc.

Detecting start of a loop in two intersecting singly Linked Lists

The problem is answered with proper explanation here. The only difference with your requirement is, you have two singly linked lists with a loop and the solution discussed here only considered a singly linked list. But, two linked list actually do not have any impact here.

Although you asked for a very specific case (e.g., two intersecting singly Linked Lists), but I would like to make a more general one. So, with two singly linked list based on whether they intersect or not, and whether they form cycle or not, you can have the following cases:

Case-1: Two linked lists have separate chain, they never meet at any common place, and they have two cycles in their respective chains.
Case-2: Two linked lists have separate chain, they never meet at any common place, and they have either no-cycle in their respective chains or one of them have a cycle.
Case-3: Two linked lists intersect, and they have formed a cycle (your example case).
Case-4: Two linked lists intersect, and they do not form any cycle.

Now consider you have a function where you pass a pointer as a function parameter representing the head of a singly linked. This function will run Floyd's cycle detection algorithm to check whether the singly linked list form any cycle. If it found any cycle, it always return the starting pointer of the cycle. Otherwise, return a null pointer. You can check @CEGRD's answer from here to know the details about how you can write this function.

After that, you can simply call this function for every singly linked list separately and you will get two starting nodes' pointer of cycle (if exist, otherwise you will get null pointer). Now, if these two returned nodes' pointer are same (and make sure both of them are not null-pointer), means a cycle exist for the combination of this two linked lists. The process should be like this:

ListNode *detectCycle(ListNode *head) {
    ListNode *walker = head;
    ListNode *runner = head;

    while(runner != nullptr && runner->next != nullptr) {
        walker = walker->next;
        runner = runner->next->next;

        if(walker == runner) {
            ListNode *seeker = head;

            while(seeker != walker) {
                seeker = seeker->next;
                walker = walker->next;
            }
            return walker;
        }
    }

    return nullptr;
}

// headA and headB are the head pointer of cycle A and B respectively.
ListNode *detectCycleInTwoLinkedList(ListNode *headA, ListNode *headB) {
    ListNode * ca = detectCycle(headA)
    ListNode * cb = detectCycle(headB)

    if(ca == nullptr || cb == nullptr) return nullptr;
    return (ca == cb) ? ca : nullptr;
}

Although I didn't actually run this code for any case, but I hope it will work without any error. You can try for the different cases listed above and comment here.

How to find loop-node of a linkedlist with a loop?

There is a simple approach using two pointers.First pointer increments by one and second by twice the speed of slow pointer.

So linked list in your case is actually A->B->C->D->E->F->C meaning F points back again to C.So approach is like below

1.Keep on incrementing the two pointers till they match.So in above case we would have these set of steps

Slow Pointer: A B C D E

Fast Pointer: A C E C E

So we stop at E and which indicates that there is a loop.Now we need to find the loop node.

Now from E move the slow pointer to the beginning of the linked list and create a new pointer which points at E and also increments by 1.The point at which these two pointer meet is actually the loop node.So in our case

Pointer From the Beginning: A B C New Pointer: E F C

So as you see they meet at C and we are done with finding the loop node in a linked list.

Update:
For the Mathematical proof of this approach please refer this wonderful question and see @Jim Lewis answer alongwith all the comments beneath the answer. Explain how finding cycle start node in cycle linked list work?

Follow up on detecting loop in linked list

It's because to

mark the nodes as visited

you need either extra space in the nodes themselves to do it, or an auxiliary data structure whose size will increase with that of the list, whereas the two-pointer solutions require only enough extra space for one more pointer.

[EDITED to add:] ... And, perhaps, also because the two-pointer solutions are clever and people like clever solutions to things.

How to Detect a Loop in a Linked List