Archive for the 'Level: Intermediate' Category

A Turtle and a Rabbit to Find Loops in Linked Lists

Sunday, February 4th, 2007

One of the challenges that from time to time you may find is a program that has to deal with a linked list having a loop.

While this is not a very common situation, it can happen, desired or not, as result of a bug, the particular usage of the linked list in the program or corrupted data.

A classic example of such situation caused maliciously by viruses, is the case of directory loops in the old DOS. As in many other OSs, in DOS directories where nodes of a tree; each node, or directory, had a pointer to each of its sub-directories. A single path such as “C:\a\b\c\d\e” could be seen as a linked list a->b->c->d->e. This is not uncommon, but one thing was particularly bad in DOS 3.1. Nothing in the system was able to deal with directory loops, no even the most advanced utilities.

If you took a disk editor and connected a directory, say “e”, with anything on an upper level, say “b”, you’d cause serious issues. No utility at the time was able to catch such an issue, which was guaranteed to crash any directory crawling program, including CHKDSK. You could keep re-entering the loop ending up with a path like “C:\a\b\c\d\e\b\c\d\e\b\c\d\e”. Some old viruses took advantage of this weakness to create all sort of troubles.

Anyhow, how do you write an algorithm to find out if a list has a loop? Usually people tend to go for a solution like traversing the list, marking each node as they go and, if they found an already marked node, than there is a loop. That solution works, but once you are done you have to clear that mark on all the marked nodes. In some cases this might not be desirable or even possible.

I’d like to present a small cleaver algorithm that detects loops in a list without having to alter the list in any way. I like to call this algorithm “The Rabbit and The Turtle Algorithm”, but I wonder if it has another name (does it?)

The idea is simple: you use two runners (pointers) to traverse the list. One moves one step at the time (turtle), and one moves two steps at the time (rabbit). The two keep running until the rabbit finds the end of the list, or until the rabbit passes the turtle. If the rabbit arrives at the end of the list, there is obviously no loop. If the rabbit passes the turtle, than there is a loop.

Simple, isn’t it? Here a simple implementation in Java:

// ListNode is the type of a node.
// head is a ListNode, head of the list.
// If node is of type ListNode, node.next is
// the pointer to the next element in the list.

01: public boolean hasLoop() {
02:
03:    if ((head == null) ||
04:        (head.next == null)) return false;
05:
06:    ListNode lnT = head;        // Turtle
07:    ListNode lnR = head.next; // Rabbit
08:
09:    while ( true ) {
10:       if ( lnT == lnR )
               return true;
11:       if ( (lnR = lnR.next) == null )
               return false;
12:       lnT = lnT.next;
13:       if ( lnT == lnR )
               return true;
14:       if ( (lnR = lnR.next) == null )
               return false;
15:    }
16 }

Lines 03-04 deal with the cases of (1) an empty list and (2) a list with only one node. In either case there is no loop for sure.
Lines 06-07: talk about injustice! The rabbit starts in front of the turtle already.
Line 10: checks if the rabbit encountered the turtle. If it did, we found a loop.
Line 11: Moves the rabbit forward one step. If the rabbit reached the end of the list, no loop exists.
Line 12: Moves the turtle forward.
Line 13: If the rabbit found the turtle, we have a loop!
Line 14: Moves the rabbit an extra step forward. If the rabbit reached the end of the list, no loop exists.
Line 09: Repeats until one of the conditions above is satisfied.

Hope this is useful.

Nulling Pointers With memset

Wednesday, January 24th, 2007

Let’s say that you have the following structure:

struct
{
     char *sPtr1;
     char *sPtr2;
} S_something;

In C it is very tempting to initialize such a structure with something like:

   S_something s;
   memset(&s,0,sizeof(S_something));

That code seems OK at first, but there is a subtle issue. Can you spot it?

The problem is that this code is not always portable!! Most of the time you’ll be fine, however depending on the system (in rare cases, I admit) a null pointer might not be represented by a bitwise zero, but something else. In fact in some architectures 0 is a valid address (Example: IBM z/Architecture)

When you use memset to initialize the structure above, you are forcing the pointers in the structure to be set to bitwise zero. As noted that might not be equivalent to null, creating all sort of issues.

Conversely, when you assign a pointer to “0″ or to “null” on one of these systems, the compiler does the conversion automatically. memset doesn’t.

I’ll give you an example. The following code:

   S_something s;
   memset(&s,0,sizeof(S_something));
   if (s.sPtr1==0) printf("Good");
       else printf("Bad!");

Could potentially print “Bad!”.

Solution: simply initialize the structure by hand:

   S_something s;
   s.sPtr1 = 0;
   s.sPtr2 = 0;

This is guaranteed to work on all ANSI C compliant compilers.