Recursion Patterns
| Contents |

You can download the functions defined or used in this chapter with the following commands:

wget troll.cs.ua.edu/ACP-C/recursionPatterns.c wget troll.cs.ua.edu/ACP-C/recursionPatterns.h

These files will help you run the test code listed in this chapter.
You will also need to download the integer *list* module.

Recursion, arrays, and lists go hand-in-hand. What follows
are a number of recursive patterns involving arrays and lists
that you should be able to recognize and implement.
Recursion for lists makes heavy use of the *head* and *tail* functions
from the previous chapter.
As all the examples in this chapter use lists of integer values.
We will also make use of
the following analogs for arrays:

list expression | equivalent array expression |

`head(items)` | `items[0]` |

`tail(items)` | `items+1` |

`items == 0` | `size == 0` |

The last expression tests whether the list (or array) is "empty".
Note that an array name points to the first slot in the array.
Suppose the name of the array is *items*. Then
*items* + 1 points to the second slot in the array.
If the size of the array *items* is *n*, the the size of
the array pointed to by *items* + 1 must be *n - 1*.
This is why adding one to the array name is equivalent to
taking the tail of a list.

The *counting* pattern is used to count the number of items
in a collection. If the collection is empty, then its count of items
is zero.
The following function
counts and ultimately returns the number of items in an array:

int countList(Node *items) { if (items == 0) //base case return 0; else return 1 + countList(tail(items)); }

The functions works on the observation that if you count the number of items in the tail of a list, then the number of items in the entire array is one plus that number. The extra one added in accounts for the head item that omitted when the tail was counted.

The array version is similar:

int countArray(int *items,int size) { if (size == 0) //base case return 0; else return 1 + countArray(items+1,size-1); }

As with using a loop, the counting pattern for arrays is a bit useless as we need to the the size of the array before we can determine its size. However, the pattern is a basis for all other recursive patterns for arrays:

- recursive functions on an array need the size of the array
- the base case tests that the size has reached zero
- taking the "tail" of the array means the size needs to be reduced by one

The *accumulate* pattern is used to combine all the values in
a collection.
The following function performs a summation of the list values:

int sumList(Node *items) { if (items == 0) //base case return 0; else return head(items) + sumList(tail(items)); }

Recall that the *head* function calls
*getNodeValue* on the node at the front of a list of items.
Note that the only difference between the *count* function and
the *sum* function is the recursive case adds in the value of
the head item, rather than just counting the head item.
That the functions *countList* and *sumList* look similar
is no coincidence. In fact, most recursive functions, especially
those working on collections, look very
similar to one another.

The equivalent *sumArray* is left as an exercise.

A variation on the *counting* and *accumulate*
patterns involves *filtering*. When filtering,
we use an additional `if`

statement to decide whether
or not we should count the item, or in the case of accumulating,
whether or not the item ends up in the accumulation.

Suppose we wish to count the number of even items in a list:

int countListEvens(Node *items) { if (items == 0) //base case return 0; else if (head(items) % 2 == 0) return 1 + countListEvens(tail(items)); else return 0 + countListEvens(tail(items)); }

The base case states that there are zero even numbers in an empty list. The first recursive case simply counts the head item if it is even and so adds 1 to the count of even items in the remainder of the list. The second recursive case does not count the head item (because it is not even) and so adds in a 0 to the count of the remaining items. Of course, the last return would almost always be written as:

return countListEvens(tail(items));

The array version is similar:

int countArrayEvens(int *items,int size) { if (size == 0) //base case return 0; else if (items[0] % 2 == 0) //head item always at index 0 return 1 + countArrayEvens(items+1,size-1); else return countArrayEvens(items+1,size-1); }

As another example of *filtered counting*, we can
pass in a value and count how many times that
value occurs:

int occurrencesInList(int target,Node *items) { if (items == 0) return 0; else if (head(items) == target) return 1 + occurrencesInList(target,tail(items)); else return occurrencesInList(target,tail(items)); }

An example of a *filtered-accumulation* would be
to sum the even-numbered integers in an array:

int sumArrayEvens(int *items,int size) { if (size == 0) return 0; else if (isEven(items[0])) return items[0] + sumArrayEvens(items+1,size-1); else return sumArrayEvens(items+1,size-1); }

The extreme and extreme index patterns are a bit more difficult to implement recursively, due to the need to carry more information around while making recursive calls. For example, here is an example of the extreme pattern for arrays. We use a helper function that carries the extra information around:

int getMax(int *a,int size) { return getMaxHelper(a,size,a[0],1); //start looking at index 1 }

The third argument of the helper function is the assumed largest value. The fourth argument is the index at which one starts looking for a better max. Here is the helper function:

int getMaxHelper(int *a,int size,int best,int i) { if (i == s) //no more indices to check return best; else if (a[i] > best) //found a bigger number! return getMaxHelper(a,s,a[i],i+1) else //best is still best return getMaxHelper(a,s,best,i+1) }

The first recursive case finds a better "best" number at the
current index *i*, so it
replaces the old *best* with the better candidate in the recursive
call. The second recursive case doesn't replace *best*. In both
recursive cases, the index is incremented so that subsequent
elements can be checked.

The extreme index patter for arrays is similar:

int getMaxIndex(int *a,int size) { return getMaxIndexHelper(a,size,0,1); //start looking at index 1 }

In the case of *getMaxIndex*, the index of the best candidate so far
is passed around, rather than its value. The implementation of
*getMaxIndexHelper* is left as an exercise.

A similar strategy is used to find extreme values in a linked list.

A special case of a filtered-accumulation is called *filter*.
Instead of summing the filtered items (for example), we collect
the filtered items into a collection. The new collection is said to be
a *reduction* of the original collection.

Suppose we wish to extract the even numbers from a list. The
structure of the
code looks very much like the *sumEvens* function in the previous
section, but instead of adding in the desired item,
we join the item to the
reduction of the tail of the array:

Node * extractListEvens(Node *items) { if (items == 0) return 0; else if (isEven(head(items))) return join(head(items),extractListEvens(tail(items))); else return extractListEvens(tail(items)); }

Given a list of integers,
*extractEvens* returns a (possibly empty)
list of the even numbers:

//test (compile with ilist.c and recursionPatterns.c) #include "ilist.h" #include "recursionPatternsPatterns.h" int numbers[] = {4,2,5,2,7,0,8,3,7}; Node *a; a = arrayToList(numbers,sizeof(numbers)/sizeof(int)); displayList(extractEvens(a));

Running this code yields:

{4}{2}{2}{0}{8}

Recall that, in the previous chapter, our attempt at filtering with a loop left the collected items in the opposite order. A recursive implementation preserves the order of values found in the original list.

With *extractEvens* defined,
then *sumListEvens* could be written very simply as:

int sumListEvens(Node *items) { return sumList(extractListEvens(items)); }

Unlike many patterns, recursive filtering is best suited for lists; recursive filtering of arrays is very problematic and will not be dealt with here.

*Mapping* is a task closely coupled with that
of reduction, but rather than collecting
certain items, as with the *filter* pattern, we
collect all the items. As we collect, however,
we transform each item as
we collect it. The basic pattern looks like this:

Node * mapList(int (*f)(int),Node *items) { if (items == 0) return 0; else { int v = f(head(items)); return join(v,mapList(f,tail(items))); } }

Here, *f* is used to transform each item in the
recursive step.
As with mapping using a loop, we use a function pointer as the
type of the formal parameter *f*.

Suppose we wish to subtract one from each element in an array. First we need a transforming function that reduces its argument by one:

int decrement(int x) { return x - 1; }

Now we can "map" the *decrement* function over an array of numbers:

//test (compile with ilist.c and recursionPatterns.c) #include "ilist.h" #include "recursionPatterns.h" int numbers[] = {4,3,7,2,4,3,1}; Node *a = arrayToList(numbers,sizeof(numbers) / sizeof(int)); Node *b = mapList(decrement,a); displayList(b);

The code yields the following output:

{3}{2}{6}{1}{3}{2}{0}

We see that each value in the resulting list is one less than
the corresponding number in the original *numbers* array.

Both map and filtering (reduce) are used heavily by Google in programming their search strategies.

The *search* pattern is a slight variation of *filtered-counting*.
Suppose we wish to see if a value is present in a list. We can
use a filtered-counting approach and if the count is greater than
zero, we know that the item was indeed in the list:

int findInList(int target,Node *items) { return occurrencesInList(target,items) > 0; }

In this case, *occurrencesInList*
helps *find* do its job. We call
such functions, naturally, *helper functions*.
We can improve the efficiency of *find* by having it
perform the search, but short-circuiting the
search once the target item is found. We do this
by turning the first recursive case into a second base case:

int findInList2(int target,Node *items) { if (items == 0) //empty list, not found return 0; else if (head(items) == target) //short-circuit! return 1; else return findInList2(target,tail(items)); }

When the list is empty, we return false because if the item had been list, we would have hit the second base case (and returned true) before hitting the first. If neither base case hits, we simple search the remainder of the list (the recursive case). If the second base case never hits, the first base case eventually will.

Sometimes, we wish to combine two lists.
This is easy to do with the *append* function:

append(list1,list2)

This places the first element in the second list after the last
element in the first list.
However, many times we wish to intersperse the elements from the
first list with the elements in the second list. This is known
as a *shuffle*, so named since it is similar to shuffling a deck of
cards. When a deck of cards is shuffled,
the deck is divided in two halves (one half is akin to the first
list and the other half is akin to the second list). Next the
two halves are interleaved back into a single deck (akin to
the resulting third list).
Note that while appending is destructive, in that it changes
the first list, shuffling is non-destructive, in the neither
of the shuffled lists are modified.

We can use recursion to shuffle two lists. If both lists are exactly
the same length, the recursive function is easy to implement
using the *accumulate* pattern:

Node * shuffle(Node *list1,Node *list2) { if (list1 == 0) return 0; else { Node *rest = shuffle(tail(list1),tail(list2)); return join(head(list1),join(head(list2),rest)); } }

If *list1* is empty
(which means *list2* is empty since they have the same number of elements),
the function returns the empty, since shuffling nothing together
yields nothing.
Otherwise, we shuffle the tails of the lists (the result is stored
in the temporary variable *rest*), then join the
the first elements of each list to the shuffled remainders.

If you have ever shuffled a deck of cards, you will know that it is
rare for the deck to be split exactly in half prior to the shuffle.
Can we amend our shuffle function to deal with this problem? Indeed,
we can,
by simply placing the extra cards (list items) at the end
of the shuffle. We don't know which list (*list1* or *list2*)
will go empty first,
so we test for each list becoming empty in turn:

Node * shuffle2(Node *list1,Node *list2) { if (list1 == 0) return list2; else if (list2 == 0) return list1; else { Node *rest = shuffle2(tail(list1),tail(list2)); return join(head(list1),join(head(list2),rest)); } }

If either list is empty, we return the other. Only if both are not empty do we execute the recursive case.

One can make shuffling even simpler by manipulating the recursive call that shuffles the remainder of the lists. If we flip the order of the lists in the recursive call, we only need to deal with the first list becoming empty:

Node * shuffle3(Node *list1,Node *list2) { if (list1 == 0) return list2; else return join(head(list1),shuffle3(list2,tail(list1))); }

Note that in the recursive call, we take the tail of *list1* since
we joined the head of *list1* to the resulting shuffle.
The base case returns *list2* because if *list1* is empty,
the "shuffle" of
the remaining elements is just *list2*.
Also, even if *list1* and *list2* are both empty,
the base case test
returns the correct value of zero, the null pointer or empty list.

Finally, note how much simpler it is to shuffle two lists using recursion as compared to shuffling two arrays with loops.

With the *shuffle* pattern, we always took the head elements from both
lists at each step in the shuffling process.
Sometimes, we wish to place a constraint of the choice of elements.
For example, suppose the two lists to be combined are sorted
and we wish the resulting list to be sorted as well. The
following example shows that shuffling does not always work:

//test (compile with ilist.c and recursionPatterns.c) #include "ilist.h" #include "recursionPatterns.h" int n1[6] = {1,4,6,7,8,11}; int n2[4] = {2,3,5,9}; Node *c = shuffle3(arrayToList(n1,6),arrayToList(n2,4)); displayList(c);

The result is shuffled, but not sorted:

{1}{2}{4}{3}{6}{5}{7}{9}{8}{11}

The *merge* pattern is used to ensure the resulting list is
sorted and is based upon the *filtered-accumulate* pattern.
We only accumulate an item *if* it is the smallest item
in the two lists:

Node * mergeList(Node *list1,Node *list2) { if (list1 == 0) //list1 is empty, return list2 return list2; else if (list2 == 0) //list2 is empty, return list1 return list1; else if (head(list1) < head(list2)) return join(head(list1),mergeList(tail(list1),list2)); else //the head of list2 is smaller return join(head(list2),mergeList(list1,tail(list2))); }

As with *shuffle2*, we don't know which list will become empty
first, so we check both in turn.

In the first recursive case, the first element of the first list is smaller than the first element of the second list. So we accumulate the first element of the first list and recur, sending the tail of the first list because we have used/accumulated the head of that list. The second list we pass unmodified, since we did not use/accumulate an element from the second list.

In the second recursive case, we implement the symmetric version of the first recursive case, focusing on the second list rather than the first.

Testing our function:

//test (compile with ilist.c and recursionPatterns.c) #include "ilist.h" #include "recursionPatterns.h" int n1[6] = {1,4,6,7,8,11}; int n2[4] = {2,3,5,9}; Node *c = mergeList(arrayToList(n1,6),arrayToList(n2,4)); displayList(c);

yields:

{1}{2}{3}{4}{5}{6}{7}{8}{9}{11}

If a recursive function mistakenly never makes the problem
smaller, the problems is said to be *fossilized*.
Without ever smaller problems, the base case is never reached
and the function recurs^{48}
forever.
This condition is known as an
*infinite recursive loop*. Here is an example:

int countList(Node *items) { if (n == 0) return 0; else return 1 + countList(items); //should be tail(items) }

Since *countList* recurs on the same list it was given,
*items*
never gets smaller so it never becomes empty.
Fossilizing the problem is a common error made by both novice
and expert programmers alike.

Related to the *fossilized* pattern is the *bottomless* pattern.
With the *bottomless* pattern, the problem gets smaller, but the base
case is never reached. Here is a function that attempts to
divide a positive number by two, by seeing how many times
you can subtract two from the number:^{49}

int div2(int n) { if (n == 0) return 0; else return 1 + div2(n - 2); }

Things work great for a while:

//test (compile with ilist.c and recursionPatterns.c) #include "ilist.h" #include "recursionPatterns.h" printf("div2(16) is %d\n",div2(16));

yields:

div2(16) is 8

and:

printf("div2(134) is %d\n",div2(134));

yields:

div2(134) is 67

But then, something goes terribly wrong:

printf("div2(7) is %d\n",div2(7));

results in the program crashing:

Segmentation fault (core dumped)

What happened? To see, let's *visualize* our function,
as we did with the *gcd* function previously,
by adding a *print* statement:

int div2(int n) { printf("div2(%d)...\n",n); if (n == 0) return 0; else return 1 + div2(n - 2); }

Now every time the function is called, both originally and
recursively, we can see how the value of *n* is changing:

div2(7)... div2(5)... div2(3)... div2(1)... div2(-1)... div2(-3)... ... Segmentation fault (core dumped)

Now we can see why things went wrong, the value of *n*
skipped over the value of zero and just kept on going.
The solution is to change the base case to catch odd
(and even) numbers:

int div2(int n) { if (n <= 1) return 0; else return 1 + div2(n - 2); }

Remember, when you see a recursion depth exceeded error, you likely have implemented either the fossilized or the bottomless pattern.

Recursion Patterns
| Contents |