Contents |

In
the chapter on conditionals,
we learned about *if* statements.
When we combine *if* statements with functions that call
themselves, we obtain a powerful programming paradigm
called *recursion*.

Recursion is a form of looping; when we loop,
we evaluate code over and over again. We use a conditional
to decide whether to continue the loop or to stop.
Recursive loops are often
easier to write and understand, as compared to the iterative loops
such as *while*s and *for*s, which you learned
about in a previous chapter.
In some programming
languages, iterative loops are preferred as they use much
less computer memory and are slightly faster as compared to recursive loops.
In other languages, this is not the case at all.
In general, there
is no reason to prefer iterative loops over recursive ones,
other than this memory issue and slight loss of performance.
Any iterative loop can be written as a recursion and any recursion
can be written as an iterative loop.
Use a recursion if that makes the implementation
more clear, otherwise, use an iterative loop.

Many mathematical functions are easy to implement recursively, so
we will start there. Recall that the factorial of a number
*n* is:

[Equation 1] *n*! = *n* * (*n* - 1) * (*n* - 2) * ... * 2 * 1

Consider writing a function which computes the factorial of a positive integer. For example, if the function were passed the value of 4, it should return the value of 24 since 4! is 4*3*2*1 or 24. To apply recursion to solve this problem or any problem, for that matter, it must be possible to state the solution of a problem so that it references a simpler version of the problem. For factorial, the factorial of a number can be stated in terms of a simpler factorial. Consider the two equations below:

[Equation 2] *n*! = 1 if *n* == 0

[Equation 3] *n*! = *n* * (*n* - 1)! if *n* > 0

Equation
2
says that the factorial
of zero is one^{38}.
Equation
3
states that the factorial
of any other (positive) number is obtained by
multiplying the number by the factorial of one less than that
number. Together, these two equations form a
*recurrence equation* that describes the computation
of factorial.

After some study, you should be able to see that this
new way of describing factorial is equivalent to
Equation
1,
the one that that used
ellipses^{39}.
Equation
1,
gets the basic idea of factorial
across but is not very precise. For example, how would you compute
the factorial of three using Equation
1?

The second form with the two equations is particularly well suited for implementation as a function in a computer program:

def factorial(n): if (n == 0): return 1 else: return n * factorial(n - 1)

Note how the *factorial* function
precisely implements the recurrence equation.
Convince yourself that the function really works by tracing the function call:

factorial(3)

Decomposing the call, we find that:

factorial(3) is 3 * factorial(2)

since *n*, having a value of 3, is not equal to 0.
Thus the second block of the *if* is evaluated and
we can replace *n* with 3 and *n-1* with 2.
Likewise, we can replace `factorial(2)`

by `2 * factorial(1)`

, yielding:

factorial(3) is 3 * 2 * factorial(1)

since *n*, now having a value of 2, is still not zero.
Continuing along this vein, we can replace
`factorial(1)`

by `1 * factorial(0)`

, yielding:

factorial(3) is 3 * 2 * 1 * factorial(0)

Now in this last call to factorial,
*n* does have a value of zero, so we can replace
`factorial(0)`

with its immediate return value of one:

factorial(3) is 3 * 2 * 1 * 1

Thus, `factorial(3)`

has a value of six:

>>> factorial(3) 6

as expected.

In contrast, here is a function which computes factorial using
a *for* loop:

def factorial(n): total = 1; for i in range(1,n+1): total = total * i return total

We can see from this version that factorial is an accumulation. This
version
also more closely follows
Equation
1.
Note that we have to extend the upper end of the range by one in order to
get *n* included in the accumulation,
since the upper limit of the *range* function is exclusive.

Recursive approaches rely on the fact that it is
usually simpler to solve a smaller problem than a
larger one. In the factorial problem, trying to
find the factorial of *n - 1* is a little bit simpler^{40}
than finding the factorial of *n*.
If finding the
factorial of *n - 1* is still too hard to solve easily,
then find the factorial of *n - 2* and so on until
we find a case where the solution is dead easy.
With regards to factorial, this is when *n* is equal to
zero. The *easy-to-solve* code (and the values that get
you there) is known as the *base* case. The
*find-the-solution-using-a-simpler-problem* code (and the values
that get you there) is known as the *recursive* case.
The recursive case usually contains a call to the very
function being executed. This call is known as a
*recursive* call.

Most well-formed recursive functions are composed of
at least one *base* case and at least one *recursive* case.

Consider finding the greatest common divisor, the *gcd*, of two numbers.
For example, the *gcd* of 30 and 70 is 10, since 10 is the largest
number that divides both 30 and 70 evenly.
The ancient Greek philosopher Euclid devised a solution for
this problem that involves
repeated division.
The first division divides the two
numbers in question, saving the remainder. Now the divisor becomes
the dividend and the
remainder becomes the divisor. This process is repeated until
the remainder becomes zero.
At that
point, the current divisor is the *gcd*.
We can specify this as a recurrence equation, with this last bit about
the remainder becoming zero as our base case:

gcd(a,b) | is | b | if a divided by b has a remainder of zero |

gcd(a,b) | is | gcd(b,a % b) | otherwise |

In this recurrence equation,
*a* and *b* are the dividend and the divisor, respectively.
Recall that the modulus operator `%`

returns the remainder.
Using the recurrence equation as a guide, we can easily implement
a function for computing the *gcd* of two numbers.

def gcd(dividend,divisor): if (dividend % divisor == 0): return divisor else: return gcd(divisor,dividend % divisor)

Note that in our implementation of *gcd*, we used more descriptive
variables than *a* and *b*.
We can improve this function further, by noting that
the remainder is computed twice, once for the
base case and once again to make the recursive call.
Rather than compute it twice, we compute it straight off
and save it in an aptly named variable:

def gcd(dividend,divisor): remainder = dividend % divisor if (remainder == 0): return divisor else: return gcd(divisor,remainder)

Look at how the recursive version turns the *divisor* into
the *dividend*
by passing *divisor* as the first argument in the recursive
call.
By the same token, *remainder* becomes *divisor* by nature
of being the second argument in the recursive call.
To convince one's self that the routine really works,
one can modify *gcd* to "visualize" the arguments. On simple
way of visualizing the action of a function is to
add a print statement:

def gcd(dividend,divisor): remainder = dividend % divisor print("gcd:",dividend,divisor,remainder) if (remainder == 0): return divisor else: return gcd(divisor,remainder)

After doing so, we get the following output:

>>> gcd(66,42) gcd: 66 42 24 gcd: 42 24 18 gcd: 24 18 6 gcd: 18 6 0 6

Note, how the first remainder, 24, keeps shifting
to the left.
In the first recursive call, the remainder becomes
*divisor*, so the 24 shifts one spot to the left. On
the second recursive call, the current *divisor*,
which is 24,
becomes the *dividend*,
so the 24 shifts once again to
the left.

We can also write a iterative version of *gcd*:

def gcd(dividend,divisor): while (divisor != 0): print(divisor,dividend) temp = dividend % divisor dividend = divisor divisor = temp return dividend

While the iterative version of factorial was only slightly more
complicated than the recursive version, with *gcd*, we start to
see more of an advantage using the recursive formulation. For instance,
where did the *temp* variable come from and why is it necessary^{41}?

A third example of recursion is the computation of the
*n ^{th}* Fibonacci number.
The Fibonacci series looks like this:

n : 0 1 2 3 4 5 6 7 8 ... Fibonacci(n) : 0 1 1 2 3 5 8 13 21 ...

and is found in nature again and again^{42}.
From this table, we can see that the *7 ^{th}* Fibonacci number is
13.
In general, a Fibonacci number is equal to the sum of the previous
two Fibonacci numbers. The exceptions are the zeroth and the first
Fibonacci numbers which are equal to 0 and 1 respectively. Voila! The
recurrence case and the two base cases have jumped right out at us! Here,
then, is a recurrence equation which describes the computation of the

fib(n) | is | 0 | if n is zero |

fib(n) | is | 1 | if n is one |

fib(n) | is | fib(n - 1) + fib(n - 2) | otherwise |

Again, it is straightforward to convert the recurrence equation into a working function:

# compute the nth Fibonacci number # n must be non-negative! def fibonacci(n): if (n == 0): return 0 elif (n == 1): return 1 else: return fibonacci(n-1) + fibonacci(n-2)

Our implementation is straightforward and elegant. Unfortunately, it's
horribly inefficient in Python. Unlike our recursive version of
*factorial* which
recurred about as many times as the size of the number sent to the function,
our recursive version of Fibonacci
will recur many, many more times than the size of
its input. Here's why.

Consider the call to `fib(6)`

.
Tracing all the recursive calls to *fib*, we get:

fib(6) is fib(5) + fib(4)

Replacing `fib(5)`

with `fib(4) + fib(3)`

,
we get:

fib(6) is fib(4) + fib(3) + fib(4)

We can already see a problem, we will compute `fib(4)`

twice,
once from the original call to `fib(6)`

and again when we
try to find `fib(5)`

.
If we write down all the recursive calls generated by `fib(6)`

with each recursive call indented from the previous, we
get a structure that looks like this:

fib(6) fib(5) fib(4) fib(3) fib(2) fib(1) fib(0) fib(1) fib(2) fib(1) fib(0) fib(3) fib(2) fib(1) fib(0) fib(1) fib(4) fib(3) fib(2) fib(1) fib(0) fib(1) fib(2) fib(1) fib(0)

We would expect, based on how the Fibonacci sequence is
generated,
to take about six "steps" to calculate `fib(6)`

.
Instead,
ultimately there were 13 calls^{43}
to either
`fib(1)`

or `fib(0)`

.
There was a tremendous amount of duplicated
and, therefore, wasted effort. An important part
of Computer Science is how to reduce the wasted effort.
One way to keep the recursive nature without
the penalty of redundant computations
is to *cache* previously computed values.
Another way is to use an iterative loop:

def fib(n): previous = 0 current = 1 for i in range(0,n,1): temp = previous previous = current current = previous + temp return previous

Here, the recursive version, for all its faults, is much, much easier
to understand. Complications of the iterative version include, why
is *previous* returned and not *current*?
And where did the variable *temp*
come from?
In the
chapter that compares recursion to loops, we will
see a way to combine the clarity of recursion with the
efficieny of loops.

Recursion, arrays, and lists go hand-in-hand. What follows are a number of recursive patterns involving arrays and lists that you should be able to recognize and implement.

For the following discussion, assume the
*ahead* function returns the first item of the given array,
while the
*atail*
function returns an array composed of all the items
in the given array except for the first element.
If the array is empty, it will have a value of
`[]`

.

By the way, the *ahead* and *atail* functions are easily implemented
in Python:

def ahead(items): return items[0] def atail(items): return items[1:] #slicing, which copies!

Note that the *ahead* and *atail* functions are analogous to the
*ListHead* and *ListTail* functions of the previous chapter on lists.
Note also that we will use the term *collection* to stand in
for either a list or an array.

The *counting* pattern is used to count the number of items
in a collection. If the collection is empty, then its count of items
is zero.
The following function
counts and ultimately returns the number of items in an array:

def count(items): if (items == []): # base case return 0 else: return 1 + count(atail(items))

The functions works on the observation that if you count the number of items in the tail of a array, then the number of items in the entire array is one plus that number. The extra one accounts for the head item that was not counted when the tail was counted.

The list version is similar:

def count(items): if (items == None): # base case return 0 else: return 1 + count(tail(items))

The only code difference in the list version are the use of
`None`

instead of `[]`

to signify empty and
*tail* instead of *atail*. However,
the performance difference between the two versions can be huge!
For a list or an
array with *n* items, the list version will run *n* times faster.
So if your collection holds a million items, the array
version will work on the order of one million times slower.
This is due to how *atail* works versus how *tail* works,
as discussed in
the previous chapter.
For small numbers of items, however, the array version
will run quickly enough.

The *accumulate* pattern is used to combine all the values in
a collection.
The following function performs a summation of the list values:

def sum(items): if (items == None): #base case return 0 else: return head(items) + sum(tail(items))

Note that the only difference between the *count* function and
the *sum* function is the recursive case adds in the value of
the head item, rather than just counting the head item.
That the functions *count* and *sum* look similar
is no coincidence. In fact, most recursive functions, especially
those working on collections, look very
similar to one another.

A variation on the *counting* and *accumulate*
patterns involves *filtering*. When filtering,
we use an additional `if`

statement to decide whether
or not we should count the item, or in the case of accumulating,
whether or not the item ends up in the accumulation.

Suppose we wish to count the number of even items in a list:

def countEvens(items): if (items == None): #base case return 0 elif (head(items) % 2 == 0): return 1 + countEvens(tail(items)) else: return 0 + countEvens(tail(items))

The base case states that there are zero even numbers in an empty list. The first recursive case simply counts the head item if it is even and so adds 1 to the count of even items in the remainder of the list. The second recursive case does not count the head item (because it is not even) and so adds in a 0 to the count of the remaining items. Of course, the last return would almost always be written as:

return countEvens(tail(items))

As another example of *filtered counting*, we can
pass in a value and count how many times that
value occurs:

def occurrences(target,items): if (items == None): return 0 elif (head(items) == target): return 1 + occurrences(target,tail(items)) else: return occurrences(target,tail(items))

An example of a *filtered-accumulation* would be
to sum the even-numbered integers in a list:

def sumEvens(items): if (items == None): return 0 elif (isEven(head(items))): return head(items) + sumEvens(tail(items)) else: return sumEvens(tail(items))

where the *isEven* function is defined as:

def isEven(x): return x % 2 == 0

A special case of a filtered-accumulation is called *filter*.
Instead of summing the filtered items (for example), we collect
the filtered items into a collection. The new collection is said to be
a *reduction* of the original collection.

Suppose we wish to extract the even numbers from a list. The
structure of the
code looks very much like the *sumEvens* function in the previous
section, but instead of adding in the desired item,
we join the item to the
reduction of the tail of the array:

def extractEvens(items): if (items == None): return None elif (isEven(head(items))): return join(head(items),extractEvens(tail(items))) else: return extractEvens(tail(items))

Given a list of integers, *extractEvens* returns a (possibly empty)
list of the even numbers:

>>> a = ListCreate(4,2,5,2,7,0,8,3,7) >>> extractEvens(a) [4, [2, [2, [0, [8, None]]]]] >>> b = ListCreate(1,3,5,7,9) >>> extractEvens(b) []

We see again that our list representation depends on arrays.

*Mapping* is a task closely coupled with that
of reduction, but rather than collecting
certain items, as with the *filter* pattern, we
collect all the items. As we collect, however,
we transform each item as
we collect it. The basic pattern looks like this:

def map(f,items): if (items == None): return None else: return join(f(head(items)),map(f,tail(items)))

Here, function *f* is used to transform each item in the
recursive step.

Suppose we wish to subtract one from each element in an array. First we need a transforming function that reduces its argument by one:

def decrement(x): return x - 1

Now we can "map" the *decrement* function over an array of numbers:

>>> a = ListCreate(4,3,7,2,4,3,1) >>> map(decrement,a) [3, [2, [6, [1, [3, [2, [0, None]]]]]]]

Both map and filtering (reduce) are used heavily by Google in programming their search strategies.

The *search* pattern is a slight variation of *filtered-counting*.
Suppose we wish to see if a value is present in a list. We can
use a filtered-counting approach and if the count is greater than
zero, we know that the item was indeed in the list.

def find(target,items): return occurrences(target,items) > 0

In this case, *occurrences* helps *find* do its job. We call
such functions, naturally, *helper functions*.
We can improve the efficiency of *find* by having it
perform the search, but short-circuiting the
search once the target item is found. We do this
by turning the first recursive case into a second base case:

def find(target,items): if (items == None): return False elif (head(items) == target): # short-circuit! return True else: return find(target,tail(items))

When the list is empty, we return false because if the item had been list, we would have hit the second base case (and returned true) before hitting the first. If neither base case hits, we simple search the remainder of the list (the recursive case). If the second base case never hits, the first base case eventually will.

Sometimes, we wish to combine two lists. This is easy to do with the append operator:

append(list1,list2)

This places the first element in the second list after the last
element in the first list.
However, many times we wish to intersperse the elements from the
first list with the elements in the second list. This is known
as a *shuffle*, so named since it is similar to shuffling a deck of
cards. When a deck of cards is shuffled,
the deck is divided in two halves (one half is akin to the first
list and the other half is akin to the second list). Next the
two halves are interleaved back into a single deck (akin to
the resulting third list).
Note that while appending is destructive, in that it changes
the first list, shuffling is non-destructive, in the neither
of the shuffled lists are modified.

We can use recursion to shuffle two lists. If both lists are exactly
the same length, the recursive function is easy to implement
using the *accumulate* pattern:

def shuffle(list1,list2): if (list1 == None): return None else: rest = shuffle(tail(list1),tail(list2)) return join(head(list1),join(head(list2),rest))

If *list1* is empty
(which means *list2* is empty since they have the same number of elements),
the function returns the empty, since shuffling nothing together
yields nothing.
Otherwise, we shuffle the tails of the lists (the result is stored
in the temporary variable *rest*), then join the
the first elements of each list to the shuffled remainders.

If you have ever shuffled a deck of cards, you will know that it is
rare for the deck to be split exactly in half prior to the shuffle.
Can we amend our shuffle function to deal with this problem? Indeed,
we can,
by simply placing the extra cards (list items) at the end
of the shuffle. We don't know which list (*list1* or *list2*)
will go empty first,
so we test for each list becoming empty in turn:

def shuffle2(list1,list2): if (list1 == None): return list2 elif (list2 == None): return list1 else: rest = shuffle2(tail(list1),tail(list2)) return join(head(list1),join(head(list2),rest))

If either list is empty, we return the other. Only if both are not empty do we execute the recursive case.

One can make shuffling even simpler by manipulating the recursive call that shuffles the remainder of the lists. If we flip the order of the lists in the recursive call, we only need to deal with the first list becoming empty:

def shuffle3(list1,list2): if (list1 == None): return list2 else: return join(head(list1),shuffle3(list2,tail(list1)))

Note that in the recursive call, we take the tail of *list1* since
we joined the head of *list1* to the resulting shuffle.
The base case returns *list2* because if *list1* is empty,
the "shuffle" of
the remaining elements is just *list2*.
Also, even if *list1* and *list2* are both empty,
the base case test
returns the correct value, `None`

.

Finally, note how much simpler it is to shuffle two lists using recursion as compared to shuffling two arrays with loops.

With the *shuffle* pattern, we always took the head elements from both
lists at each step in the shuffling process.
Sometimes, we wish to place a constraint of the choice of elements.
For example, suppose the two lists to be combined are sorted
and we wish the resulting list to be sorted as well. The
following example shows that shuffling does not always work:

>>> a = ListCreate(1,4,6,7,8,11) >>> b = ListCreate(2,3,5,9) >>> c = shuffle3(a,b) [1, [2, [4, [3, [6, [5, [7, [9, [8, [11, None]]]]]]]]]]

The *merge* pattern is used to ensure the resulting list is
sorted and is based upon the *filtered-accumulate* pattern.
We only accumulate an item *if* it is the smallest item
in the two lists:

def merge(list1,list2): if (list1 == None): return list2 elif (list2 == None): return list1 elif (head(list1) < head(list2)): return join(head(list1),merge(tail(list1),list2)) else: return join(head(list2),merge(list1,tail(list2)))

As with *shuffle2*, we don't know which list will become empty
first, so we check both in turn.

In the first recursive case, the first element of the first list is smaller than the first element of the second list. So we accumulate the first element of the first list and recur, sending the tail of the first list because we have used/accumulated the head of that list. The second list we pass unmodified, since we did not use/accumulate an element from the second list.

In the second recursive case, we implement the symmetric version of the first recursive case, focusing on the second list rather than the first.

The *merge* function in the previous section hard-wired the
comparison operator to `<`

. Many times, the elements
of the lists to be merged cannot be compared with `<`

or, if
they can, a different operator, such as `>`

, might be desired.
The generic merge solves this problem by allowing the caller to
pass in a comparison function as a third argument:

def genericMerge(list1,list2,pred):

where *pred* is the formal parameter that
holds a predicate function^{44}.
Now we replace the `<`

in merge with a call to *pred*.

def genericMerge(list1,list2,pred): if (list1 == None): return list2 elif (list2 == None): return list1 elif (pred(head(list1),head(list2))): return join(head(list1),genericMerge(tail(list1),list2,pred)) else: return join(head(list2),genericMerge(list1,tail(list2),pred))

The *pred* function, which is passed the two head elements, returns
`True`

, if the first element should be accumulated,
and `False`

, otherwise.

We can still use *genericMerge* to merge two sorted lists of numbers
(which can be compared with <) by using the *operator* module. The
*operator* module provides function forms of the
operators `+`

, `-`

,
`<`

, and so on.

>>> import operator >>> a = ListCreate(1,4,6,7,8,11) >>> b = ListCreate(2,3,5,9) >>> c = genericMerge(a,b,operator.lt) [1, [2, [3, [4, [5, [6, [7, [8, [9, [11, None]]]]]]]]]]

The *genericMerge* function is a *generalization* of *merge*.
When we generalize a function, we modify it so it can do what it
did before plus new things that it could not. Here, we can still
have it (genericMerge) do what it (merge) did before, by passing
in the less than comparison operator. We can use it to merge two
lists sorted in increasing order by passing in the greater than
comparison operator.

If a recursive function mistakenly never makes the problem
smaller, the problems is said to be *fossilized*.
Without ever smaller problems, the base case is never reached
and the function recurs^{45}
forever.
This condition is known as an
*infinite recursive loop*. Here is an example:

def factorial(n): if (n == 0): return 1 else: return n * factorial(n)

Since factorial is solving the same problem over and over, *n*
never gets smaller so it never reaches zero.
Fossilizing the problem is a common error made by both novice
and expert programmers alike.

Related to the *fossilized* pattern is the *bottomless* pattern.
With the *bottomless* pattern, the problem gets smaller, but the base
case is never reached. Here is a function that attempts to
divide a positive number by two, by seeing how many times
you can subtract two from the number:^{46}

def div2(n): if (n == 0): return 0 else: return 1 + div2(n - 2)

Things work great for a while:

>>> div2(16) 8 >>> div2(6) 3 >>> div2(134) 67

But then, something goes terribly wrong:

>>> div2(7) RuntimeError: maximum recursion depth exceeded in cmp

What happened? To see, let's *visualize* our function,
as we did with the *gcd* function previously,
by adding a *print* statement:

def div2(n): print("div2: n is",n) if (n == 0): return 0 else: return 1 + div2(n - 2)

Now every time the function is called, both originally and
recursively, we can see how the value of *n* is changing:

>>div2(7) div2: n is 7 div2: n is 5 div2: n is 3 div2: n is 1 div2: n is -1 div2: n is -3 div2: n is -5 div2: n is -7 ... RuntimeError: maximum recursion depth exceeded in cmp

Now we can see why things went wrong, the value of *n*
skipped over the value of zero and just kept on going.
The solution is to change the base case to catch odd
(and even) numbers:

def div2(n): if (n < 2): return 0 else: return 1 + div2(n - 2)

Remember, when you see a recursion depth exceeded error, you likely have implemented either the fossilized or the bottomless pattern.

For Linux and Mac users, the code from this chapter can be found here.

For Windows users, look here
(you will need to save the file as *recursion.py*).

Contents |