Recursion
| Contents |

You can download the functions defined or used in this chapter with the following commands:

wget troll.cs.ua.edu/ACP-C/recursion.c wget troll.cs.ua.edu/ACP-C/recursion.h

These files will help you run the test code listed in this chapter.

In
the chapter on conditionals,
we learned about *if* statements.
When we combine *if* statements with functions that call
themselves, we obtain a powerful programming paradigm
called *recursion*.

Recursion is a form of looping; when we loop,
we evaluate code over and over again. We use a conditional
to decide whether to continue the loop or to stop.
Recursive loops are often
easier to write and understand, as compared to the iterative loops
such as *while*s and *for*s, which you learned
about in a previous chapter.
In some programming
languages, iterative loops are preferred as they use much
less computer memory and are slightly faster as compared to recursive loops.
In other languages, this is not the case at all.
In general, there
is no reason to prefer iterative loops over recursive ones,
other than this memory issue (for some languages)
and slight loss of performance.
Any iterative loop can be written as a recursion and any recursion
can be written as an iterative loop.
Use a recursion if that makes the implementation
more clear, otherwise, use an iterative loop.

Many mathematical functions are easy to implement recursively, so
we will start there. Recall that the factorial of a number
*n* is:

[Equation 1] *n*! = *n* * (*n* - 1) * (*n* - 2) * ... * 2 * 1

Consider writing a function which computes the factorial of a positive integer. For example, if the function were passed the value of 4, it should return the value of 24 since 4! is 4*3*2*1 or 24. To apply recursion to solve this problem or any problem, for that matter, it must be possible to state the solution of a problem so that it references a simpler version of the problem. For factorial, the factorial of a number can be stated in terms of a simpler factorial. Consider the two equations below:

[Equation 2] *n*! = 1 if *n* == 0

[Equation 3] *n*! = *n* * (*n* - 1)! if *n* > 0

Equation
2
says that the factorial
of zero is one^{42}.
Equation
3
states that the factorial
of any other (positive) number is obtained by
multiplying the number by the factorial of one less than that
number. Together, these two equations form a
*recurrence equation* that describes the computation
of factorial.

After some study, you should be able to see that this
new way of describing factorial is equivalent to
Equation
1,
the one that that used
ellipses^{43}.
Equation
1,
gets the basic idea of factorial
across but is not very precise. For example, how would you compute
the factorial of three using Equation
1?

The second form with the two equations is particularly well suited for implementation as a function in a computer program:

int factorial(int n) { if (n == 0) return 1; else return n * factorial(n - 1); }

Note how the *factorial* function
precisely implements the recurrence equation.
Convince yourself that the function really works by tracing the function call:

factorial(3)

Decomposing the call, we find that:

factorial(3) is 3 * factorial(2)

since *n*, having a value of 3, is not equal to 0.
Thus the second block of the *if* is evaluated and
we can replace *n* with 3 and *n-1* with 2.
Likewise, we can replace `factorial(2)`

by `2 * factorial(1)`

, yielding:

factorial(3) is 3 * 2 * factorial(1)

since *n*, now having a value of 2, is still not zero.
Continuing along this vein, we can replace
`factorial(1)`

by `1 * factorial(0)`

, yielding:

factorial(3) is 3 * 2 * 1 * factorial(0)

Now in this last call to factorial,
*n* does have a value of zero, so we can replace
`factorial(0)`

with its immediate return value of one:

factorial(3) is 3 * 2 * 1 * 1

Thus, `factorial(3)`

has a value of six. The program:

//test (compile with recursion.c) #include "recursion.h" printf("factorial(3) is %d\n",factorial(3));

yields:

factorial(3) is 6

as expected.

In contrast, here is a function which computes factorial using
a *for* loop:

int factorial2(int n) { int i; int total = 1; for (i = 1; i < n + 1; ++i) total = total * i; return total; }

We can see from this version that factorial is an accumulation. This
version
also more closely follows
Equation
1.
Note that we have to extend the upper end of the range by one in order to
get *n* included in the accumulation.
By habit,
we like the upper limit of *for* loops always to be exclusive.

Recursive approaches rely on the fact that it is
usually simpler to solve a smaller problem than a
larger one. In the factorial problem, trying to
find the factorial of *n - 1* is a little bit simpler^{44}
than finding the factorial of *n*.
If finding the
factorial of *n - 1* is still too hard to solve easily,
then find the factorial of *n - 2* and so on until
we find a case where the solution is dead easy.
With regards to factorial, this is when *n* is equal to
zero. The *easy-to-solve* code (and the values that get
you there) is known as the *base* case. The
*find-the-solution-using-a-simpler-problem* code (and the values
that get you there) is known as the *recursive* case.
The recursive case usually contains a call to the very
function being executed. This call is known as a
*recursive* call.

Most well-formed recursive functions are composed of
at least one *base* case and at least one *recursive* case.

Consider finding the greatest common divisor, the *gcd*, of two numbers.
For example, the *gcd* of 30 and 70 is 10, since 10 is the largest
number that divides both 30 and 70 evenly.
The ancient Greek philosopher Euclid devised a solution for
this problem that involves
repeated division.
The first division divides the two
numbers in question, saving the remainder. Now the divisor becomes
the dividend and the
remainder becomes the divisor. This process is repeated until
the remainder becomes zero.
At that
point, the current divisor is the *gcd*.
We can specify this as a recurrence equation, with this last bit about
the remainder becoming zero as our base case:

gcd(a,b) | is | b | if a divided by b has a remainder of zero |

gcd(a,b) | is | gcd(b,a % b) | otherwise |

In this recurrence equation,
*a* and *b* are the dividend and the divisor, respectively.
Recall that the modulus operator `%`

returns the remainder.
Using the recurrence equation as a guide, we can easily implement
a function for computing the *gcd* of two numbers.

int gcd(int dividend,int divisor) { if (dividend % divisor == 0) return divisor; else return gcd(divisor,dividend % divisor); }

Note that in our implementation of *gcd*, we used more descriptive
variables than *a* and *b*.
We can improve this function further, by noting that
the remainder is computed twice, once for the
base case and once again to make the recursive call.
Rather than compute it twice, we compute it straight off
and save it in an aptly named variable:

int gcd2(int dividend,int divisor) { int remainder = dividend % divisor; if (remainder == 0) return divisor; else return gcd2(divisor,remainder); }

Look at how the recursive version turns the *divisor* into
the *dividend*
by passing *divisor* as the first argument in the recursive
call.
By the same token, *remainder* becomes *divisor* by nature
of being the second argument in the recursive call.
To convince one's self that the routine really works,
one can modify *gcd* to "visualize" the arguments. On simple
way of visualizing the action of a function is to
add a print statement:

int gcd2(int dividend,int divisor) { int remainder = dividend % divisor; printf("gcd: %2d, %2d, %2d\n",dividend,divisor,remainder); if (remainder == 0) return divisor; else return gcd2(divisor,remainder); }

With the instrumented definition and this code:

//test (compile with recursion.c) #include "recursion.h" printf("%d\n",gcd2(66,42));

we get the following output:

gcd: 66, 42, 24 gcd: 42, 24, 18 gcd: 24, 18, 6 gcd: 18, 6, 0 6

Note, how the first remainder, 24, keeps shifting
to the left.
In the first recursive call, the remainder becomes
*divisor*, so the 24 shifts one spot to the left. On
the second recursive call, the current *divisor*,
which is 24,
becomes the *dividend*,
so the 24 shifts once again to
the left.

We can also write a iterative (and instrumented) version of *gcd*:

int gcd3(int dividend,int divisor) { while (divisor != 0) { int temp = dividend % divisor; printf("gcd: %2d, %2d\n",divisor,dividend); dividend = divisor; divisor = temp; } return dividend; }

While the iterative version of factorial was only slightly more
complicated than the recursive version, with *gcd*, we start to
see more of an advantage using the recursive formulation. For instance,
where did the *temp* variable come from and why is it necessary^{45}?

A third example of recursion is the computation of the
*n ^{th}* Fibonacci number.
The Fibonacci series looks like this:

n : 0 1 2 3 4 5 6 7 8 ... Fibonacci(n) : 0 1 1 2 3 5 8 13 21 ...

and is found in nature again and again^{46}.
From this table, we can see that the *7 ^{th}* Fibonacci number is
13.
In general, a Fibonacci number is equal to the sum of the previous
two Fibonacci numbers. The exceptions are the zeroth and the first
Fibonacci numbers which are equal to 0 and 1 respectively. Voila! The
recurrence case and the two base cases have jumped right out at us! Here,
then, is a recurrence equation which describes the computation of the

fib(n) | is | 0 | if n is zero |

fib(n) | is | 1 | if n is one |

fib(n) | is | fib(n - 1) + fib(n - 2) | otherwise |

Again, it is straightforward to convert the recurrence equation into a working function:

//compute the nth Fibonacci number //n must be non-negative! int fibonacci(int n) { if (n == 0) return 0; else if (n == 1) return 1; else return fibonacci(n-1) + fibonacci(n-2); }

Our implementation is straightforward and elegant. Unfortunately, it's
horribly inefficient in C. Unlike our recursive version of
*factorial* which
recurred about as many times as the size of the number sent to the function,
our recursive version of Fibonacci
will recur many, many more times than the size of
its input. Here's why.

Consider the call to `fib(6)`

.
Tracing all the recursive calls to *fib*, we get:

fib(6) is fib(5) + fib(4)

Replacing `fib(5)`

with `fib(4) + fib(3)`

,
we get:

fib(6) is fib(4) + fib(3) + fib(4)

We can already see a problem, we will compute `fib(4)`

twice,
once from the original call to `fib(6)`

and again when we
try to find `fib(5)`

.
If we write down all the recursive calls generated by `fib(6)`

with each recursive call indented from the previous, we
get a structure that looks like this:

fib(6) fib(5) fib(4) fib(3) fib(2) fib(1) fib(0) fib(1) fib(2) fib(1) fib(0) fib(3) fib(2) fib(1) fib(0) fib(1) fib(4) fib(3) fib(2) fib(1) fib(0) fib(1) fib(2) fib(1) fib(0)

We would expect, based on how the Fibonacci sequence is
generated,
to take about six "steps" to calculate `fib(6)`

.
Instead,
ultimately there were 13 calls^{47}
to either
`fib(1)`

or `fib(0)`

.
There was a tremendous amount of duplicated
and, therefore, wasted effort. An important part
of Computer Science is figuring out how to reduce the wasted effort.
One way to keep the recursive nature without
the penalty of redundant computations
is to *cache* previously computed values.
Another way is to use an iterative loop:

int fib(int n) { int i; int previous = 0; int current = 1; for (i = 0; i < n; ++i) { int temp = previous; previous = current; current = previous + temp; } return previous; }

Here, the recursive version, for all its faults, is much, much easier
to understand. Complications of the iterative version include, why
is *previous* returned and not *current*?
And where did the variable *temp*
come from?
In the
chapter that compares recursion to loops, we will
see a way to combine the clarity of recursion with the
efficieny of loops.

Recursion
| Contents |