t3x.org / sketchy / vol1 / sl03.html

Sketchy LISP

  By Nils M Holm, 2006,2007,2008
Buy a copy at Lulu.com

An Introduction to Functional Programming in Scheme

1.3 Loops in Functional Programs

Typical Scheme procedures have much in common with math functions. Their sole purpose is to map a number of values (its arguments) to another value, the function result. For example the math function

f(x) := x2

would look like this in Scheme:

(define (f x) (* x x))

But of course you would give it a more descriptive name, like square. Even the fact procedure given in the previous section is modelled after a math function:

0! := 1
n! := n × (n-1)!

But instead of declaring multiple instances of the function that handle different argument values, like in the math definition above, cond decides which path to take:

(define (fact n)
  (cond
    ; (fact 0) := 1
    ((zero? n) 1)
    ; (fact n) := (* n (fact (- n 1)))
    (#t (* n (fact (- n 1))))))

You probably have noticed that the definition of fact is self-referential. It applies itself in order to compute its own result. This is by no means contradictory. It is a well-known principle known as recursion. A recursive procedure has (at least) two cases: the trivial case and the general case (also called the recursive case). Above definition could be pronounced as

The factorial of N is
-- 1, if N equals 0
-- N times the factorial of N-1, otherwise.

The condition if N equals 0 is very specific, while the condition otherwise is rather general. Also, the first case returns the value 1 which is rather trivial, while the value of the second case is computed recursively and hence not that trivial.

As it happens, the trivial case of a recursive procedure is exactly what you would specify as an exit condition in a loop of an imperative program, as the following C program illustrates:

/* Imperative fact() in C */
int ifact(int n) {
        int r = 1;
        while (!(n == 0)) {
                r = r * n;
                n = n-1;
        }
        return r;
}

The loop in this program reads: while the exit condition (n==0) is not true, repeat the general case. However, this program does not tell you much about the nature of the fact procedure. All it does is to shuffle some state until an exit condition is met and then return some of that state. Reducing ifact(3) on a sheet of paper would be a tedious exercise.

What is more interesting, though, is that iteration and recursion seem to be equivalent to some degree. In fact, this equivalence is strong enough that Scheme expresses any repetition using recursion. This approach has some advantages:

An issue seems to be, though, that recursion looks inefficient in some situations. If you are used to imperative languages, you may have heard that recursion is bad because it slows down programs and may even break things (for example by overflowing the runtime stack). Relax. In Scheme this is not the case.

Both the recursive C program and the recursive Scheme program call fact for the same number of times. (To call a procedure is a more technical term for applying a function.) Each call causes the space consumed by the program to grow by some constant amount. The total amount is this constant times the size of the argument: (fact 5) adds five levels of recursion and (fact 100) adds one hundred levels. The ifact function, on the other hand, does not add anything. It does so by cheating. It adds some state, namely the variable r, to hold the intermediate result of the computation. Scheme would be a boring language, if it would not support the same sort of cheat:

(define (fact2 n r)
  (cond ((zero? n) r)
        (#t (fact2 (- n 1) (* n r)))))

The fact2 procedure still recurses, but its space remains constant while it evaluates:

(fact2 5 1)
-> (fact2 4 5)
-> (fact2 3 20)
-> (fact2 2 60)
-> (fact2 1 120)
-> (fact2 0 120)
=> 120

What looks fine on a sheet of paper also works on a real CPU. The principle that makes this possible is called tail call optimization. In fact2 the body of the general case is

(fact2 (- n 1) (* n r))

When fact2 returns from the recursive call, the only thing that is left to do for the calling procedure is to pass control back to its own caller. Scheme recognizes this and transforms the call to fact2 into a jump. The flow of control never returns to the caller (which would be fact2 itself), but directly to the caller of the caller.

A program that does not do anything after calling itself is called a tail recursive program. Fact2 is a tail-recursive program, while the original fact program is not, because it applies * to the value returned by the recursive call:

 (* n (fact (- n 1)))
; ^^^--- this is done after the return of fact

Tail call optimization transforms tail-recursive programs into programs doing ordinary loops. It combines the expressiveness of recursion with the advantages of iteration. If you do not believe this, try the following small program which implements an indefinite loop by omitting the trivial case:

(define (o) (o))

Once started (o) will recurse forever, but its space will remain constant. This is why tail-recursive Scheme programs are said to evaluate in constant space.

All linear-recursive programs (like fact) can be transformed into tail-recursive programs using an additional argument. Note that this transformation does not turn the program into a total mess like the transformation of the C program. The Scheme program still is declarative:

(fact2 0 r) := r
(fact2 n r) := (fact2 (- n 1) (* r n))

The only thing that is a bit ugly is that fact2 takes two arguments, where the second one always must be 1. This fact can be easily covered, though, by wrapping a new definition of fact around fact2:

(define (fact n) (fact2 n 1))

If you do not like to write two procedures where only one is required, letrec is your friend:

(define (fact n)
  (letrec
    ((fact2
       (lambda (n r)
         (cond ((zero? n) r)
               (#t (fact2 (- n 1) (* n r)))))))
    (fact2 n 1)))

Letrec embeds local definitions into a procedure. In the above example, it defines fact2 locally inside of fact. Do not worry too much about letrec and lambda for now. They will be explained soon.

By the way: while ifact(100) is a rather hypothetical construct in C and will deliver anything but the result of 100!, this works fine in Scheme:

(fact 100)
=> 93326215443944152681699238856266700490715968264381621468
59296389521759999322991560894146397615651828625369792082722
3758251185210916864000000000000000000000000

1.3.1 Cond Revisited

While cond may give the impression that it is used to write programs in a declarative way, there is one important difference: the order of clauses does matter, while the order of declarations does not. For example, the declarations

0! := 1
n! := n × (n-1)!

are equivalent to

n! := n × (n-1)!
0! := 1

because you would pick the best match before re-writing an application of the factorial function. Even if the case n! also covers 0!, you would probably prefer the more special case, and this would be the right thing to do. Cond is not that smart. It simply tests its clauses in sequence and picks the first one with a true predicate. Therefore the following program will not compute n!:

; A broken version of fact
(define (broken-fact n)
  (cond (#t (* n (broken-fact (- n 1))))
        ; The following part is unreachable!
        ((zero? n) 1)))

A cure for this problem is to always specify clauses of cond in order of ascending generality. The most general clause, usually containing a #t predicate, must always be the last clause. Remember: clauses after a true predicate are unreachable. Ordering the clauses of more complex applications of cond requires some practice, but not more than formulating complex exit conditions in imperative languages.

The keyword else, which is part of the cond syntax, helps to remember that the catch-all clause should be last:

(define (fact n)
  (cond ((zero? n) 1)
        (else (* n (fact (- n 1))))))

Whenever else appears in the last clause of cond, it is equivalent to #t. Its value is undefined in other clauses:

(cond (else 1)
      (else 2)) => bottom

Another interesting fact about cond is that it is non-strict. If it was strict, it could not work. Remember the simple procedure o which never terminates:

(define (o) (o))

Once applied, o keeps evaluating for ever (or until your computer breaks or until you get bored and abort program execution, whatever comes first). Because there is no way to obtain the value of (o), it is undefined, which is the same as bottom:

(o) => bottom

The following expression evaluates to a value, though:

(cond (#f (o)) (#t 1)) => 1

Earlier in this book, the equivalence between cond and if was illustrated. Given this equivalence, the above expression could be re-written as

if #f
   then evaluate to (o)
   else evaluate to 1

Scheme actually does have an if function, which looks like this:

(if predicate consequent alternative)

It is just another way of saying:

(cond (predicate consequent) (#t alternative))

If if was a strict function,

(if #f (o) 1) => bottom

because

(o) => bottom

So if cannot be strict. The syntax of cond hides the fact that (o) is an argument of cond, but the equivalence

(cond (p e1) (#t e2))  =  (if p e1 e2)

demonstrates that cond cannot be strict, either.

The practical meaning of this property of cond (and if) is that it is a prerequisite for recursive functions. The procedure down, which counts down to 0, could not be implemented using a hypothetical strict if function:

(define (down x)
  (if (zero? x)
      0
      (down (- x 1))))

Of course, if does not really evaluate the general case to bottom and then ignores it, but it evaluates the general case if, and only if, the predicate did not hold.