t3x.org / sketchy / vol1 / sl19.html

Sketchy LISP

  By Nils M Holm, 2006,2007,2008
Buy a copy at Lulu.com

An Introduction to Functional Programming in Scheme

3.3 Tail-Recursive Programs

Earlier in this book, a tail-recursive program was described as a program that does not do anything after calling itself. However, this is only half of the truth, as even this trivial example illustrates:

(define (down n)
  (cond ((zero? n) 0)
        (else (down (- n 1)))))

When stating that the application of down is the last thing that down does in the general case, cond was silently excluded. Of course, cond is required to distinguish between the general case and the trivial case, so it has to be an exception, or tail-recursion could not work at all. It does work because a cond with a true predicate can be re-written this way:

(cond (#t expr)) = expr

As soon as a true predicate is reached, there is nothing left to do for cond. This is why bodies of cond are safe positions for tail calls. A tail call is a call to a procedure that occurs right before the calling procedure returns. When a procedure uses tail calls exclusively to implement recursion, it is tail-recursive.

The predicates of cond are not valid positions for tail calls, because cond has to examine the normal form of each predicate in order to decide whether the corresponding body is to be evaluated. In the following example, the first recursive call to equ? is not a tail call (because it is a predicate of cond), but the second one is (because it is a body of cond):

(define (equ? a b)
  (cond ((or (not (pair? a))
             (not (pair? b)))
          (eq? a b))
        ((equ? (car a) (car b))
          (equ? (cdr a) (cdr b)))
        (else #f)))

There are other positions that are safe for tail calls. The last argument of and is one such position. The last two clauses of the above equ? procedure can be replaced by an application of and:

(define (equ? a b)
  (cond ((or (not (pair? a))
             (not (pair? b)))
          (eq? a b))
        (else (and (equ? (car a) (car b))
                   (equ? (cdr a) (cdr b))))))

The second application of equ? is still a tail call. The other arguments of and are not safe, because and may have to examine all of its arguments before it can return.

The same is true with or and begin: their last arguments are safe, their others are not. In general, all expressions that can be re-written in such a way that only the tail call remains are safe:

        (begin (f))  =  (f)
           (or (f))  =  (f)
          (and (f))  =  (f)
    (cond (#t (f)))  =  (f)
(case 'x ((x) (f)))  =  (f)
    (if #t (f) (g))  =  (f)
    (if #f (f) (g))  =  (g)

In if, both the consequent and the alternative are safe, because if the predicate is true, the consequent is safe and the alternative is ignored, and if the predicate is false, the consequent is ignored and the alternative is safe.

The fact that the last positions of and, or, and begin are safe can easily be derived from their definitions, as will be demonstrated here using the example of and.

And evaluates its arguments a1...an in sequence. When one of the arguments a1...an-1 evaluates to #f, the last argument an is never evaluated, so it does not matter whether it is a tail call or not. What remains is the case that all arguments up to the last one evaluate to #t. In this case,

  (and a1 a2 ... aN)
= (and a2 ... aN)
= (and aN)
= aN

Showing that the last position of or is safe works in an analogous way. The last position of begin is trivially safe because

  (begin a1 ... aN)
= (begin aN)
= aN                ; modulo side effects

Note that the fact that only the last expression of begin is a tail call position has consequences for all constructs that use an implied begin. Only the last expression in the bodies of lambda, cond, case, etc is in a tail call position. In the expression

(cond (else (f) (g) (h)))

the calls to f and g are no tail calls, only the call to h is one.

Although this might not be obvious, the bodies of the binding constructs let, let*, and letrec are also safe. The following version of down is tail-recursive:

(define (down n)
  (let ((m (- n 1)))
    (cond ((zero? n) 0)
          (else (down m)))))

That this procedure is tail-recursive can easily be shown by re-writing let to the application of a lambda function:

(define (down n)
  ((lambda (m)
     (cond ((zero? n) 0)
           (else (down m))))
   (- n 1)))

The anonymous procedure application is in a tail call position, because it is the last thing to be evaluated in down. The recursive call to down is also in a tail call position, because it is the last expression in a body of cond. Ergo this version of down is tail-recursive.

A tail-recursive program may very well consist of multiple procedures which call each other. Recursion involving multiple procedures is called mutual recursion or indirect recursion. The following example implements the even-p procedure, which checks whether a number is even, in terms of the odd-p procedure, which checks whether a number is odd, and vice versa:

(define (even-p x)
  (or (zero? x) (odd-p (- x 1))))

(define (odd-p x)
  (if (zero? x) #f (even-p (- x 1))))

Of course this definition works only for positive numbers, but this is a circumstantial detail. What is more interesting is the fact that both even-p and odd-p are tail-recursive. Tail call optimization, which is responsible for turning tail calls into jumps, is not limited to procedures that call themselves directly. Scheme optimizes all tail calls no matter whether they are directly recursive, indirectly recursive, or not recursive at all.