Sketchy LISP |
By Nils M Holm,
2006,2007,2008 Buy a copy at Lulu.com |
An Introduction to Functional Programming in Scheme
| Previous Section | - Contents - Index - | Next Section |
Typical Scheme procedures have much in common with math functions. Their sole purpose is to map a number of values (its arguments) to another value, the function result. For example the math function
f(x) := x2
would look like this in Scheme:
(define (f x) (* x x))
But of course you would give it a more descriptive name, like square. Even the fact procedure given in the previous section is modelled after a math function:
0! := 1
n! := n × (n-1)!
But instead of declaring multiple instances of the function that
handle different argument values, like in the math definition above,
cond decides which path to take:
(define (fact n)
(cond
; (fact 0) := 1
((zero? n) 1)
; (fact n) := (* n (fact (- n 1)))
(#t (* n (fact (- n 1))))))
You probably have noticed that the definition of fact is self-referential. It applies itself in order to compute its own result. This is by no means contradictory. It is a well-known principle known as recursion. A recursive procedure has (at least) two cases: the trivial case and the general case (also called the recursive case). Above definition could be pronounced as
The factorial of N is
-- 1, if N equals 0
-- N times the factorial of N-1, otherwise.
The condition if N equals 0
is very specific, while
the condition otherwise
is rather general. Also, the
first case returns the value 1 which is rather trivial, while the
value of the second case is computed recursively and hence not
that trivial.
As it happens, the trivial case of a recursive procedure is exactly what you would specify as an exit condition in a loop of an imperative program, as the following C program illustrates:
/* Imperative fact() in C */
int ifact(int n) {
int r = 1;
while (!(n == 0)) {
r = r * n;
n = n-1;
}
return r;
}
The loop in this program reads: while the exit condition (n==0)
is not true, repeat the general case.
However,
this program does not tell you much about the nature of the
fact procedure. All it does is to shuffle some state until
an exit condition is met and then return some of that state. Reducing
ifact(3) on a sheet of paper would be a tedious exercise.
What is more interesting, though, is that iteration and recursion seem to be equivalent to some degree. In fact, this equivalence is strong enough that Scheme expresses any repetition using recursion. This approach has some advantages:
An issue seems to be, though, that recursion looks inefficient in some situations. If you are used to imperative languages, you may have heard that recursion is bad because it slows down programs and may even break things (for example by overflowing the runtime stack). Relax. In Scheme this is not the case.
Both the recursive C program and the recursive Scheme program
call fact for the same number of times. (To call a procedure
is a more technical term for applying a function.) Each call causes
the space consumed by the program to grow by some constant amount.
The total amount is this constant times the size of the argument:
(fact 5) adds five levels of recursion and
(fact 100) adds one hundred levels.
The ifact function, on the other hand, does not add
anything. It does so by cheating. It adds some state, namely the
variable r, to hold the intermediate result of the
computation. Scheme would be a boring language, if it would not
support the same sort of cheat:
(define (fact2 n r)
(cond ((zero? n) r)
(#t (fact2 (- n 1) (* n r)))))
The fact2 procedure still recurses, but its space remains constant while it evaluates:
(fact2 5 1) -> (fact2 4 5) -> (fact2 3 20) -> (fact2 2 60) -> (fact2 1 120) -> (fact2 0 120) => 120
What looks fine on a sheet of paper also works on a real CPU. The principle that makes this possible is called tail call optimization. In fact2 the body of the general case is
(fact2 (- n 1) (* n r))
When fact2 returns from the recursive call, the only thing that is left to do for the calling procedure is to pass control back to its own caller. Scheme recognizes this and transforms the call to fact2 into a jump. The flow of control never returns to the caller (which would be fact2 itself), but directly to the caller of the caller.
A program that does not do anything
after calling itself
is called a tail recursive
program. Fact2 is a tail-recursive program, while the
original fact program is not, because it applies *
to the value returned by the recursive call:
(* n (fact (- n 1))) ; ^^^--- this is done after the return of fact
Tail call optimization transforms tail-recursive programs into programs doing ordinary loops. It combines the expressiveness of recursion with the advantages of iteration. If you do not believe this, try the following small program which implements an indefinite loop by omitting the trivial case:
(define (o) (o))
Once started (o) will recurse forever, but its
space will remain constant. This is why tail-recursive Scheme
programs are said to evaluate in constant space
.
All linear-recursive programs (like fact) can be transformed into tail-recursive programs using an additional argument. Note that this transformation does not turn the program into a total mess like the transformation of the C program. The Scheme program still is declarative:
(fact2 0 r) := r (fact2 n r) := (fact2 (- n 1) (* r n))
The only thing that is a bit ugly is that fact2 takes two arguments, where the second one always must be 1. This fact can be easily covered, though, by wrapping a new definition of fact around fact2:
(define (fact n) (fact2 n 1))
If you do not like to write two procedures where only one is
required, letrec
is your friend:
(define (fact n)
(letrec
((fact2
(lambda (n r)
(cond ((zero? n) r)
(#t (fact2 (- n 1) (* n r)))))))
(fact2 n 1)))
Letrec embeds local definitions into a procedure. In
the above example, it defines fact2 locally inside of
fact. Do not worry too much about letrec and
lambda for now. They will be explained soon.
By the way: while ifact(100) is a rather hypothetical construct in C and will deliver anything but the result of 100!, this works fine in Scheme:
(fact 100) => 93326215443944152681699238856266700490715968264381621468 59296389521759999322991560894146397615651828625369792082722 3758251185210916864000000000000000000000000
While cond may give the impression that it is
used to write programs in a declarative way, there is one important
difference: the order of clauses does matter, while the
order of declarations does not. For example, the declarations
0! := 1
n! := n × (n-1)!
are equivalent to
n! := n × (n-1)!
0! := 1
because you would pick the best match before re-writing an
application of the factorial function. Even if the case n!
also covers 0!, you would probably prefer the more special
case, and this would be the right thing to do. Cond is
not that smart. It simply tests its clauses in sequence and
picks the first one with a true predicate. Therefore the following
program will not compute n!:
; A broken version of fact
(define (broken-fact n)
(cond (#t (* n (broken-fact (- n 1))))
; The following part is unreachable!
((zero? n) 1)))
A cure for this problem is to always specify clauses of
cond in order of ascending generality. The most
general clause, usually containing a #t predicate,
must always be the last clause. Remember: clauses after
a true predicate are unreachable. Ordering the clauses of more
complex applications of cond requires some practice,
but not more than formulating complex exit conditions in
imperative languages.
The keyword else,
which is part of the cond syntax, helps to remember
that the catch-all clause should be last:
(define (fact n)
(cond ((zero? n) 1)
(else (* n (fact (- n 1))))))
Whenever else appears in the last clause of
cond, it is equivalent to #t. Its
value is undefined in other clauses:
(cond (else 1)
(else 2)) => bottom
Another interesting fact about cond is that it
is non-strict. If it was strict, it could not work. Remember the
simple procedure o which never terminates:
(define (o) (o))
Once applied, o keeps evaluating for ever
(or until your computer breaks or until you get bored and abort
program execution, whatever comes first). Because there is no way
to obtain the value of (o), it is undefined,
which is the same as bottom:
(o) => bottom
The following expression evaluates to a value, though:
(cond (#f (o)) (#t 1)) => 1
Earlier in this book, the equivalence between cond
and if was illustrated. Given this equivalence, the
above expression could be re-written as
if #f then evaluate to (o) else evaluate to 1
Scheme actually does have an
if
function, which looks like this:
(if predicate consequent alternative)
It is just another way of saying:
(cond (predicate consequent) (#t alternative))
If if was a strict function,
(if #f (o) 1) => bottom
because
(o) => bottom
So if cannot be strict. The syntax of cond
hides the fact that (o) is an argument of cond,
but the equivalence
(cond (p e1) (#t e2)) = (if p e1 e2)
demonstrates that cond cannot be strict, either.
The practical meaning of this property of cond (and
if) is that it is a prerequisite for recursive functions.
The procedure down, which counts down to 0, could not be
implemented using a hypothetical strict if function:
(define (down x)
(if (zero? x)
0
(down (- x 1))))
Of course, if does not really evaluate the general case
to bottom and then ignores it, but it evaluates the general
case if, and only if, the predicate did not hold.
| Previous Section | - Contents - Index - | Next Section |