Sketchy LISP |
Copyright (C) 2006,2007
Nils M Holm Buy a copy at Lulu.com |
An Introduction to Functional Programming in Scheme
| Previous Section | - Contents - Index - | Next Section |
Earlier in this book, the self-application combinator S was introduced as a device for the construction of anonymous recursive functions:
(lambda (f x) (f f x))
A combinator is a function that
does not make use of any free variables, which is why it cannot even use
basic operations like + or car. So it is basically
limited to binding constructs, function applications, and its (bound)
variables.
While the self-application combinator works fine, it has one limitation. It requires two arguments: the function to be applied and the value the function is to be applied to. It would be much more elegant, if it could just transform a function into a recursive function. Such an operator would turn the following factorial function (which does not work because (lambda (x) ...) closes over the free variable f) into a working recursive function R:
(lambda (f)
(lambda (x) ----------------> R
(cond ((zero? x) 1) mystery
(else (* f (- x 1)))))) transformation
Indeed a function performing this mystery transformation
exists. It is called the
Y combinator (Y),
and it turns any function f of the above form into a
recursive function f':
(Y f) => f'
F' is also called the
fixed point of f, which
is why Y is also known as the fixed point operator
.
The fixed point of a higher order function f is a function
f' which is mapped to itself by f:
(f f') => f'
Because the Y combinator computes the fixed point of a function f,
(Y f) = (f (Y f)) = (f (f (Y f))) ...
In the remainder of this section, the definition of a recursive
function using letrec will be transformed into an
equivalent definition that employs the Y combinator instead
of letrec. A definition of the Y combinator
itself will be a by-product of this transformation.
Note that most of the steps (exceptions: 11,12,13,14) during the transformation still implement the factorial function. You can try them in a Scheme environment, if you want to.
Here is the expression to be transformed. It contains a definition
of a recursive factorial function using letrec:
; The original factorial function f
(letrec
((f (lambda (a)
(cond ((zero? a) 1)
(else (* a (f (- a 1))))))))
(f 5))
=> 120
The use of letrec can easily be avoided by adding an
argument to f that is used to carry a copy of the function
itself. It is no coincidence that this argument is also called
f; it names the function locally. Applications
of f have to pass along the additional argument, so
(f x) becomes (f f x):
; Step 1: replace LETREC with self-application
(let
((f (lambda (f a)
(cond ((zero? a) 1)
(else (* a (f f (- a 1))))))))
(f f 5))
Let can be transformed into an application of a
lambda function, which is done in the following step:
; Step 2: transform LET to LAMBDA
((lambda (f) (f f 5))
(lambda (f a)
(cond ((zero? a) 1)
(else (* a (f f (- a 1)))))))
The above version applies the constant 5 to the factorial function. Because a real factorial function would be applicable to any positive value, though, an argument is added to receive that value:
; Step 3: make argument a variable
((lambda (f x) (f f x))
(lambda (f a)
(cond ((zero? a) 1)
(else (* a (f f (- a 1))))))
5)
The above step results in a two-argument factorial function that expects a copy of itself as well as the value whose factorial is to be computed. To convert the two-argument function into a single-argument function, a method known as currying is applied.
By currying a two-argument function, that function is turned into a higher-order single-argument function. Applying a curried two-argument function to the first argument and then applying the resulting function to the second argument is equivalent to applying the original function to two arguments. Here is an example. Currying the function
(define (add a b) (+ a b))
gives
(define (add-n a) (lambda (b) (+ a b)))
Applying add-n to a value evaluates to a single-argument function that adds the given value to its argument:
(add-n 5) ; This procedure will add 5 to its argument: => #<procedure (b)>
Applying this function to a second argument yields the same result as the application of the original function to the same values at once:
(add 5 7) => 12
((add-n 5) 7) => 12
Currying the above version of the factorial function separates the function f from its argument a, just like the above example separated the first argument (5) from the second argument (7):
; Step 4: curry the lambda functions
(((lambda (f) (lambda (x) ((f f) x)))
(lambda (f)
(lambda (a)
(cond ((zero? a) 1)
(else (* a ((f f) (- a 1))))))))
5)
The factorial function (lambda (f a) ...) turned into (lambda (f) (lambda (a) ...)), self application turned into (lambda (f) (lambda (x) ...)), and applications of the form (f f x) turned into ((f f) x). When (lambda (f) (lambda (a) ...)) is applied to itself, it closes over its own copy, giving a single-argument function that is capable of applying itself.
The function resulting from this transformation can now be separated from its argument:
; Step 5: extract the function
((lambda (f) (lambda (x) ((f f) x)))
(lambda (f)
(lambda (a)
(cond ((zero? a) 1)
(else (* a ((f f) (- a 1))))))))
This expression looks fine, but it has one minor flaw: f has to apply itself each time it wants to recurse. The remaining transformations will draw this self-application into the Y combinator itself, so that recursive applications of the factorial function will be ordinary function applications.
The first step to achieve this goal is to give self-application a name (s):
; Step 6: give self-application a name
((lambda (f) (lambda (x) ((f f) x)))
(lambda (f)
(let ((s (lambda (x) ((f f) x))))
(lambda (a)
(cond ((zero? a) 1)
(else (* a (s (- a 1)))))))))
After naming self-application s, the recursive function
invocation looks like an ordinary function application. The let
that is used to name self-application is now transformed into the application
of a lambda function:
; Step 7: transform LET to LAMBDA
((lambda (f) (lambda (x) ((f f) x)))
(lambda (f)
((lambda (s)
(lambda (a)
(cond ((zero? a) 1)
(else (* a (s (- a 1)))))))
(lambda (x) ((f f) x)))))
The next step is to give the factorial function a name. This might look a bit counter-intuitive at a first glance, because this whole exercise is about the creation of anonymous recursive functions, but it is a necessary intermediate step. The factorial function is named g:
; Step 8: give the factorial function a name
((lambda (f) (lambda (x) ((f f) x)))
(lambda (f)
(let ((g (lambda (s)
(lambda (a)
(cond ((zero? a) 1)
(else (* a (s (- a 1)))))))))
(g (lambda (x) ((f f) x))))))
In the next step, let is once more transformed into
lambda, giving an expression that may cause serious
headache when examining it. Here are some hints: The
(lambda (s) ...) part still introduces the
factorial function, and the (lambda (g) ...)
part binds that function to g and applies it to the
function implementing self-application. Self-application is
thereby bound to s.
; Step 9: transform LET to LAMBDA
((lambda (f) (lambda (x) ((f f) x)))
(lambda (f)
((lambda (g)
(g (lambda (x) ((f f) x))))
(lambda (s)
(lambda (a)
(cond ((zero? a) 1)
(else (* a (s (- a 1))))))))))
In fact, the above expression comes pretty close to a definition of
the Y combinator. It only has to be isolated. This is done by
eta conversion
. The parts of the expression that are added by this
conversion are emphasized using boldface characters in the following
step.
; Step 10: eta-expand Y
((lambda (r)
((lambda (f) (lambda (x) ((f f) x)))
(lambda (f) ((lambda (g)
(g (lambda (x) ((f f) x))))
r))))
(lambda (s)
(lambda (a)
(cond ((zero? a) 1)
(else (* a (s (- a 1))))))))
Eta conversion works in two directions. It converts a function into an equivalent lambda function and vice versa:
(lambda (x) (f x)) <-------> f
eta
The transformation of a lambda function to an equivalent function is called eta reduction, the reverse operation is called eta expansion. The conversion performed in step 10 is an eta expansion.
After performing this conversion, the expression can be decomposed into two parts: the factorial function and the Y combinator. Removing the outermost application and the factorial function yields the Y combinator:
; Step 11: extract Y
(lambda (r)
((lambda (f) (lambda (x) ((f f) x)))
(lambda (f) ((lambda (g)
(g (lambda (x) ((f f) x))))
r))))
Each expression of the form ((lambda (x) t) v) can be reduced to t', where t' equals t with each free occurrence of x replaced by v:
((lambda (x) (x x)) v) --------> (v v)
beta
This conversion is called beta reduction. It is exactly this conversion which substitutes arguments for variables when applying a Scheme function. In the next step, the application of (lambda (g) ...) is beta-reduced by substituting r for g in (g (lambda (x) ((ff) x))):
; Step 12: remove lambda(g) by beta reduction (lambda (r) ((lambda (f) (lambda (x) ((f f) x))) (lambda (f) (r (lambda (x) ((f f) x))))))
Finally, the first self-application in the expression is simplified by eta reduction. (In case you wonder why the second self-application is not reduced: this will be explained in the following sub-section.) The result of this step is a definition of the Y combinator:
; Step 13: remove first lambda(x) by eta-reduction (lambda (r) ((lambda (f) (f f)) (lambda (f) (r (lambda (x) ((f f) x))))))
To facilitate further experiments with this combinator, it will be named Y:
; Step 14: name the combinator (define (Y r) ((lambda (f) (f f)) (lambda (f) (r (lambda (x) ((f f) x))))))
The following definition binds f to the factorial function
of step 8. As you can see, it employs neither letrec nor
self-application:
(define (f s)
(lambda (a)
(cond ((zero? a) 1)
(else (* a (s (- a 1)))))))
Given these definitions of Y and f:
(Y f) => #<procedure (a)> ((Y f) 5) => 120 ((f (Y f)) 5) => 120 ((f (f (Y f))) 5) => 120 ...
So
Lambda calculus (LC) is
the basis of all programming languages of the LISP family. Above
conversion rules, like eta conversion or beta reduction, actually
belong to lambda calculus, but many of these rules apply to Scheme
as well. In a way, Scheme may be considered a kind of applied
lambda calculus
. This is why the Y combinator can be
implemented in both LC and Scheme.
This sub-section will explain how the Y combinator works and why the second self-application of the combinator may not be eta-reduced in Scheme.
In fact, the operation performed by Y is rather simple:
Y is a function of a function f that applies
self-application to the application of f to self-application.
Although this description is brief and correct, an example may be more helpful. The following reduction shows how the anonymous factorial function (which already has been stretched a bit far at this point) is converted into a recursive function:
((lambda (r)
((lambda (f) (f f))
(lambda (f) (r (lambda (x) ((f f) x))))))
(lambda (s)
(lambda (a)
(cond ((zero? a) 1)
(else (* a (s (- a 1))))))))
In the first step, the factorial function [(lambda (s) ...)] replaces the variable r in the body of the combinator by beta reduction:
; beta-reduce ((lambda (r) ...) ...)
-> ((lambda (f) (f f))
(lambda (f) ((lambda (s)
(lambda (a)
(cond ((zero? a) 1)
(else (* a (s (- a 1)))))))
(lambda (x) ((f f) x)))))
In the following step, the factorial function is applied to self-application. This step makes (lambda (a) ...) close over s, giving a function that is capable of performing self-application. For improved readability, (lambda (x) ((f f) x)) is eta-reduced to (f f) in the same step:
; beta-reduce ((lambda (s) ...) ...)
; eta-reduce (lambda (x) ((f f) x))
-> ((lambda (f) (f f))
(lambda (f)
(lambda (a)
(cond ((zero? a) 1)
(else (* a ((f f) (- a 1))))))))
Finally, the resulting function is applied to itself. It thereby closes over its own copy, leaving a function that applies itself to itself. This transformation is a bit tricky, because f is bound in different contexts. To avoid this ambiguity when performing beta reduction manually, lambda calculus provides a rule known as alpha conversion.
Alpha conversion replaces the names of variables that are bound in a function with names that are not bound in the argument of that function. More formally: given the expression
((lambda (x) t) v)
it replaces all occurences of x in (lambda (x) t) with a name that does not occur in v. This transformation is safe because changing the name of a bound variable does not change the meaning of an expression:
(lambda (f) (f f)) = (lambda (x) (x x))
Scheme environments do not have to do alpha conversion because computers are good at keeping track of different instances of the same symbol, but to humans dealing with expressions like the above, it is a valuable aid:
; alpha-convert (lambda (f) (f f))
-> ((lambda (x) (x x))
(lambda (f)
(lambda (a)
(cond ((zero? a) 1)
(else (* a ((f f) (- a 1))))))))
Above self-application makes the function (lambda (a) ...) close over f:
; beta-reduce ((lambda (x) (x x)) ...)
=> (lambda (a)
(cond ((zero? a) 1)
(else (* a ((f f) (- a 1))))))
where f = (lambda (f)
(lambda (a)
(cond ((zero? a) 1)
(else (* a ((f f) (- a 1)))))))
Each time the function chooses to recurse, (lambda (f) ...) is applied to itself one more time, thereby creating another instance of the above expression. So the reduction of this expression may cycle, effectively implementing recursion.
Although the Y combinator only has been used to create the factorial function in this section, it is of course not limited to this function. Any function of the form
(lambda (f) (lambda (x) body))
is mapped to its fixed point by Y:
((Y (lambda (f) (lambda (x) x))) 'foo) => foo
The function transformed by Y may use f to recurse (although f is named length in this example):
((Y (lambda (length)
(lambda (a)
(cond ((null? a) 0)
(else (+ 1 (length (cdr a))))))))
'(a b c d e))
=> 5
A final question remains. Why was the second self-application of Y not simplified using eta reduction? This would have lead to the following Y combinator:
(define (Ylc r) ((lambda (f) (f f)) (lambda (f) (r (f f)))))
Indeed, this is exactly the way the Y combinator is defined in lambda calculus, but it does not work in Scheme, so there has to be a subtle difference between LC and Scheme. The point where the systems differ is their type of evaluation.
While Scheme uses call by value, lambda calculus employs call by name, except it is called normal order evaluation there. (And Scheme's evaluation would be called applicative order evaluation in LC).
When using the Y combinator of LC in Scheme, indefinite recursion occurs:
(Ylc (lambda (f) (lambda (x) x))) => bottom
This happens because self-application [(f f)] is applied on the spot in the combinator instead of passing it to r.
(Ylc (lambda (f) (lambda (x) x)))
; do not care about R; it is not reduced anyway
-> ((lambda (f) (f f))
(lambda (f) (r (f f)))))
-> ((lambda (f) (r (f f)))
(lambda (f) (r (f f))))
-> (r ((lambda (f) (r (f f)))
(lambda (f) (r (f f)))))
-> (r (r ((lambda (f) (r (f f)))
(lambda (f) (r (f f)))))
...
When you try Ylc in Scheme, the program will finally run out of memory. Eta-expanding the second self-application simulates call-by-name semantics in Scheme and thereby avoids the above indefinite recursion.
The Y combinator demonstrates the full power of higher
order functions by implementing generic recursion using
lambda and function application exclusively. While lambda
calculus is a rather theoretical device, Scheme bridges the
gap between mathematics and programming. It combines mathematical
rigor with the tools that a programmer needs for exploring problems
and inventing new algorithms.
| Previous Section | - Contents - Index - | Next Section |