t3x.org / sketchy / vol1 / sl17.html

Sketchy LISP

  Copyright (C) 2006,2007 Nils M Holm
Buy a copy at Lulu.com

An Introduction to Functional Programming in Scheme

3 Some Missing Pieces

3.1 Syntax Transformation

Throughout this book pseudo functions were described as something that not really is a function but whose application looks like a procedure application. This is only half of the truth, though. Keywords like define, lambda, and quote form the syntax of Scheme. The syntax is the very core of the language. It describes the lexical form of programs. For example, define is not really a procedure, but part of the language itself. This is why just typing define at the Scheme prompt yields a syntax error in all standard-compliant implementations:

define => bottom

This happens because the syntax of define has only two valid forms which are

(define symbol value)

and

(define (symbol variable ...) body)

and just define without any parentheses and arguments is neither of them.

What is particularly interesting about Scheme syntax is that you can extend it just like you can extend the Scheme procedure library with own procedures. The process that maps user-defined syntax to existing syntax is called syntax transformation.

Here is a simple example:

(define-syntax when
  (syntax-rules ()
    ((_ predicate . commands)
       (if predicate (begin . commands)))))

In this example define-syntax is used to create a new pseudo function named when. Whenever Scheme finds an application of when, it re-writes it according to the syntax-rules associated with it. Although other variants exist, the body of define-syntax always consists of an instance of syntax-rules in standard Scheme.

Syntax-rules provides a set of rules that is used to transform the application of a pseudo function. Each rule consists of two parts, a pattern and a template:

(syntax-rules ()
  (pattern1 template1)
  ...
  (patternn templaten))

In the above example, the only pattern is

(_ predicate . commands)

and the associated template is

(if predicate (begin . commands))

During the transformation, the application of a pseudo function is matched against each supplied pattern. The template of the first matching pattern is used to re-write the application.

3.1.1 Pattern Matching

A pattern P is matched against a form F as follows:

If P is a symbol, it matches any form F. For example,

x matches 123
x matches "hello world"
x matches (foo bar)

If P is an atom other than a symbol, F must be exactly the same atom. For example,

          123 matches 123
"hello world" matches "hello world"
          #\q matches #\q

If P is a keyword, F must be the same keyword (keywords will be explained below).

If P is a pair, F must also be a pair and the car and cdr parts of P must match the car and cdr parts of F. For example,

  (1 . 2) matches (1 . 2)
  (x . 1) matches (foo . 1)        ; x matches anything
  (1 . x) matches (1 . (bar baz))  ; x matches anything
  (1 . x) matches (1 bar baz)      ; same as above!
(1 (2) 3) matches (1 (2) 3)

If an ellipsis (the symbol ...) occurs in a list, it matches any number of forms. For example,

(x ...) matches (1)
(x ...) matches (1 #\x)
(x ...) matches (1 #\x (foo))
(x ...) matches (1 #\x (foo) "hi")

Whenever a symbol of a pattern (including ...) matches a form, the symbol is bound to that form. For example:

Matching x       against (foo bar) binds x to (foo bar).
Matching (x ...) against (1 2 3)   binds x to 1 and ... to (2 3).
Matching (x . y) against (a b c)   binds x to a and y to (b c).

The difference between the patterns (x . y) and (x ...) will become clear soon.

3.1.2 Substitution

The application of a pseudo function is re-written by replacing it with the template that is associated with the matching pattern. At the same time, each variable of the template is substituted by the value assigned to it while matching the pattern.

Given the user-defined when syntax

(define-syntax when
  (syntax-rules ()
    ((_ predicate . commands)
       (if predicate (begin . commands)))))

and the application

(when (= 1 1) (display "yes!") (newline))

the pattern of the first (and only) syntax rule

(_ predicate . commands)

matches the application. While matching the application, it binds

_         to when;
predicate to (= 1 1);
commands  to ((display "yes!") (newline)).

The underscore _ is an ordinary symbol that is by convention used to bind the name of the pseudo function itself.

Substituting the symbols predicate and commands by their values in the template

(if predicate (begin . commands))

finally yields

(if (= 1 1) (begin (display "yes!") (newline)))

Note how the dotted pair in the template is used to cons the list of commands to begin:

  (begin . commands)
= (begin . ((display "yes!") (newline)))
= (begin (display "yes!") (newline))

The when pseudo function is similar to if, but instead of an alternative expression it accepts any number of consequent expressions. Because of its implied begin it allows you to write

(when p x1 x2 x3 ...)

instead of

(if p (begin x1 x2 x3 ...))

Another way to implement the when pseudo function is the use of an ellipsis:

(define-syntax when
  (syntax-rules ()
    ((_ predicate command ...)
       (if predicate (begin command ...)))))

This version works in exactly the same way as the version using an improper list. However, the ellipsis has another interesting property that distinguishes it from the dotted list. The fresh pseudo function makes use of this property:

(define-syntax fresh
  (syntax-rules ()
    ((_ (sym ...) expr . exprs)
      (let ((sym (if #f #f)) ...) expr . exprs))))

Fresh creates some fresh local variables without assigning any specific values to them. Such a construct could be useful in the implementation of letrec, for example. The syntax of fresh is similar to the syntax of let, and in fact the above syntax transformer re-writes fresh using let:

(fresh (a b c) (list a b c))

becomes

(let ((a (if #f #f))
      (b (if #f #f))
      (c (if #f #f)))
  (list a b c))

Note that a single symbol precedes the ellipsis in the pattern, but the form

(sym (if #f #f))

precedes the ellipsis in the template. What the syntax transformer does in this case is to replace each form matched by the ellipsis with the above template. In

(fresh (a b c) (list a b c))

the ellipsis matches the symbols a, b, and c. In the template, the ellipsis is replaced with three instances of (sym (if #f #f)). In the first instance, sym is replaced by a, in the second one it is replaced by b, and in the last one by c.

Here is another example:

(define-syntax twins
  (syntax-rules ()
    ((_ x ...)
      (list (quote (x x)) ...))))

The effect of the twins pseudo functions is as follows:

    (twins 1) => ((1 1))
  (twins 1 2) => ((1 1) (2 2))
(twins 1 2 3) => ((1 1) (2 2) (3 3))

3.1.3 Recursive Syntax

Recursion works even in syntax definitions. A syntax transformation terminates only if no more opportunities for a syntax transformation can be found in a resulting form. The following pseudo function reverses a list:

(define-syntax reverse-syntax
  (syntax-rules ()
    ((_ lst) (reverse-syntax lst ()))
    ((_ () r) r)
    ((_ (a . d) r) (reverse-syntax d (a . r)))))

When reverse-syntax is applied to a list, the application is re-written to an application of reverse-syntax to that list and () by rule #1: (the --> arrow means is re-written to)

(reverse-syntax (1 2 3)) --> (reverse-syntax (1 2 3) ())

Because the result contains another application of reverse-syntax, the transformation continues. Rule #3 is applied three times:

(reverse-syntax (1 2 3) ()) --> (reverse-syntax (2 3) (1))
 (reverse-syntax (2 3) (1)) --> (reverse-syntax (3) (2 1))
 (reverse-syntax (3) (2 1)) --> (reverse-syntax () (3 2 1))

At this point, rule #2 is invoked which extracts the second argument:

(reverse-syntax () (3 2 1)) --> (3 2 1)

The resulting list does not contain any applications of syntax transformers, so it is the final result of the transformation. Using reverse-syntax, you can reverse a list before it is evaluated:

(reverse-syntax (7 5 cons)) => (5 . 7)

The following recursive syntax transformer implements the case pseudo function, which is part of the Scheme standard:

(define-syntax case
  (syntax-rules (else)
    ((_ key (else expr ...))
       (begin expr ...))
    ((_ key (data expr ...))
       (if (memv key 'data) 
           (begin expr ...)
           (if #f #f)))
    ((_ key (data expr . exprs)
            more-cases ...)
       (if (memv key 'data)
           (begin expr . exprs)
           (case key more-cases ...)))))

Case is used to select cases based on the value of an expression (which is called the key of case). Like cond, it has one or multiple clauses as arguments:

(case key
      ((d1,1 ...) expr1,1 ...)
      ...
      (else       exprn,1 ...))

When the key is found in a list of data (di,1 ...), the associated expressions are reduced. Like the bodies of cond, the bodies of case imply begin. The else clause, which catches any remaining cases, is optional.

The first thing that stands out in the definition of case is the keyword else in syntax-rules. Any number of keywords can be defined in the list that forms the first argument of syntax-rules. The keywords are local to the syntax transformer being defined.

Rule #1 matches the else keyword and re-writes it as follows:

(case key (else x1 ...)) --> (begin x1 ...)

The second rule makes sure that case reduces to an unspecific value when no matching clause is found.

Rule #3 is a recursive rule that re-writes a case with multiple clauses to an if expression that handles the first clause and whose alternative branch is another application of case that handles the remaining clauses.