Sketchy LISP |
By Nils M Holm,
2006,2007,2008 Buy a copy at Lulu.com |
An Introduction to Functional Programming in Scheme
| Previous Section | - Contents - Index - | Next Section |
Like most popular languages, Scheme is a typed language which means that there are different types of objects, like numbers, booleans, chars, strings, etc. Most contemporary languages like C or Java are statically typed languages, though, while Scheme uses a system called dynamic typing. In statically typed languages, types are associated with variables. A variable of a given type can be used to store objects of that type. The dynamic typing approach is much more flexible. It is made possible by the fact that Scheme does not store data in variables but only stores references to data.
Scheme variables do not have types. They are just symbolic names and
you can bind any type of object to them. The type information is
carried in the objects themselves. For example, the object
127 is of the type integer, the object
"Hello, World!" is of the type string,
and the object (a b c) is of the type
pair. A value that carries its type information with it is
also called a boxed value.
Many procedures accept only arguments of one specific
type. For instance, the reverse procedure expects a
list:
(reverse '(a b c)) => (c b a)
Passing any other type to the procedure will make it fail:
(reverse 'non-list) => bottom
The actual behavior of the reverse procedure
depends on the Scheme implementation you are using. Most versions
will tell you that you applied a procedure expecting a list to
a symbol or print a similarly informative message. In an environment
where reverse is a user-level procedure, though, the system
may not perform any type checking at all when reverse is
applied to an argument. When the procedure is applied to a non-list,
it simply fails as soon as the first primitive is hit. In such an
environment the reverse procedure may look like this:
(define (reverse a)
(letrec
((reverse2
(lambda (a b)
(cond ((null? a) b)
(else (reverse2 (cdr a)
(cons (car a) b)))))))
(reverse2 a '())))
When this procedure is applied to the symbol non-list, the following happens:
(reverse 'non-list)
-> (reverse2 non-list ())
-> (cond ((null? non-list) b)
(else (reverse2 (cdr non-list)
(cons (car non-list) b))))
-> (reverse2 (cdr non-list)
(cons (car non-list) b))
At this point, Scheme attempts to take the cdr of a symbol:
(cdr non-list) => bottom
and because Scheme procedures are strict, reverse also
reduces to bottom. The system will then print a message like this:
reverse2: cdr: expected a pair, but got: non-list
It tells you that the application of cdr to an object
that is not a pair was attempted. The system also tells you that this
happened inside of the reverse2 procedure and that the
offending object was the symbol non-list.
It is a common practice in Scheme to let procedures fail like this.
Of course, you could intercept such type errors by checking
the argument using a predicate like list?, but all this
would buy you was a slightly improved error message. This is why
Scheme programmers normally do not bother with adding such
redundant checks.
Note that adding type checks does not improve safety, because
all primitive procedures of Scheme perform type checking internally. This
is why the cdr procedure finally caught the bad type of
the argument of reverse. All primitives abort evaluation
immediately when an object of the wrong type is passed to them, and
because all procedures are ultimately composed of primitives,
a wrong type sooner or later will hit such a primitive. The following
example will illustrate this principle a bit more. Let us assume that the
char<=?
predicate was a user-level procedure. It would probably be defined in
such a way:
(define (char<=? a b) (not (char>? a b)))
It is based on the
char>?
predicate, which could be a user-level procedure, too:
(define (char>? a b) (char<? b a))
Char<?, finally, is a primitive procedure which expects
two arguments of the type char. Now imagine a procedure called
hex-digit? which is based upon char<=?:
(define (hex-digit? x)
(or (char<=? #\0 x #\9)
(char<=? #\a x #\f)))
When a non-char is passed to hex-digit?, evaluation
continues until the primititive procedure char<? is
applied to it. First Hex-digit? calls char<=?,
which calls char>?, which in turn calls
char<?. Char<?, finally, detects the
wrong type and reduces to bottom.
But what happens if a type error is never caught at all? Type errors are feared by programmers all over the world. Once more the answer is: relax. A type error that is not caught in Scheme is not a type error at all. Either an object hits a primitive that does not know how to handle it, or the evaluation terminates normally and delivers a proper normal form.
In fact, there have been quite a few procedures in this book
that do not depend on specific types. For example, the cons
procedure can be used to glue together objects of any type:
(cons 'a #f) => (a . #f)
(cons "foo" 123) => ("foo" . 123)
(cons #\x '(25)) => (#\x 25)
The eq? procedure also accepts arguments of any type,
although it does not always reduce to a valid result. This is not
a type error, though, but a deliberate design decision. If the set
of types accepted by eq? was restricted, eq?
would not work, as the following examples show:
(define (null? x) (eq? x '())) (define (not x) (eq? x #f))
The above definitions of null? and not
fully comply with the Scheme standard (but they are mostly implemented
as primitives). Instead of saying that eq? allows type
errors, it should be considered a procedure that does both
type checking and checking for identity, because
(eq? x y) => #t
implies that x and y have the same type.
There are many other procedures and pseudo functions that accept
any type, like quote and pair?, but one
of these procedures is particularly interesting: equal?.
It is interesting because it makes use of type predicates. A type
predicate is used to check whether a given object has a given type.
Pair?, for
instance, returns truth only if its argument is a pair.
Of course, type predicates have to accept any type of argument
themselves, or they could not work. The following type
predicates exist in Scheme:
boolean? char? number? pair? port? procedure? string? symbol? vector?
Any data object of the Scheme languages satisfies exactly
one of the above predicates. No object may have more
than one type and all types are different. This principle is
called
disjointness of types.
In case you wonder why the
list?
predicate is not in the above list: The list is a special case
of the pair, so it is sufficient to include the pair in the
list of fundamental types. Furthermore, adding a special case
to the list of fundamental types would break disjointness
of types because
(list? '(a)) => #t (pair? '(a)) => #t
The examples given in this section show that dynamic typing provides a high level of flexibility and a high level of safety at the same time. It is particularly suitable for rapid prototyping, because it does not force the programmer to make decisions about types where no such decision is required. Statically typed languages that employ strong typing (like Pascal) often lack flexibility, and languages employing weak typing (like C) typically sacrifice safety for flexibility. Dynamic typing combines the safety of strong static typing with the flexibility of weak typing.
Another thing most programmers are concerned about is efficiency,
and boxed values decrease performance, because they do not fit in
registers. However, optimizing Scheme compilers can decide to
unbox
values in contexts where their type information is
not needed. For example, when compiling the distance-between-points
procedure introduced earlier, a compiler may find out that the type
information of the arguments is not used:
(define (distance-between-points x y x2 y2)
(let ((dx (abs (- (unbox x2) (unbox x))))
(dy (abs (- (unbox y2) (unbox y)))))
(box (sqrt (+ (* dx dx)
(* dy dy))))))
Therefore it is sufficient to unbox the arguments when the procedure is entered and re-box the result before returning it. Calls to the fictitious procedures box and unbox have been added to the code above in order to illustrate this principle. An optimizing compiler would add code for boxing and unboxing values on its own. The values of the variables dx and dy do not have to be boxed at all, because they are local to the scope of the procedure. Hence all intermediate results can be kept in registers inside of distance-between-points.
By unboxing values and limiting type checking to primitive procedures, the overhead is reduced to a degree where there is not much of a difference between optimized Scheme code and optimized code generated by a compiler for a statically typed language. On the other hand, this small overhead buys a high degree of freedom and flexibility.
| Previous Section | - Contents - Index - | Next Section |