Sketchy LISP |
By Nils M Holm,
2006,2007,2008 Buy a copy at Lulu.com |
An Introduction to Functional Programming in Scheme
| Previous Section | - Contents - Index - | Next Section |
Up to this point only two data types have been introduced: the integer and the boolean. There are more types in Scheme, though. Some of them will be introduced in this section.
The integer is the only numeric type that is required to be supported by all Scheme implementations. Most implementations also provide floating point, rational and complex numbers, but these are not discussed in detail here. While Scheme uses a highly interesting and mathematically correct model for numeric computations involving floating point arithmetics, most of the numeric code in this book deals with integers.
One interesting property of Scheme arithmetics is that computations do not overflow. There are still some popular languages that will silently provide a wrong result for expressions like
1234567890 + 9876543210
For instance, the following C program will print -1773790788 (on a two's-complement 32-bit machine), which is plain wrong:
#include <stdio.h>
int main(void) {
printf("%ld\n", 1234567890 + 9876543210);
return 0;
}
Of course, modern computers have 64-bit integers, but there will always be a number that is too big to compute in a given number of bits, and there will always be a language which computes the wrong result without notice.
Scheme uses so-called
bignum arithmetics.
Bignum
is short for big number
. The
implementation of this kind of arithmetics is independent from a
specific kind of hardware. It can handle all numbers that fit in
memory, so the only sort of overflow that theoretically can occur
while doing bignum arithmetics is the memory overflow, and
even such an event will reliably abort the computation in progress
and not deliver a wrong result.
Some versions of Scheme use bignum arithmetics exclusively,
but more efficient implementations use native arithmetic
operations as long as the result fits in a machine word. When an
overflow occurs, they switch to bignum mode. To the user there
is no visible difference. Bignum integers look exactly the same
as native
integers. As long as the result of an operation
fits in a machine word, Scheme is about as fast as any other
language. When an overflow occurs, it does slow down, but still
produces correct results.
The boolean is used to represent logical truth and falsity. There are only two literals representing theses states, which is why this sub-section is rather short.
The object #t represents
logical truth (or just truth
) and
#f represents logical falsity
(falsity
, the false
value). There is only one false
value, but any object except for #f is considered a
true value. This is why the following expression works fine:
(cond (0 'foo)) => foo
Do not worry about the sub-expression 'foo - it will
be explained real soon now. What is more interesting is the fact that
0 is considered to be true. This might look odd at a first glance, but
it simplifies some code significantly.
For example, the
member procedure
searches a given object in a list. When it finds a matching member, it
returns a sublist with that member at the first position. When it does
not find a match, it returns #f. Because all
non-#f values are considered true, you can write code
like this:
(define (digit-type x)
(cond ((member x '(1 3 5 7 9)) 'odd)
((member x '(0 2 4 6 8)) 'even)
(else 'not-a-number)))
Like most other languages, Scheme provides data types for holding single characters and sequences of characters. Single characters are represented by chars, and sequences of characters are represented by strings.
At this point, it is time to introduce something called external representation. The external representation of an object is what the programmer keys in in order to create an object of a given type. For instance, 123 is typed to create a numeric object with a value of 123. Of course, the computer does not really store the number 123 in this form, but it converts it to some internal representation that is more efficient from the machine's point of view.
Because the internal representation of 123 would be something like
01111010, which is hard to decode for a human being, each Scheme object
maps to exactly one single external representation. Numbers map to
their decimal forms, logical falsity maps to #f, etc.
The Scheme procedures
read and
write
translate between internal and external representation. Read
maps external representation to the internal one:
(read) 12345 => 12345
Because Scheme environments use write to output their
results, though, you cannot see the internal representation. In fact,
the internal representation is never of any interest to Scheme
programmers. You do not have to care about things like byte ordering,
memory layout, etc. Scheme hides these details from you and allows
you to concentrate on real problems instead.
The external representation of a char consists of the prefix #\ and the character to represent, for example:
#\H #\e #\l #\l #\o #\\ #\( #\5 #\+ #\
The last char in the above sample represents the blank character,
but this representation is unfortunate, because the character to
represent is invisible. This is why an alternative external representation
of this character exists:
#\space.
The same is valid for the newline character, which is written
#\newline.
A sequence of characters is called a string. The characters of a string are enclosed in double quotes:
"Hello, World!"
To include a quotation mark in a string, it must be prefixed with a backslash:
"He said \"hello\"."
To include a backslash, it must be prefixed with another backslash.
The external representation of a string requires the backslash before
the characters #\" and #\\. Therefore,
"\"Hi\"" => "\"Hi\""
If you want to output a string without the additional backslashes,
you have to use the pretty-printing
display
procedure:
(display "A\\B") writes A\B
Note that the output of display is not an
external representation. When discussing Scheme, the external
representation is used exclusively to refer to objects. Neither the
sequence A\B nor the sequence "A\B" is a
valid object of the type string.
In fact I was lying when I told you that you only had learned about two data types at the beginning of this section. Actually, it were four, because programs like
(define (fact n)
(cond ((zero? n) 1)
(else (* n (fact (- n 1))))))
are nothing but lists of integers, booleans, other lists, and symbols. So you already have seen quite a few lists and symbols in this book. You even may have understood how lists work by using your intuition. A simple explanation would be that a list is a sequence of objects enclosed by parentheses, like these:
(1 2 3) (#t "hi" (+ 5 7) #\x) (cond (a b) (#t #f)) (define (id x) x) () ; <--- this is the empty list
These examples ilustrate two notable properties: The first one is that a list may contain lists. Lists may be nested to any depth. The second one is that Scheme programs are in fact lists. This is a particularly nice feature, because it allows Scheme programs to examine, decompose, and even write Scheme programs in a simple way, just by using procedures for manipulating lists. On the other hand, this may seem troublesome, because procedure applications look like lists, too:
(+ 5 7)
Is this supposed to be a list of three elements or an application
of the + procedure to the numbers 5 and 7? At this point,
quotation comes into play. Above
expression is not quoted, so it is a procedure application:
(+ 5 7) => 12
To create a list containing the members +, 5, and 7,
the quote pseudo
function is used:
(quote (+ 5 7)) => (+ 5 7)
Quote is a very important and frequently
used construct and albeit it appears to do nothing more than return
its argument without any further evaluation, you should read about
its implications carefully. Understanding quote
is absolutely essential in order to understand Scheme.
Another type that frequently has to be quoted is the symbol. While symbols are common in most programming languages, quoted symbols are only present in a minority of them. As long as a symbol occurs without quotation in a Scheme program, it has exactly the same meaning as in most other languages: either it is a variable representing a value or it is a keyword.
Scheme has keywords, too. You already have seen a few in this book,
namely define, cond, if,
letrec, and lambda. Most other symbols you have
seen were in fact variables, including, for instance, zero?,
fact, and +.
As you can see, the naming conventions for symbols are rather lax in Scheme. You may include letters, numbers, and even a wide range of special characters, like + - * / : < > ? and others.
Characters you may never use in symbols are these:
( ) [ ] . ' ` , # " ;
You may not use sequences that could be mixed up with numbers, either (like 123, -1, etc).
The define pseudo function introduces a variable,
binds it to a storage location, and
makes that location refer to the normal form of an expression. In
most contexts, it is also correct to say that it binds a variable
to a value
.
Note that variables do not have types in Scheme, so this is different from storing a value in a typed variable. When a value is stored in a typed variable, like in a C program, the variable must have the right type to hold that value. You cannot store a C string in a C int. A Scheme variable, however, is merely a reference to a value. By associating a variable with a value, you make the variable refer to that value.
Once a variable is associated with with a value, each reference to the variable evaluates to that value:
(define foo 5) foo => 5 (+ foo foo) => 10 (+ foo (* foo foo)) => 30
It is an error to refer to a symbol that is not bound to any value:
unbound-symbol => bottom
Unlike most other popular languages, Scheme can refer to symbols themselves. In order to refer to a symbol rather than its value, the symbol is quoted:
foo => 5 (quote foo) => foo
Even unbound symbols can be referred to in this way:
(quote unbound-symbol) => unbound-symbol
To avoid misunderstandings, the term symbol will be used
to denote quoted symbols and unbound symbols in this book. Unquoted
and bound symbols will be referred to as variables or arguments or,
if they are bound to procedures, even as procedures, as in
the procedure f takes two arguments
a and b
.
An alternative notation for quoting expressions is:
'foo => foo
This is because typing (quote something) each
time you want to refer to a symbol rather than its value or create
a list rather than apply a procedure is a bit tedious. The two notations
are perfectly equivalent:
'expression = (quote expression)
(Quoted) symbols do not have any value other than their name. They cannot be composed, decomposed, or manipulated in any other way. All you can do with them is to convert them to strings:
(symbol->string 'foo) => "foo"
or compare them:
(eq? 'foo 'foo) => #t (eq? 'foo 'bar) => #f (define baz 'foo) (eq? 'foo baz) => #t
The
eq?
predicate tests whether its two arguments are
identical. The term identical
is much stronger than the terms equal
or equivalent
. It
states that two appearances of an object are in fact one and the
same. All occurrences of one symbol are identical.
Applying eq? to two appearances of the same symbol always
yields truth. Symbols, the empty list (), and the
truth literals #t and #f are the only objects
whose occurrences are identical.
Although there are many Scheme programs that do not make any use of symbols, they are essential to the language, because Scheme programs themselves are made of lists and symbols. For example, the expression
'(define (fact n)
(cond ((zero? n) 1)
(else (* n (fact (- n 1))))))
reduces to a list that resembles a Scheme program. Note that the
quote character (') at the beginning quotes the
entire expression. The expression consists
of (lists containing) truth literals, integers, and lots of symbols.
You can use list procedures to decompose the program and examine its
components. For example, you can extract the first member of the list,
define, and store it in a variable x. You can
then use the expression
(eq? x 'define)
to find out whether the program defines something. This is how Scheme programs can examine Scheme programs. The necessary procedures for composing and decomposing lists will be introduced in the following section.
| Previous Section | - Contents - Index - | Next Section |