| Previous: 2.4.4 Structures | TOC | Index | Back | Next: 2.5.1 Signed and Unsigned Values |
This section describes the most basic elements of each T3X program, the factors which may be used to form expressions.
There are many different kinds of factors: symbols, numeric literals, character literals, string literals, tables, procedure calls, messages, and class constants. A factor may only occur in expressions, and a single factor is the minimum form of an expression. Factors may be prefixed by unary operators and they may be combined using binary or ternary operators. In general, all kinds of factors are interchangeable: wherever one of them may occur, any other is allowed, too. The only exception is the symbol, which has some additional properties. For example, symbols may be subscripted and their addresses may be computed. These operations are limited to symbols. All other operations may be applied to any kind of factor, even if doing so makes little sense, like the multiplication of two strings (which will lead to highly environment-dependent results):
"Hello" * "World"
The evaluation of a symbol depends on its type. Variables and constants evaluate to their values, vectors and objects evaluate to their addresses. Structure names and structure member names are treated the same way as constants.
Class constants are public constants which are defined inside classes. To include a class constant in an expression, it must be prefixed with the name of the defining class and a dot:
T3X.SYSOUT
Like 'ordinary' constants, they evaluate to their values.
Numeric literals are written in decimal, hexadecimal, or binary notation and represent their own values. A percent sign may be used to negate a number:
%123 = -123
The difference between %123 and -123 is that %123 is a factor while -123 is an expression (`minus' applied to a numeric factor). In fact, the percent sign has little meaning in T3X today, since the compiler accepts ordinary minus prefixes in constant expression contexts, too. [In early T3X versions, constant expressions were limited to single factors and therefore, the percent sign was required to define negative constant values. The %-prefix is kept for compatibility reasons.] An optimizing compiler might turn -n into %n, if n is a constant numeric factor.
Hexadecimal notation is selected by prefixing the literal with the string '0x' or '0X' (zero, X). No space is allowed between the prefix and the hexadecimal digits. The number 4095, for example, can be written as '0xfff' or '0xFFF'. The characters 'A' through 'F' (alternatively 'a' through 'f') represent the hexadecimal digits with the values 10 to 15. No distinction is made between upper and lower case characters. The literals
0x1f 0X1f 0x1F 0X1F
all express the decimal value 31. The percent prefix may be combined with hexadecimal factors as well.
Numbers may be expressed in binary notation by prefixing the literal with the string '0b' or '0B' (zero, B). No space is allowed between the prefix and the binary digits. Binary numbers may have a '%' prefix. The number 170, for example, can be written as '0b10101010'.
Note: The literals 0x80000000 and 0b10000000000000000000000000000000 should not be used to express the (decimal) value -2147483648. This value is not defined in T3X.
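The rules for numeric literals can be summarized in a small model. The following is a sketch in Python, not T3X, and not the compiler's actual scanner: the function name t3x_number is hypothetical, and it assumes that the '%' prefix is written before the '0x' or '0b' prefix when both are combined.

```python
def t3x_number(literal: str) -> int:
    """Return the value of a T3X-style numeric literal (illustrative model)."""
    text = literal
    negative = text.startswith("%")   # '%' negates the number that follows
    if negative:
        text = text[1:]
    prefix = text[:2].lower()         # prefixes are case-insensitive
    if prefix == "0x":
        digits, base = text[2:], 16
    elif prefix == "0b":
        digits, base = text[2:], 2
    else:
        digits, base = text, 10
    # Only the digits of the selected base are allowed; this also rejects
    # a space between the prefix and the digits.
    allowed = "0123456789abcdef"[:base]
    if not digits or any(ch not in allowed for ch in digits.lower()):
        raise ValueError(f"malformed numeric literal: {literal!r}")
    value = int(digits, base)
    return -value if negative else value
```

With this model, t3x_number("%123") yields -123, and the four spellings 0x1f, 0X1f, 0x1F, and 0X1F all yield 31.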
Character literals are single characters or escape sequences enclosed by single quote characters:
'a' '0' '\s' ''' '\'' '\\'
A character literal evaluates to the ASCII code of the enclosed character. An escape sequence may be used to include certain unprintable or special characters. The backslash character introduces the sequence. The '\' itself and the following character are removed and replaced with the associated special character. Note that no escape sequence is required to represent an apostrophe: '''. Besides most C-style sequences, the following translations are performed: \e->ESC, \q->", and \s->blank. The latter has been included for readability reasons. Unlike C, T3X accepts uppercase sequences as well: \e and \E both evaluate to ESC. The escape character may be used to escape itself, thereby losing its special meaning. For instance, '\\' evaluates to a single literal backslash. A summary of all escape sequences is listed in the quick reference section.
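The translation of character literals can be modelled as a simple lookup. The following Python sketch is an illustration only: the names ESCAPES and char_code are hypothetical, and the C-style entries \n, \t, and \r are an assumed subset, since the complete list is given in the quick reference section rather than here.

```python
# Escapes named in the text, plus a few C-style ones (assumed subset).
ESCAPES = {
    "e": 27,    # ESC
    "q": 34,    # double quote
    "s": 32,    # blank
    "\\": 92,   # backslash escaping itself
    "'": 39,    # the '\'' spelling of the apostrophe
    "n": 10, "t": 9, "r": 13,   # assumed C-style sequences
}

def char_code(body: str) -> int:
    """Return the ASCII code for the text between the single quotes."""
    if len(body) == 1 and body != "\\":
        return ord(body)            # plain character, including ' itself
    if len(body) == 2 and body[0] == "\\":
        # Upper case sequences are accepted as well: \E is the same as \e.
        key = body[1] if body[1] in ESCAPES else body[1].lower()
        if key in ESCAPES:
            return ESCAPES[key]
    raise ValueError(f"unsupported character literal body: {body!r}")
```

Here the argument is the text between the quotes, so the literal '\s' corresponds to char_code("\\s") in Python notation.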
String literals are sequences of characters delimited by double quotes ("):
"Hello, World!\n"
Each character either represents itself or is part of an escape sequence as described above. Each character is stored in a single byte. Each string literal is terminated with a NUL character, so n+1 characters are required to store a string of the length n.
Since a string is an array of consecutive bytes, the ::-operator may be used to access its individual characters.
At runtime, each string literal evaluates to the address of its first character.
The table is a more general form of a literal vector. A table is a static initialized vector and a generalization of BCPL-style TABLEs. Syntactically, it is a list of table members delimited by square brackets:
[ 7, "MOD", @modulo ]
Each table member occupies exactly one machine word. A string, for example, is represented by a pointer, while the string literal itself is placed outside of the table. Therefore, table members can be accessed using the subscript operator []: if
X = [ 77,88,99 ]
then
X[2]
evaluates to 99. The square bracket notation was chosen for delimiting tables because of the strong connection between vectors and the subscript operator.
Each table member may be any of the following: a constant expression, a string literal, the address of a global variable or procedure, an embedded table, or a parenthesized non-constant expression.
Constant expressions include everything which has a value that may be computed at compile time (like numeric literals). The inclusion of strings has been explained above. Addresses of global variables and procedures are represented by a symbol name prefixed with the address operator @.
What makes tables particularly flexible is the fact that they can be nested:
[ [ 2, 9, 4 ], [ 7, 5, 3 ], [ 6, 1, 8 ] ]
Like strings, embedded tables are stored outside of the surrounding table and included as pointers. If, for example, the above table is assigned to the symbol v, the following conditions hold:
v[0] = [ 2, 9, 4 ]
v[1] = [ 7, 5, 3 ]
v[2] = [ 6, 1, 8 ]
Since the result of applying a subscript operator to a nested table in turn results in a table, the subscript operator may be applied one more time, and consequently,
v[1][1]
would result in 5:
v = [ [2,9,4], [7,5,3], [6,1,8] ]
v[1] = [7,5,3]
v[1][1] = 5
(Remember that the first element of a vector has the index 0.)
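The way nested tables are stored can be pictured with a small memory model. The following Python sketch is an assumption-laden illustration, not the layout of any particular T3X implementation: memory is a flat list in which each entry stands for one machine word, addresses are word indices, and the names store_table and subscript are hypothetical.

```python
memory = []          # one list entry stands for one machine word

def store_table(members):
    """Store a table in `memory` and return the address of its first word."""
    # Nested tables are stored first, outside the surrounding table;
    # the surrounding table then holds only their addresses.
    words = [store_table(m) if isinstance(m, list) else m for m in members]
    address = len(memory)
    memory.extend(words)
    return address

def subscript(address, index):
    """Model of the [] operator: fetch the word at address + index."""
    return memory[address + index]

v = store_table([[2, 9, 4], [7, 5, 3], [6, 1, 8]])
row = subscript(v, 1)          # address of the embedded table [7, 5, 3]
centre = subscript(row, 1)     # corresponds to v[1][1], which is 5
```

The outer table occupies exactly three words, each holding the address of one embedded table, which is why a second subscript can be applied to the result of the first.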
A table which contains at least one non-constant expression is called a dynamic table. Non-constant expressions must be put in parentheses when they are to be included in a table:
v := [ "a * b = ", (a*b) ];
Embedded (non-constant) expressions are computed freshly each time the flow of the program passes the table they are contained in (each time the table is evaluated). Therefore, the values of table members computed by embedded expressions may be different each time the table is evaluated. This is why such a table is called 'dynamic'. The parentheses show the compiler that an expression is non-constant and make it generate additional code to fill in the value of the expression whenever the table is encountered. Therefore, static (constant) expressions should never be parenthesized in tables, because doing so would result in inefficient code. For example,
v := [ "5 * 7 = ", (5*7) ];
works, but computes 5*7 each time the table is evaluated. (Note: Even if an optimizing compiler would fold 5*7 to 35, the value still would have to be stored in the table each time it is passed.) On the other hand, including dynamic expressions in a table without any parentheses will lead to an error:
v := [ "a * b = ", a*b ];
will not work unless both a and b are constant. In sets of subsequent dynamic expressions
[ "sums", (a+b), (a+c), (b+c) ]
a single pair of parentheses enclosing the entire set is sufficient:
[ "sums", (a+b, a+c, b+c) ]
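The difference between constant and parenthesized members can be sketched as follows. This Python model is only an illustration of the behavior described above: the class DynamicTable is hypothetical, callables stand in for parenthesized expressions, and it assumes that the storage of the table is allocated once and refilled on each evaluation.

```python
class DynamicTable:
    """A table whose parenthesized members are recomputed on each evaluation."""

    def __init__(self, *members):
        # Callables stand in for parenthesized (non-constant) expressions.
        self.members = members
        self.storage = [None if callable(m) else m for m in members]

    def evaluate(self):
        # Refill only the dynamic slots; constant slots were set up once.
        for i, member in enumerate(self.members):
            if callable(member):
                self.storage[i] = member()
        return self.storage

env = {"a": 2, "b": 3}
table = DynamicTable("a * b = ", lambda: env["a"] * env["b"])

first = list(table.evaluate())     # copy of the contents on the first pass
env["a"] = 5
second = list(table.evaluate())    # the slot is recomputed on the second pass
```

In this model the constant member is stored once, while the dynamic member costs one computation and one store per evaluation, which is the overhead that makes parenthesizing constant expressions inefficient.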
Tables may be prefixed with the keyword PACKED. Packed tables may only contain byte-size values. Therefore, their members are limited to the range from -128 to 255.
A string may be considered a special form of a packed table. Consequently, each string may be written as a packed table as well. For example,
"T3X"
is equal to
PACKED [ 'T', '3', 'X', 0 ]
(Note the trailing zero in the vector literal.) Both strings and packed tables will be padded with zeroes up to the next word boundary.
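The layout of packed tables and strings can be modelled in a few lines. The following Python sketch rests on assumptions that the text does not fix: a word size of four bytes, no padding when the data already ends on a word boundary, and the hypothetical names packed and string_literal.

```python
WORD_SIZE = 4   # assumed machine word size in bytes; the text does not fix one

def packed(values, word_size=WORD_SIZE):
    """Lay out byte-size members and zero-pad to the next word boundary."""
    data = bytearray()
    for value in values:
        if not -128 <= value <= 255:
            raise ValueError(f"{value} does not fit into a byte")
        data.append(value & 0xFF)      # keep the low eight bits
    data.extend(b"\0" * (-len(data) % word_size))
    return bytes(data)

def string_literal(text, word_size=WORD_SIZE):
    """A string is its characters plus a terminating NUL, stored packed."""
    return packed([ord(ch) for ch in text] + [0], word_size)
```

Under these assumptions, string_literal("T3X") and packed([ord("T"), ord("3"), ord("X"), 0]) produce the same four bytes, while a five-character string such as "Hello" needs six bytes and is padded to eight.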
The maximum number of members per table may be limited, but at least 128 elements per table must be allowed by any T3X implementation. The elements contained in nested tables do not count, but the entire embedded table counts as a single member. The same limit may exist for packed tables and string literals.
Procedure calls are represented by a procedure name followed by a parentheses-enclosed list of zero or more comma-separated arguments:
find(text, "word", 0, TEXT_SIZE);
Each argument may be any valid expression. When a procedure expects zero arguments, the parentheses must still be supplied: P(). A procedure call evaluates to the return value of the called procedure.
In T3X, only procedures may be called. Calls to absolute addresses and computed calls - like in BCPL - are not allowed. There is a mechanism to perform indirect calls, though: the CALL operator.
More detailed information on procedure calls and the procedure call operators can be found in later sections.
A message is used to call a method of a class. It is sent to an instance of its class, also known as an object. The message syntax is equal to a procedure call prefixed with the name of the receiving instance and a dot:
t.write(T3X.SYSOUT, "Hello, World!\n", 14);
Details about messages can be found in the chapter on object oriented programming.