Certain types like Integers, Booleans, Characters, Reals, Lists, Tuples and Arrays are that frequently used that they have been predefined in CLEAN for reasons of efficiency and/or notational convenience. These types and the syntactic sugar that has been added to create and to inspect (via pattern matching) objects of these popular types are treated in this chapter.
PredefinedType | = | BasicType | // see 4.1 | |||
| | ListType | // see 4.2 | ||||
| | TupleType | // see 4.3 | ||||
| | ArrayType | // see 4.4 | ||||
| | ArrowType | // see 4.6 | ||||
| | PredefType | // see 4.7 |
In Chapter 5 we will explain how new types can be defined.
Basic types are algebraic types (see 5.1) which are predefined for reasons of efficiency and convenience: Int (for 32 or 64 bits integer values), Real (for 64 bit double precision floating point values), Char (for 8 bits ASCII character values) and Bool (for 8 bits Boolean values). For programming convenience special syntax is introduced to denote constant values (data constructors) of these predefined types. Functions to create and manipulate objects of basic types can be found in the CLEAN StdEnv library (as indicated below).
There is also a special notation to denote a string (an unboxed array of characters, see 4.4) as well as to denote a list of characters (see 4.2.1).
BasicType | = | Int | // see StdInt.dcl | |||
| | Real | // see StdReal.dcl | ||||
| | Char | // see StdChar.dcl | ||||
| | Bool | // see StdBool.dcl |
In a graph expression a constant valueof basic type Int, Real, Bool or Char can be created.
BasicValue | = | IntDenotation | ||
| | RealDenotation | |||
| | BoolDenotation | |||
| | CharDenotation |
IntDenotation | = | [Sign]{Digit}+ | // decimal number | |||
| | [Sign]0{OctDigit}+ | // octal number | ||||
| | [Sign]0x{HexDigit}+ | // hexadecimal number | ||||
Sign | = | + | - | ||||
RealDenotation | = | [Sign]{Digit}+.{Digit}+[E[Sign]{Digit}+] | ||||
BoolDenotation | = | True | False | ||||
CharDenotation | = | CharDel AnyChar/CharDel CharDel | ||||
CharsDenotation | = | CharDel {AnyChar/CharDel}+ CharDel |
AnyChar | = | IdChar | ReservedChar | SpecialChar | ||||||||||||||||||||||||||||||||||||||||||||||
ReservedChar | = | ( | | | ) | | | { | | | } | | | [ | | | ] | | | ; | | | , | | | . | ||||||||||||||||||||||||||||||
SpecialChar | = | \n | | | \r | | | \f | | | \b | // newline,return,formf,backspace | |||||||||||||||||||||||||||||||||||||||
| | \t | | | \\ | | | \CharDel | // tab,backslash,character delimiter | ||||||||||||||||||||||||||||||||||||||||||
| | \StringDel | // string delimiter | ||||||||||||||||||||||||||||||||||||||||||||||
| | \{OctDigit}+ | // octal number | ||||||||||||||||||||||||||||||||||||||||||||||
| | \x{HexDigit}+ | // hexadecimal number |
Digit | = | 0 | | | 1 | | | 2 | | | 3 | | | 4 | | | 5 | | | 6 | | | 7 | | | 8 | | | 9 | ||||||||||||||||||||
OctDigit | = | 0 | | | 1 | | | 2 | | | 3 | | | 4 | | | 5 | | | 6 | | | 7 | ||||||||||||||||||||||||
HexDigit | = | 0 | | | 1 | | | 2 | | | 3 | | | 4 | | | 5 | | | 6 | | | 7 | | | 8 | | | 9 | ||||||||||||||||||||
| | A | | | B | | | C | | | D | | | E | | | F | |||||||||||||||||||||||||||||
| | a | | | b | | | c | | | d | | | e | | | f |
CharDel | = | ' | ||
StringDel | = | " |
A constant value of predefined basic type Int, Real, Bool or Char (see 4.1) can be specified as pattern.
The denotation of such a value must obey the syntactic description given in above. |
A listis an algebraic data type predefined just for programming convenience. A list can contain an infinite number of elements. All elements must be of the same type. Lists are very often used in functional languages and therefore the usual syntactic sugar is provided for the creation and manipulation of lists (dot-dot expressions, list comprehensions) while there is also special syntax for a list of characters.
Lists can be lazy (default), and optionally be defined as head strict, spine strict, strict (both head and spine strict), head strict unboxed, and strict unboxed. Lazy, strict and unboxed lists are all objects of different: type. All these different types of lists have different time and space properties ( see 10.1.3). Because these lists are of different type, conversion functions are needed to change e.g. a lazy list to a strict list. Functions defined on one type of a list cannot be applied to another type of list. However, one can define overloaded functions that can be used on any list: lazy, strict as well as on unboxed lists.
ListType | = | [[ListTypeKind] Type [SpineStrictness]] | ||||
ListTypeKind | = | ! | // head strict list | |||
| | # | // head strict, unboxed list | ||||
SpineStrictness | = | ! | // tail (spine) strict list |
All elements of a list must be of the same type. |
Because lists are very convenient and very frequently used data structures, there are several syntactical constructs in CLEAN for creating lists, including dot-dot expressions and ZF-expressions. Since CLEAN is a lazy functional language, the default list in CLEAN is a lazy list. However, in some cases strict lists, spine strict lists and unboxed lists can be more convenient and more efficient.
List | = | ListDenotation | ||
| | DotDotExpression | |||
| | ZF-expression |
All elements of a list must be of the same type. |
ListDenotation | = | [[ListKind] [{LGraphExpr}-list [: GraphExpr]] [SpineStrictness] ] | ||
LGraphExpr | = | GraphExpr | ||
| | CharsDenotation |
CharsDenotation | = | CharDel {AnyChar/CharDel}+ CharDel | ||
CharDel | = | ' |
One way to create a list is by explicit enumeration of the list elements. Lists are constructed by adding one or more elements to an existing list.
A special notation is provided for the frequently used list of characters.
ListKind | = | ! | // head strict list | |||
| | # | // unboxed list | ||||
| | | | // overloaded list | ||||
SpineStrictness | = | ! | // spine strict list |
In CLEAN any data structure can be made (partially) strict or unboxed (see 10.1). This has consequences for the time and space behavior of the data structure.
For instance, lazy lists are very convenient (nothing is evaluated unless it is really needed for the computation, one can deal with infinite data structures), but they can be inefficient as well if actually always all lists elements are evaluated sooner or later. Strict lists are often more efficient, but one has to be certain not to trigger a not used infinite computation. Spine strict lists can be more efficient as well, but one cannot handle infinite lists in this way. Unboxed lists are head strict. The difference with a strict list is that the representation of an unboxed list is more compact: instead of a pointer to the lists element the list element itself is stored in the list. However, unboxed lists have as disadvantage that they can only be used in certain cases: they can only contain elements of basic type, records and tuples. It does not make sense to offer unboxing for arbitrary types: boxing saves space, but not if Cons nodes are copied often: the lists elements are copied as well while otherwise the contents could remain shared using the element pointer instead.
In terms of efficiency it can make quite a difference (e.g. strict lists can sometimes be 6 times faster) which kind of list is actually used. But, it is in general not decidable which kind of list is best to use. This depends on how a list is used in a program. A wrong choice might turn a program from useful to useless (too inefficient), from terminating to non-terminating.
Because lists are so frequently used, special syntax is provided to make it easier for a programmer to change from one type of list to another, just by changing the kind of brackets used. One can define a list of which the headelement is strict but the spine is lazy (indicated by [! ]), a list of which the spine is strict but the head element is lazy (indicated by [ !]) and a completely evaluated list (indicated by [! !]). One can have an unboxed list with a strict head element (indicated by [# ]) and a completely evaluated unboxed list of which in addition the spine is strict as well (indicated by [# !]
Note that all these different lists are of different type and consequently these lists cannot be mixed and unified with each other. With conversion functions offered in the CLEAN libraries it is possible to convert one list type into another. It is also possible to define an overloaded list and overloaded functions that work on any list (see hereafter).
Unboxed data structures can only contain elements of basic type, records and arrays. |
One can create an overloaded list that will fit on any type of list (lazy, strict or unboxed)
Other ways to create lists are via dot-dot expressions and list comprehensions.
DotDotExpression | = | [[ListKind] GraphExpr [,GraphExpr]..[GraphExpr] [SpineStrictness] ] |
With a dot-dot expression the list elements can be enumerated by giving the first element (n1), an optional second element (n2) and an optional last element (e).
The generated list is in general calculated as follows:
from_then_to:: !a !a !a -> .[a] | Enum a
from_then_to n1 n2 e
| n1 <= n2 = _from_by_to n1 (n2-n1) e
= _from_by_down_to n1 (n2-n1) e
where
from_by_to n s e
| n<=e = [n : _from_by_to (n+s) s e]
= []
from_by_down_to n s e
| n>=e = [n : _from_by_down_to (n+s) s e]
= []
The step size is one by default. If no last element is specified an infinite list is generated.
With a dot-dot expression one can also define a lazy list, a strict list, an unboxed list or an overloaded list.
Dot-dot expression can only be used if one imports StdEnum from the standard library. | |
Dot-dot expressions are predefined on objects of type Int, Real and Char, but dot-dots can also be applied to any user defined data structure for which the class enumeration type has been instantiated (see CLEANs Standard Environment). |
ZF-expression | = | [[ListKind] GraphExpr \\ {Qualifier}-list [SpineStrictness]] | ||||
Qualifier | = | Generators {, let { {LocalDef}+ } } {|Guard} | ||||
Generators | = | Generator {& Generator} | ||||
Generator | = | Selector <- ListExpr | // select from a lazy list | |||
| | Selector <|- ListExpr | // select from an overloaded list | ||||
| | Selector <-: ArrayExpr | // select from an array | ||||
Selector | = | BrackPattern | // for brack patterns see 3.2 | |||
ListExpr | = | GraphExpr | ||||
ArrayExpr | = | GraphExpr | ||||
Guard | = | BooleanExpr | ||||
BooleanExpr | = | GraphExpr | ||||
ListKind | = | ! | // head strict list | |||
| | # | // unboxed list | ||||
| | | | // overloaded list | ||||
SpineStrictness | = | ! | // spine strict list |
With a list generator called a ZF-expression one can construct a list composed from elements drawn from other lists or arrays. With a list generatorone can draw elements from lazy list. The symbol '<-' is used to draw elements from a lazy list, the symbol '<|-' can be used to draw elements from any (lazy, strict, unboxed) list. With an array generator (use symbol '<-:') one can draw elements from any array. One can define several generators in a row separated by a comma. The last generator in such a sequence will vary first. One can also define several generators in a row separated by a '&'. All generators in such a sequence will vary at the same time but the drawing of elements will stop as soon of one the generators is exhausted. This construct can be used instead of the zip-functions that are commonly used. Selectors are simple patterns to identify parts of a graph expression. They are explained in Section 3.6. Only those lists produced by a generator that match the specified selector are taken into account. Guards can be used as filter in the usual way.
The scope of the selector variables introduced on the left-hand side of a generator is such that the variables can be used in the guards and other generators that follow. All variables introduced in this way can be used in the expression before the \\ (see the picture below).
[ expression \\ selector <- expression
| guard
, selector <- expression
| guard
]
Local definitions can be introduced in a ZF expression after generator(s) using , let:
Qualifier | = | Generators { , let { {LocalDef}+ } } {|Guard} |
The variables introduced in this way can be used in the guard and local definitions, other qualifiers that follow, and the expression before the \\.
An object of the predefined algebraic type list can be specified as pattern in a function, case expression or list generator.
ListPattern | = | [[ListKind][{LGraphPattern}-list [: GraphPattern]] [SpineStrictness]] | ||||
LGraphPattern | = | GraphPattern | ||||
| | CharsDenotation | |||||
ListKind | = | ! | // head strict list | |||
| | # | // unboxed head strict list | ||||
| | | | // overloaded list | ||||
SpineStrictness | = | ! | // spine strict list |
Notice that only simple list patterns can be specified (one cannot use a dot-dot expression or list comprehension to define a list pattern).
A tuple is an algebraic data type predefined for reasons of programming convenience and efficiency. Tuples have as advantage that they allow bundling a finite number of objects of arbitrary type into a new object without being forced to define a new algebraic type for such a new object. This is in particular handy for functions that return several values.
Tuples can be created that can be used to combine different (sub-)graphs into one data structure without being forced to define a new type for this combination. The elements of a tuple need not be of the same type. Tuples are in particular handy for functions that return multiple results.
Tuple | = | (GraphExpr,{GraphExpr}-list) |
One can turn a lazy tuple element into a strict one by putting strictness annotations in the corresponding type instance on the tuple elements that one would like to make strict. When the corresponding tuple is put into a strict context the tuple and the strict annotated tuple elements will be evaluated. As usual one has to take care that these elements do not represent an infinite computation.
Strict and lazy tuples are regarded to be of different type. However, unlike is the case with any other data structure, the compiler will automatically convert strict tuples into lazy ones, and the other way around. This is done for programming convenience. Due to the complexity of this automatic transformation, the conversion is done for tuples only! For the programmer it means that he can freely mix tuples with strict and lazy tuple elements. The type system will not complain when a strict tuple is offered while a lazy tuple is required. The compiler will automatically insert code to convert non-strict tuple elements into a strict version and backwards whenever this is needed.
An object of the predefined algebraic type tuple can be specified as pattern in a function, case expression or list generator.
TuplePattern | = | (GraphPattern,{GraphPattern}-list) |
An array is an algebraic data type predefined for reasons of efficiency. Arrays contain a finite number of elements that all have to be of the same type. An array has as property that its elements can be accessed via indexing in constant time. An array index must be an integer value between 0 and the number of elements of the array-1. Destructive update of array elements is possible thanks to uniqueness typing. For programming convenience special syntax is provided for the creation, selection and updating of array elements (array comprehensions) while there is also special syntax for strings (i.e. unboxed arrays of characters). Arrays have as disadvantage that their use increases the possibility of a runtime error (indices that might get out-of-range).
To obtain optimal efficiency in time and space, arrays are implemented different depending on the concrete type of the array elements. By default an array is implemented as a lazy array (type {a}), i.e. an array consists of a contiguous block of memory containing pointers to the array elements. The same representation is chosen if strict arrays (define its type as {!a}) are being used. The difference is that all elements of a strict array will be evaluated if the array is evaluated. It makes no sense to make a distinction between head strictness and tail strictness for arrays as is the case for lists. As usual one has to take care that the strict elements do not represent an infinite computation.
For elements of basic type, record type and tuple type an unboxed array (define its type as {#a}) can be used. In that latter case the pointers are replaced by the array elements themselves. Lazy, strict and unboxed arrays are considered to be objects of different type. However, most predefined operations on arrays are overloaded such that they can be used on lazy, on strict as well as on unboxed arrays.
ArrayType | = | {[ArrayKind] Type} | ||||
ArrayKind | = | ! | // strict array | |||
| | # | // unboxed array |
An array is a tuple/record-like data structure in which all elements are of the same type. Instead of selection by position or field name the elements of an array can be selected very efficiently in constant time by indexing. The update of arrays is done destructively in CLEAN and therefore arrays have to be unique (see Chapter 9) if one wants to use this feature. Arrays are very useful if time and space consumption is becoming very critical (CLEAN arrays are implemented very efficiently). If efficiency is not a big issue we recommend not to use arrays but to use lists instead: lists induce a much better programming style. Lists are more flexible and less error prone: array elements can only be accessed via indices and if you make a calculation error, indices may point outside the array bounds. This is detected, but only at run-time. In CLEAN, array indices always start with 0. More dimensional arrays (e.g. a matrix) can be defined as an array of arrays.
For efficiency reasons, arrays are available of several types: there are lazy arrays (type {a}), strict arrays (type {!a}) and unboxed arrays (e.g. type{#Int}). All these arrays are considered to be of different type. By using the overloading mechanism (type constructor classes) one can still define (overloaded) functions that work on any of these arrays.
Array | = | ArrayDenotation | ||
| | ArrayUpdate | |||
| | ArrayComprehension |
All elements of an array need to be of same type. |
A new array can be created in a number of ways. A direct way is to simply list the array elements.
ArrayDenotation | = | {[ArrayKind] {GraphExpr}-list} | ||
| | StringDenotation |
StringDenotation | = | StringDel{AnyChar/StringDel}StringDel | ||
StringDel | = | " |
By default this array denotation is overloaded. The type determines whether a lazy, strict or unboxed array is created. The created array is unique (the * or . attribute in front of the type, see Chapter 9) to make destructive updates possible.
A lazy array is a box with pointers pointing to the array elements. One can also create a strict array by adding a ! after { (or explicitly define its type as {!Int}), which will have the property that the elements to which the array box points will always be evaluated. One can furthermore create an unboxed array by adding a # (or explicitly define its type as {#Int}), which will have the property that the evaluated elements (which have to be of basic value) are stored directly in the array box itself. Clearly the last one is the most efficient representation (see also Chapter 10).
To make it possible to use operators such as array selection on any of these arrays (of actually different type) a multi parameter type constructor class has been defined (in StdArray) which expresses that "some kind of array structure is created". The compiler will therefore deduce the following general type:
Array :: .(a Int)| Array a Int
Array = {1,3,5,7,9}
A string is predefined type which equivalent to an unboxed array of characters {#Char}. Notice that this array is not unique, such that a destructive update of a string is not allowed. There is special syntax to denote strings.
There are a number of handy functions for the creation and manipulation of arrays predefined in CLEAN 's Standard Environment. These functions are overloaded to be able to deal with any type of array. The class restrictions for these functions express that "an array structure is required" containing "an array element".
It is also possible to construct a new array out of an existing one (a functional array update).
ArrayUpdate | = | { ArrayExpr & {ArrayIndex {Selection} = GraphExpr}-list [\\ {Qualifier}-list]} | ||
ArrayComprehension | = | {[ArrayKind] GraphExpr \\ {Qualifier}-list} | ||
Selection | = | .FieldName | ||
| | .ArrayIndex | |||
ArrayExpr | = | GraphExpr |
Left from the & (a & [i] = v is pronounced as: array a with for a.[i] the value v. The old array has to be specified which has to be of unique type to make destructive updating possible. On the right from the & those array elements are listed in which the new array differs from the old one. One can change any element of the array or any field or array element of a record or array stored in the array. The &-operator is strict in its arguments.
An array expression must be of type array. | |
The array expression to the left of the update operator '&' should yield an object of type unique array. | |
An array index must be an integer value between 0 and the number of elements of the array-1. An index out of this range will result in a run-time error. | |
Unboxed arrays can only be of basic type, record type or array type. | |
A unique array of any type created by an overloaded function cannot be converted to a non-unique array. |
Important: For reasons of efficiency we have defined the updates only on arrays which are of unique type (*{..}), such that the update can always be done destructively (!) which is semantically sound because the original unique array is known not to be used anymore.
One can use an array comprehension to list these elements compactly in the same spirit as with a list comprehension (see 4.2.1).
Array comprehensions can be used in combination with the update operator. Used in combination with the update operator the original uniquely typed array is updated destructively. The combination of array comprehensions and update operator makes it possible to selectively update array elements on a high level of abstraction.
Array comprehensions used without update operator automatically generate a whole new array. The size of this new array will be equal to the size of the first array or list generator from which elements are drawn. Drawn elements that are rejected by a corresponding guard result in an undefined array element on the corresponding position.
variable = {variable & updates} after # or #! can be abbreviated to variable & updates, by omitting = {variable and }.
| | Variable & {ArrayIndex {Selection} = GraphExpr}-list [\\ {Qualifier}-list] ; |
For example
# a & [i] = x
Multiple indices and fields are also possible, for example: (for record updates see 5.2.1 below)
# r & a.[i].x = y
ArraySelection | = | ArrayExpr.ArrayIndex {Selection} | ||
| | [ArrayExpr!ArrayIndex {Selection} | |||
ArrayIndex | = | [{IntegerExpr}-list] | ||
IntegerExpr | = | GraphExpr | ||
Selection | = | .FieldName | ||
| | .ArrayIndex |
With an array selection (using the '.' symbol) one can select an array element. When an object a is of type Array, the ith element can be selected (computed) via a.[i]. Array selection is left-associative: a.[i,j,k] means ((a.[i]).[j]).[k]. A "unique" selection using the '!' symbol returns a tuple containing the demanded array element and the original array. This type of array selection can be very handy for destructively updating of uniquely typed arrays with values that depend on the current contents of the array. Array selection binds more tightly (priority 11) than application (priority 10).
An object of type array can be specified as pattern. Notice that only simple array patterns can be specified on the left-hand side (one cannot use array comprehensions). Only those array elements which contents one would like to use in the right-hand side need to be mentioned in the pattern.
All array elements of an array need to be of same type. | |
An array index must be an integer value between 0 and the number of elements of the array-1. Accessing an array with an index out of this range will result in a run-time error. |
ArrayPattern | = | {{GraphPattern}-list} | ||
| | {{ArrayIndex = Variable}-list} | |||
| | StringDenotation |
It is allowed in the pattern to use an index expression in terms of the other formal arguments (of type Int) passed to the function to make a flexible array access possible.
Predefined types can also be used in curried way. To make this possible a type constructor has been predefined for all predefined types of higher order kind (see also 5.1.2). The kind X stands for any so-called first-order type: a type expecting no further arguments ((Int, Bool, [Int], etcetera). All function arguments are of kind X. The kind X -> X stands for a type that can be applied to a (first-order) type, which then yields another first-order type, X -> X -> X expects two type arguments, and so on.
Int, Bool, [Int], Tree [Int] :: X
[], Tree, (,) Int, (->) a, {} :: X -> X
(,), (->) :: X -> X -> X
(,,) :: X -> X -> X -> X
PredefinedTypeConstructor | = | [] | // list type constructor | |||
| | [! ] | // head strict list type constructor | ||||
| | [ !] | // tail strict list type constructor | ||||
| | [!!] | // strict list type constructor | ||||
| | [#] | // unboxed head strict list type | ||||
| | [#!] | // unboxed strict list type | ||||
| | ({,}+) | // tuple type constructor (arity >= 2) | ||||
| | {} | // lazy array type constructor | ||||
| | {!} | // strict array type constructor | ||||
| | {#} | // unboxed array type constructor | ||||
| | (->) | // arrow type constructor |
The arrow type is a predefined type constructor used to indicate function objects (these functions have at least arity one). One can use the Cartesian product (uncurried version) to denote the function type (see 3.7) to obtain a compact notation. Curried functions applications and types are automatically converted to their uncurried equivalent versions.
ArrowType | = | (Type -> Type) |
Abstract data types are types of which the actual definition is hidden (see 5.4). In CLEAN the types World and File are predefined abstract data types. They are recognised by the compiler and treated specially, either for efficiency or because they play a special role in the language. Since the actual definition is hidden it is not possible to denotate constant values of these predefined abstract types. There are functions predefined in the CLEAN library for the creation and manipulation of these predefined abstract data types. Some functions work (only) on unique objects.
An object of type *World (* indicates that the world is unique) is automatically created when a program is started. This object is optionally given as argument to the Start function (see 2.3). With this object efficient interfacing with the outside world (which is indeed unique) is possible.
An object of type File or *File (unique File) can be created by means of the functions defined in StdFileIO (see CLEANs Standard Library). It makes direct manipulation of persistent data possible. The type File is predefined for reasons of efficiency: CLEAN Files are directly coupled to concrete files.
The type String is a predefined synonym type that is predefined for convenience. The type is synonym for an unboxed array of characters {#Char}.
PredefType | = | World | // see StdWorld.dcl | |||
| | File | // see StdFileIO.dcl | ||||
| | String | // synonym for {#Char} | ||||
| | () | // unit type |
The unit type is a predefined type with type constructor () (see 4.7) and constructor ().
UnitPattern | = | () | ||
UnitConstructor | = | () |