Chapter 10

Strictness, Macros and
Efficiency

10.1

Defining Partially Strict Data Structures and Functions

10.2

Defining Graphs on the Global Level

10.3

Defining Macros

10.4

Efficiency Tips

Programming in a functional language means that one should focus on algorithms and without wor­ry­ing about all kinds of efficiency details. However, when large applications are being written it may hap­pen that this attitude re­sults in a program that is unacceptably inefficient in time and/or space.

In this Chapter we explain several kinds of annotations and directives that can be defined in Clean. These annotations and directives are designed to give the programmer some means to influence the time and space behavior of Clean applications.

Clean is by default a lazy language: applications will only be evaluated when their results are needed for the final outcome of the program. However, lazy evaluation is in general not very efficient. It is much more efficient to compute function arguments in advance (strict evaluation) when it is known that the arguments will be used in the function body. By using strictness annotations in type defini­tions the evaluation order of data structures and functions can be changed from lazy to strict. This is explained in Section 10.1.

One can define constant graphs on the global level also known as Constant Applicative Forms (see Section 10.2). Unlike constant functions, these constant graphs are shared such that they are computed only one. This generally reduces execution time possibly at the cost of some heap space needed to re­member the shared graph constants.

Macro’s (Section 10.3) are special functions that will already be substituted (evaluated) at compile-time. This generally reduces execution time (the work has already been done by the compiler) but it will lead to an increase of object code.

10.1  Annotations to Change Lazy Evaluation into Strict Evaluation

Clean uses by default a lazy evaluation  strategy: a redex is only evaluated when it is needed to com­pute the final result. Some functional languages (e.g. ML, Harper et al.) use an eager (strict) evaluation strat­egy and always evaluate all function arguments in advance.

10.1.1                          Advantages and Disadvantages of Lazy versus Strict Evaluation

Lazy evaluation has the following advantages (+) / disadvantages (-) over eager (strict) eva­lu­a­tion:

•         only those computations which contribute to the final result are computed (for some algorithms this is a clear advantage while it generally gives a greater expressive freedom);

•         one can work with infinite data structures (e.g. [1..]);

•         it is unknown when a lazy expression will be computed (disadvantage for debugging, for con­t­rol­ling evaluation order);

•         strict evaluation is in general much more efficient, in particular for objects of basic types, non-re­cur­sive types and tuples and records which are composed of such types;

-/+      in general a strict expression (e.g. 2 + 3 + 4) takes less space than a lazy one, however, some­ti­mes the other way around (e.g. [1..1000]);

10.1.2  Strict and Lazy Context

Each expression in a function definition is considered to be either strict (appearing in a strict context: it has to be evaluated to strong root normal form) or lazy (appearing in a lazy context : not yet to be evalu­ated to strong root normal form)  The following ru­les spec­ify whether or not a particu­lar expression is lazy or strict:

•         a non-variable pattern is strict;

•         an expression in a guard is strict;

•         the expressions specified in a strict let-before expression are strict;

•         the root expression is strict;

•         the arguments of a function or data constructor in a strict context are strict when these argu­ments are being an­notated as strict in the type definition of that function (manually or automatically) or in the type definition of the data constructor;

•         all the other expressions are lazy.

Evaluation of a function will happen in the following order: patterns, guard, expressions in a strict let before ex­pres­sion, root expression (see also 3.1).

10.1.3  Space Consumption in Strict and Lazy Context

The space occupied by Clean structures depend on the kind of structures one is using, but also de­pends on whether these data struc­tures appear in a strict or in a lazy context. To understand this one has to have some knowledge about the basic implementation of Clean (see Plasmeijer and Van Eekelen, 1993).

Graphs (see Chapter 1) are stored in a piece of memory called the heap. The amount of heap space needed highly depends on the kind of data structures that are in use. Graph structures that are created in a lazy context can occupy more space than graphs created in a strict context. The garbage collector in the run-time system of Clean automatically collects graphs that are not being used. The arguments of functions being evaluated are stored on a stack. There are two stacks: the A-stack, which contains references to graph nodes stored in the heap and the BC-stack which contains ar­guments of basic type and return addresses. Data structures in a lazy context are passed via references on the A-stack. Data structures of the basic types (Int, Real, Char or Bool) in a strict context are stored on the B-stack or in registers. This is also the case for these strict basic types when they are part of a re­cord or tuple in a strict context.

Data structures living on the B-stack are passed unboxed. They consume less space (because they are not part of a node) and can be treated much more effi­ciently. When a function is called in a lazy context its data structures are passed in a graph node (boxed). The amount of space occupied is also depending on the arity of the function.

In the table below the amount of space consumed in the different situations is summarised (for the lazy as well as for the strict context). For the size of the elements one can take the size consumed in a strict context.

Type

Arity

Lazy context (bytes)

Strict context (bytes)

Comment

Int, Bool

-

8

4

 

Int (0≤n≤32), Char

-

-

4

node is shared

Real

-

12

8

 

Small Record

n

4  + S size elements

S size elements

total length≤12

Large Record

n

8  + S size elements

S size elements

 

Tuple

2

12

S size elements

 

 

>2

8  + 4*n

S size elements

 

{a}

n

20 + 4*n

12 + 4*n

 

 !Int

n

20 + 4*n

12 + 4*n

 

 !Bool,!Char

n

20 + 4*ciel(n/4)

12 + 4*ciel(n/4)

 

 !Real

n

20 + 8*n

12 + 8*n

 

 !Tuple, !Record

n

20 + size rec/tup*n

12 + size rec/tup*n

 

Hnf

0

-

4 + size node

node is shared

 

1

8

4 + size node

 

 

2

12

4 + size node

also for [a]

 

>2

8  + 4*n

4 + size node

 

Pointer to node

-

4

4

 

Function

0,1,2

12

-

 

 

>3

4  + 4*n

-

 

 

10.1.4  Time Consumption in Strict and Lazy Context

Strict arguments of functions can sometimes be handled much more efficiently than lazy arguments, in particular when the arguments are of basic type.

 

Example: functions with strict arguments of basic type are more efficient.

 

Ackerman:: !Int !Int -> Int

Ackerman 0 j = j+1

Ackerman i 0 = Ackerman (i-1) 1

Ackerman i j = Ackerman (i-1) (Ackerman i (j-1))

 

The computation of a lazy version of Ackerman 3 7 takes 14.8 sec­onds + 0.1 seconds for garbage collection on an old fashion Ma­cII (5Mb heap). When both arguments are annotated as strict (which in this case will be done automati­cally by the compiler) the computation will only take 1.5 seconds + 0.0 sec­onds garbage collec­tion. The gain is one order of magnitude. Instead of rewriting graphs the calculation is per­formed using stacks and reg­isters where pos­sible. The speed is comparable with a recursive call in highly optimised C or with the speed ob­tai­n­able when the function was pro­grammed directly in as­sem­bly.

10.1.5  Changing Lazy into Strict Evaluation

So, lazy evaluation gives a notational freedom (no worrying about what is computed when) but it might cost space as well as time. In Clean the default lazy evaluation can therefore be turned into eager evaluation by adding strictness annotations to types.

 

Strict

=

!

This can be done in several ways:

•         The Clean compiler has a built-in strictness analyzer based on abstract reduction (Nöcker, 1993) (it can be optionally turned off). The analyzer searches for strict arguments of a function and an­no­tate them internally as strict (see 10.1.1). In this way lazy arguments are automatically turned into strict ones. This optimization does not influence the termination behavior of the program. It ap­pears that the analyzer can find much infor­mation. The anal­ysis itself is quite fast.

•         The strictness analyzer cannot find all strict arguments. Therefore one can also manually an­no­tate a function as being strict in a certain argument or in its result (see 10.1.1).

•         By using strictness annotations, a programmer can define (partially) strict data structures (Nöcker and Smetsers, 1993; see 10.1.3). Whenever such a data structure oc­curs in a strict context (see 10.1.1), its strict components will be evaluated.

•         The order of evaluation of expressions in a function body can also be changed from lazy to strict by using a strict  let-before expression (see 3.5.4).

One has to be careful though. When a programmer manually changes lazy evaluation into strict evaluation, the termination behavior of the pro­gram might change. It is only safe to put strictness annotations in the case that the func­tion or data con­structor is known to be strict in the cor­responding argument which means that the evaluation of that argu­ment in advance does not change the termination behavior of the pro­gram. The compiler is not able to check this.

10.2  Defining Graphs on the Global Level

Constant graphs can also be defined on a global level (for local constant graphs see 3.6).

 

GraphDef

=

Selector =[:] GraphExpr ;

A global graph definition defines a global constant (closed) graph, i.e. a graph which has the same scope as a global function definition (see 2.1). The selector variables that occur in the selectors of a global graph defi­nition have a global scope just as globally defined functions.

Special about global graphs (in contrast with local graphs) is that they are not garbage col­lected dur­ing the evalua­tion of the pro­gram  A global graph can be compared with a CAF (Constant Ap­plicative Form): its value is com­puted at most once and re­m­embered at run-time. A global graph can save execu­tion-time at the cost of permanent space con­sumption.

Syntactically the definition of a graph is distinguished from the definition of a function by the sym­bol which sepa­rates left-hand side from right-hand side: "=:" is used for graphs while "=>" is used for func­tions. However, in general the more common symbol "=" is used for both type of definitions. Gen­erally it is clear from the context what is meant (functions have parameters, selectors are also easy recognisa­ble). However, when a simple con­stant is defined the syntax is ambiguous (it can be a constant func­tion definition as well as a constant graph definition).

To allow the use of the "=" whenever possible, the following rule is followed. Locally constant defini­tions are by default taken to be graph definitions and therefore shared, globally they are by default taken to be function defini­tions (see 3.1) and therefore recomputed. If one wants to obtain a different behavior one has to explicit state the na­ture of the constant definition (has it to be shared or has it to be recomputed) by using "=:" (on the global level, meaning it is a constant graph which is shared) or "=>" (on the local level, meaning it is a constant function and has to be recomputed).

 

Global constant graph versus global constant function definition: biglist1 is a graph which is computed only once, biglist3 and biglist2 is a constant function which is computed every time it is ap­plied.

 

biglist1 =   [1..10000]                // a constant function (if defined globally)

biglist2 =:  [1..10000]                // a graph

biglist3 =>  [1..10000]                // a constant function

A graph saves execution-time at the cost of space con­sumption. A constant function saves space at the cost of execution time. So, use graphs when the com­putation is time-consuming while the space con­sumption is small and constant func­tions in the other case.

10.3  Defining Macros

Macros are functions (rewrite rules) which are applied at compile-time instead of at run-time. Macro’s can be used to define constants, create in-line substitutions, rename functions, do conditional compila­tion etc. With a macro definition one can, for instance, assign a name to a constant such that it can be used as pattern on the left-hand side of a function definition.

At compile-time the right-hand side of the macro definition will be substituted for every application of the macro in the scope of the macro definition. This saves a function call and makes basic blocks larger (see Plasmeijer and Van Eeke­len, 1993) such that better code can be generated. A dis­advantage is that also more code will be gen­er­a­ted. Inline substitution is also one of the regular optimisations performed by the Clean compiler. To avoid code explosion a compiler will generally not substitute big functions. Macros give the pro­grammer a possi­bility to control the substitution process manually to get an op­ti­mal trade-off between the effi­ciency of code and the size of the code.

 

MacroDef

=

[MacroFixityDef]

 

 

DefOfMacro

MacroFixityDef

=

(FunctionName) [Fix][Prec] ;

DefOfMacro

=

Function {Variable} :== FunctionBody ;

 

 

[LocalFunctionAltDefs]

The compile-time substitution process is guaranteed to terminate. To ensure this some restrictions are imposed on Macro’s (compared to common functions). Only variables are allowed as for­mal argu­ment. A macro rule always consists of a single alternative. Furthermore,

8)       Macro definitions are not allowed to be cyclic to ensure that the substitution process terminates.

 

Example of a macro definition.

 

Black    :== 1                                       // Macro definition

White    :== 0                                       // Macro definition

 

:: Color :== Int                                     // Type synonym definition

 

Invert:: Color -> Color                              // Function definition

Invert Black = White

Invert White = Black

 

Example: macro to write (a?b) for lists instead of [a:b] and its use in the function map.

 

(?) infixr 5                                         // Fixity of Macro

(?) h t :== [h:t]                                    // Macro definition of operator

 

map:: (a -> b) [a] -> [b]

map f (x?xs) = f x ? map f xs

map f []     = []

Notice that macros can contain local function definitions. These local definitions (which can be recur­sive) will also be substituted inline. In this way complicated substitutions can be achieved resulting in efficient code.

 

Example: macros can be used to speed up frequently used functions. See for instance the definition of the function foldl in StdList.

 

foldl op r l :== foldl r l                           // Macro definition

where

    foldl r []    = r

    foldl r [a:x] = foldl (op r a) x

 

sum list = foldl (+) 0 list

 

After substitution of the macro foldl a very efficient function sum will be generated by the compiler:

 

sum list = foldl 0 list

where

    foldl r []    = r

    foldl r [a:x] = foldl ((+) r a) x

The expansion of the macros takes place before type checking. Type specifications of macro rules are not possible. When operators are defined as macros, fixity and associativity can be defined.

10.4  Efficiency Tips

Here are some additional suggestions how to make your program more efficient:

•         Use the Clean profiler to find out which frequently called functions are consuming a lot of space and/or time. If you modify your program, these functions are the one to have a good look at.

•         Transform a recursive function to a tail-recursive function.

•         It is better to accumulate results in parameters instead of in the right-hand side results.

•         It is better to use records instead of tuples.

•         Arrays can be more efficient than lists since they allow constant access time on their elements and can be destructive updated.

•         When functions return multiple ad-hoc results in a tuple put these results in a strict tuple instead (can be indicated in the type).

•         Use strict data structures whenever possible.

•         Export the strictness information to other modules (the compiler will warn you if you don’t).

•         Make function strict in its arguments whenever possible.

•         Use macros for simple constant expressions or frequently used functions.

•         Use CAF’s and local graphs to avoid recalculation of expressions.

•         Selections in a lazy context can better be transformed to functions which do a pattern match.

•         Higher order functions are nice but inefficient (the compiler will try to convert higher order func­tion into first order functions).

•         Constructors of high arity are inefficient.

•         Increase the heap space in the case that the garbage collector takes place too often.