AGFL alphabet translation possibilities
=======================================

AGFL lets users define how input text characters are interpreted
before further processing.



default character translation
------- --------- -----------

By default (i.e., when no user-defined translation is available),
the AGFL system:
    - uses full 8-bit character encoding for the lexica;
    - translates capitals in the input to lowercase letters ("A"->"a")
      prior to further processing;
    - and leaves all other characters as-is.
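
The default behaviour described above can be pictured as a 256-entry lookup
table. The following Python sketch is only an illustration, not the AGFL
implementation. It assumes that "capitals" means the ASCII letters 'A' to 'Z'
only; the function names are hypothetical.

```python
def default_translation_table():
    """Return a 256-entry list: index = input value, entry = translated value."""
    table = list(range(256))              # identity: every character maps to itself
    for code in range(ord("A"), ord("Z") + 1):
        table[code] = code + 32           # 'A' (65) -> 'a' (97) ... 'Z' (90) -> 'z' (122)
    return table


def translate(data, table):
    """Apply a translation table to a bytes object."""
    return bytes(table[value] for value in data)


TABLE = default_translation_table()
print(translate(b"Hello, World 42!", TABLE))   # b'hello, world 42!'
```

Only the 26 capital letters are changed; digits, punctuation and all values
above 127 pass through untouched.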



overriding the default
---------- --- -------

To override the default, the user should 
    - provide a file describing the translation required;
    - let the environment variable "AGFL_ALPHABET" point to this file.

An example of an alphabet translation file is given in this directory:
"agfl_alphabet.txt". The translation defined in this example file
is identical to the default behaviour of the system.
The AGFL user can take (a copy of) the example file and modify it
as needed.

The environment variable can be set as follows:
    (windows)	SET AGFL_ALPHABET=c:\agfl_2.3.0\alphabet\agfl_alphabet.txt
    (unix)	export AGFL_ALPHABET=/c/agfl_2.3.0/alphabet/agfl_alphabet.txt

(Instead of c:\agfl_2.3.0\alphabet\agfl_alphabet.txt, any valid drive,
path and file name of the user's choosing can be used.)

It is suggested to include this definition in the file
agfl_2.3.0\bin\setagfl.bat, which already contains a commented-out
command line that defines the environment variable AGFL_ALPHABET
in such a way that it points to the example file in this directory.
Activating the setting in this command file will make sure the
alphabet translation is activated together with the definition of
all other symbols and settings required to use the AGFL system.



returning to the default
--------- -- --- -------

To return to the default, do one of the following:
    - remove or comment out the setting of AGFL_ALPHABET,
      then quit and restart the DOS box;
    - re-define the setting on the command line, either by
      undoing the definition by means of
          SET AGFL_ALPHABET=
      or by making the AGFL_ALPHABET variable point to the "default"
      alphabet translation file:
          SET AGFL_ALPHABET=c:\agfl_2.3.0\alphabet\agfl_alphabet.txt



let the run-time system report on the translation table used
--- --- -------- ------ ------ -- --- ----------- ----- ----

Once a parser is generated, the parser will report the active character
translations when started with the "-A" option.
This option will list all active translations prior to accepting input.



the alphabet translation file layout
--- -------- ----------- ---- ------

The alphabet translation file should contain a line for each requested character translation.

Each line must contain the values of the character to be translated from, and the character to be translated to.
No character values below 0 or above 255 are accepted. NB: values 0 to 127 normally denote the ASCII character set,
where characters '0' up to '9' have values 48 up to 57, 'A' to 'Z' have values 65 to 90, and 'a' to 'z' have values
97 to 122.

Example: if capital-a ('A') should be translated to lowercase-a ('a'),
the file should contain the line

65	97

Non-numeric text following the character values is ignored (which means
that additional textual comments are possible). E.g.:

66	98	translate uppercase-B to lowercase-b
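
As an illustration of this layout, the following Python sketch prints one
commented line per capital letter. It assumes that the uppercase-to-lowercase
translation covers 'A' to 'Z' only, and it is not claimed to reproduce the
shipped "agfl_alphabet.txt" byte for byte. The function name is hypothetical.

```python
def default_alphabet_lines():
    """Build 'from<TAB>to<TAB>comment' lines for 'A'..'Z' -> 'a'..'z'."""
    lines = []
    for code in range(ord("A"), ord("Z") + 1):
        upper, lower = chr(code), chr(code + 32)
        lines.append(
            f"{code}\t{code + 32}\ttranslate uppercase-{upper} to lowercase-{lower}"
        )
    return lines


# Print the lines; redirect the output to a file to use it as a starting point.
print("\n".join(default_alphabet_lines()))
```

The output can be saved to a file and edited by hand, for example to add
pairs for accented characters above 127.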

The run-time system checks the availability of the alphabet translation file
and checks its layout, and will give appropriate error messages.
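
To check a translation file before handing it to a parser, the layout rules
above can be sketched in Python as follows. This is not the AGFL run-time
check, and its error messages are invented. It assumes that the two values
are separated by whitespace, that blank lines are skipped, and that
characters not listed in the file are left unchanged; none of these details
is confirmed by this document.

```python
def parse_alphabet_lines(lines):
    """Return a 256-entry translation table built from 'from to [comment]' lines."""
    table = list(range(256))                  # assumption: unlisted characters stay as-is
    for lineno, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue                          # assumption: blank lines are skipped
        if len(fields) < 2:
            raise ValueError(f"line {lineno}: expected two character values")
        try:
            # Only the first two fields are read; trailing text is a comment.
            src, dst = int(fields[0]), int(fields[1])
        except ValueError:
            raise ValueError(f"line {lineno}: character values must be numeric") from None
        if not (0 <= src <= 255 and 0 <= dst <= 255):
            raise ValueError(f"line {lineno}: character values must be in 0..255")
        table[src] = dst
    return table


table = parse_alphabet_lines(["65\t97", "66\t98\ttranslate uppercase-B to lowercase-b"])
print(table[65], table[66], table[67])        # 97 98 67
```

To check a real file, pass an open file object instead of the list, for
example `parse_alphabet_lines(open(path, encoding="latin-1"))`.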



final remarks
----- -------

   - The character translation mechanism will not influence the way
     characters are handled by Windows.
     Example: re-definition of control-c (character value 3)
     will not prevent Windows from stopping the parser
     when the user presses control-c while the system waits for input.

   - Re-definition of unprintable characters (normally, anything below 32)
     can give unexpected and undesired results.

   - Representation of characters in Windows depends on the Windows language definition
     and the character set used, but even then, strange effects are possible.
     For several language definitions we tested, it turned out that the
     representation of characters with values over 127 depends on the program used.
     Example: some editors show a vowel+accent combination for a given value,
     whereas others show a double left corner character used for line drawing.

     This is unfortunate, but cannot be influenced by us. Finding the proper re-definition
     pairs for your situation might require some trial and error.
