This document describes the CLC-INTERCAL configuration system for escape 1.-94.-2.1 or newer. The file format used before was different and was never documented: it is now considered obsolete. The format described here has an extension mechanism which may be sufficient to avoid any incompatible changes in the future, and if we think of something so different it doesn't fit, perhaps we'll have a second file format to use alongside this one. Meanwhile, the format described here is considered stable.
Starting with CLC-INTERCAL 1.-94.-2.1, CLC-INTERCAL's configuration has been split into several files, with each extension optionally installing a file containing its defaults, and both the system administrator and each user able to add their own changes to these defaults.
The package-installed files are found in the package installation "Include" directory, which is the same directory as any pre-compiled objects. The file name for all these files is NAME.sickrc, where NAME is either "system" for the main configuration file, or the name of an extension, for example "INET" for the INTERNET extension. The "system" file is always read first, followed by any installed extension configuration.
After reading the package-installed configuration, the system reads all files in the directory "/etc/sick", processing them in lexicographic order (using the "C" locale). These files can add to package-installed configuration, or they can replace it with completely new items. Processing of these files can be blocked by specifying the --nosystemrc option to sick, intercalc or when running any generated object.
Finally, if a file ".sickrc" exists in the user's home directory, that will be processed and can augment or replace settings in the previously processed files. Processing of the user's ".sickrc" file can be blocked by specifying the --nouserrc option to sick, intercalc or when running any generated object.
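The read order described above can be sketched in a few lines of Python. This is only an illustration, not the code CLC-INTERCAL uses: the function name and its parameters are made up for the example, it handles a single "Include" directory, and the order of extension files among themselves is an assumption (the text only says that "system" comes first).

```python
import os


def config_read_order(include_dir, etc_dir, home_dir,
                      nosystemrc=False, nouserrc=False):
    """Return configuration file paths in the order they would be read."""
    files = []
    names = [n for n in os.listdir(include_dir) if n.endswith(".sickrc")]
    # Package-installed files: "system.sickrc" is always first.
    if "system.sickrc" in names:
        files.append(os.path.join(include_dir, "system.sickrc"))
    # Assumption: extension files follow in byte order; the text does not
    # say how they are ordered among themselves.
    for name in sorted((n for n in names if n != "system.sickrc"),
                       key=os.fsencode):
        files.append(os.path.join(include_dir, name))
    # Administrator files: "C" locale order is plain byte comparison.
    if not nosystemrc and os.path.isdir(etc_dir):
        for name in sorted(os.listdir(etc_dir), key=os.fsencode):
            path = os.path.join(etc_dir, name)
            if os.path.isfile(path):
                files.append(path)
    # The user's own file is processed last.
    user_file = os.path.join(home_dir, ".sickrc")
    if not nouserrc and os.path.isfile(user_file):
        files.append(user_file)
    return files
```

Byte order means that digits sort before upper case letters, which sort before lower case letters, so a file called "10-site" is read before "Zlocal", which is read before "alpha".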
An easy way to list all configuration files the system can find, and to mark the ones it would actually read, is to type:
sick [--nouserrc] [--nosystemrc] [-Idirectory]... --rclist
This will show the files found, one per line, with a "splat" next to the ones it would actually read, taking into account the other options supplied. Note that if two files with the same name are present in different directories, this will list both, but only one of them will be marked by a splat.
Each configuration file has a simple and regular syntax, which can contain comments, setting of scalar option values, deleting all values from an array option or appending values to an array option: the last two can be used to replace array contents.
Comments start like in INTERCAL with "PLEASE NOT" or "DO NOT" followed by any text, and they extend until a valid configuration item starts.
Any configuration item, even comments, can be preceded by a conditional with the generic syntax:
WHEN I IMITATE who
where who is one of "sick", "ick" or "1972": the item will be ignored when the compiler is imitating somebody else.
Setting of a scalar value has the generic syntax:
I CAN VERB OBJECT
where VERB determines which value is being set, and OBJECT determines how it is being set. The VERB must correspond to a valid scalar option, which is defined by the INTERCAL system or by an installed extension. For example, if the INTERNET extension is installed, it uses the VERB "BLURT" to determine which TCP and UDP port it will use; the OBJECT supplied to "BLURT" must therefore be a valid port number, so:
I CAN BLURT 64928
means to use port 64928 for all INTERNET-related communications.
Appending a value to an array has exactly the same generic syntax as setting a scalar value:
I CAN VERB OBJECT
the difference being that VERB must be defined as an array option rather than a scalar. For arrays, a related statement removes all existing elements:
I DON'T VERB
For example, looking again at the INTERNET extension, the list of IPv6 multicast groups is specified using the verb "READ" because the server will READ OUT discovery packets to these groups, and so to replace the array with just "all nodes" on the locally connected networks, plus a global scope group, one could say:
I DON'T READ
I CAN READ ff02::1
I CAN READ ff1e::42
Omitting the first line would append two groups to any existing list.
Some arrays require each element to have an explicit priority expressed as a number after the "CAN": for example the base system contains support for four character sets: to guess the one used by a source program, they are tried in order of increasing priority. The default definition is:
I DON'T WRITE
WHEN I IMITATE sick I CAN #20 WRITE Baudot
I CAN #30 WRITE EBCDIC
I CAN #10 WRITE ASCII
I CAN #40 WRITE Hollerith
which means that ASCII is preferred (lowest value actually means highest priority), followed by Baudot, EBCDIC and finally Hollerith. Obviously the verb "WRITE" is used for this as the compiler will be WRITING IN programs. Additionally, the Baudot set will be skipped when imitating "ick" or "1972".
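As a rough illustration of how priorities and the "WHEN I IMITATE" conditional combine, the following Python sketch orders the default "WRITE" array for a given imitated compiler. The function name and the tuple layout are invented for the example; the treatment of equal priorities (file order is kept) is an assumption, since the text does not cover that case.

```python
def guess_order(entries, imitating):
    """Order array entries by priority for the compiler being imitated.

    entries is a list of (priority, name, condition) tuples, where
    condition is None or the name given after WHEN I IMITATE.
    """
    active = [(priority, name) for priority, name, condition in entries
              if condition is None or condition == imitating]
    # Lowest number first; sorted() is stable, so entries with equal
    # priority keep their file order (an assumption, see above).
    return [name for priority, name in sorted(active, key=lambda e: e[0])]


# The default definition quoted above, in file order.
DEFAULT_WRITE = [
    (20, "Baudot", "sick"),
    (30, "EBCDIC", None),
    (10, "ASCII", None),
    (40, "Hollerith", None),
]

print(guess_order(DEFAULT_WRITE, "sick"))  # ASCII, Baudot, EBCDIC, Hollerith
print(guess_order(DEFAULT_WRITE, "ick"))   # ASCII, EBCDIC, Hollerith
```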
For arrays, it is also possible to remove a single element by using "I CAN'T"; this is most useful in a user's configuration, to remove support for something installed in the system. It is not an error to remove something which wasn't there in the first place. For example, a user who knows for a fact they'll never use punched cards or Baudot terminals could remove support by having the lines:
I CAN'T WRITE Hollerith
I CAN'T WRITE Baudot
in their ".sickrc" file. Note that this does not specify a priority: if "Hollerith" was in the list, it'll be removed, and the priority is not important. In this example, this would be equivalent to having:
I DON'T WRITE
I CAN #30 WRITE EBCDIC
I CAN #10 WRITE ASCII
However, if the system had more character sets the first version would leave all the extra ones in, the second would remove them all and keep just ASCII and EBCDIC.
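The difference between the two versions can be made concrete with a small Python model of the three array statements. It is a sketch only: the tuple encoding of the statements is invented here, "Extra" is a hypothetical fifth character set, and replacing the priority when a name is added twice is an assumption the text does not confirm.

```python
def apply_items(array, items):
    """Apply array statements to a {name: priority} mapping.

    Each item is ("DONT",) for I DON'T, ("CAN", priority, name) for
    I CAN, or ("CANT", name) for I CAN'T.
    """
    result = dict(array)
    for item in items:
        if item[0] == "DONT":
            result.clear()                 # remove every element
        elif item[0] == "CAN":
            _, priority, name = item
            result[name] = priority        # append (assumed to replace)
        elif item[0] == "CANT":
            result.pop(item[1], None)      # not an error if absent
        else:
            raise ValueError("unknown statement: %r" % (item,))
    return result


# System configuration with a hypothetical extra character set.
system = {"ASCII": 10, "Baudot": 20, "EBCDIC": 30, "Hollerith": 40,
          "Extra": 50}

first = apply_items(system, [("CANT", "Hollerith"), ("CANT", "Baudot")])
second = apply_items(system, [("DONT",), ("CAN", 30, "EBCDIC"),
                              ("CAN", 10, "ASCII")])
print(sorted(first))   # keeps "Extra"
print(sorted(second))  # only ASCII and EBCDIC remain
```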
The next few sections detail which VERBs are defined by the base system and all extensions currently provided, together with the exact syntax of their OBJECTs.
The base system defines a number of VERBs:
The "UNDERSTAND" VERB controls selection of compilers and options based on the suffix. The general theory is that the OBJECTs contain patterns and actions: the suffix is matched against each pattern in turn, and if the match succeeds, the system performs the corresponding actions. There are three types of actions: adding a compiler or option to the list being constructed, making sure a compiler or option will not be added by later processing, and restarting the matching.
The generic syntax for an OBJECT of "UNDERSTAND" is:
I CAN UNDERSTAND PATTERN [+ PATTERN]...
AS STRING
WITH OPTION [+ OPTION]...
RETRYING STRING
IGNORING OPTION [+ OPTION]...
where the items listed in the first two lines are always present; each of the other three lines can be completely omitted. And of course we split things in lines for clarity but a single long line or splitting at random places will all do perfectly well.
Patterns are simple strings which may contain one or more whirlpool (@) symbols: these act as wildcards, matching anything except the first character of the whole string. Since the string usually starts with a spot (.), this means that the whirlpool matches anything except a spot, and this means that a name containing more than one spot will only try matching against the last section, the spotless suffix. For example, the following will match any suffix ending with "ci" or "ti"
I CAN UNDERSTAND .@ci + .@ti AS "CI or TI suffix"
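One way to picture whirlpool matching is to translate a pattern into a regular expression. The Python sketch below does that under some assumptions which the text leaves open: the suffix passed in is already the last ".xyz" section of the file name, the whole suffix must match, and a whirlpool may match nothing at all. The function name is made up for the example.

```python
import re


def matches(pattern, suffix):
    """Check a suffix against one UNDERSTAND pattern."""
    if not suffix:
        return False
    # A whirlpool matches any run of characters (possibly empty) other
    # than the first character of the suffix, normally the spot.
    wild = "[^%s]*" % re.escape(suffix[0])
    regex = "".join(wild if ch == "@" else re.escape(ch) for ch in pattern)
    return re.fullmatch(regex, suffix) is not None


print(matches(".@ci", ".3ci"))    # True
print(matches(".@ti", ".ti"))     # True, the whirlpool matches nothing
print(matches(".@ci", ".a.ci"))   # False, a whirlpool cannot match a spot
```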
The "AS" and a string must always be provided. The compiler doesn't do anything with them, but can print them out in verbose mode, so it helps if the string describes what is going on.
The simplest action is "RETRYING": this takes a string, which can contain a single whirlpool, and causes the suffix matching to restart, using the string as new suffix; if the string contains a whirlpool, this will be replaced with everything matched by a whirlpool in the pattern. Note that restarting does not clear the list of options being constructed. For example, the following would simply ignore a letter "z" anywhere in the suffix:
I CAN UNDERSTAND .@z@i
AS "NO z"
RETRYING .@i
The pattern matches any suffix containing a "z" and ending with an "i", then restarts the processing with the same suffix but omitting the "z": if presented with .abzcdei it will start the processing again with .abcdei
The "WITH" action, followed by a list of strings, adds each of these strings to the current list of options, unless they have already been added or they are being ignored. For example:
I CAN UNDERSTAND .@7@i
AS "BASE 7"
WITH 7
RETRYING .@i
matches a suffix containing a 7, adds the option "7" (which will load 7.io which in turn sets the base to 7) and then restarts processing after removing the 7 from the suffix.
The last action is "IGNORING" which does not make any changes to the list already constructed, but asks that some options will not be added in later processing. For example, the following forces the compiler to be sick and makes sure that ick is not added later:
I CAN UNDERSTAND .@clc@i
AS "CLC-INTERCAL"
WITH sick
IGNORING ick
RETRYING .@i
Presented with a suffix .clcti it will set the compiler to "sick" then restart processing with suffix .ti - this would normally set the compiler to ick for compatibility with the "thick" compiler, however the "IGNORING" makes sure that this does not happen, so when the new suffix matches the rule:
I CAN UNDERSTAND .@t@i
AS "Threaded INTERCAL"
WITH ick + thick
IGNORING sick
RETRYING .@i
it will add the thick option but won't add ick because it's been IGNOREd. The "IGNORING sick" in this case does not do anything as sick is already present in the list constructed before; however it is there to make sure that only one compiler can be selected by a combination of suffixes.
The "GLUE" VERB controls selection of system libraries in the style of C-INTERCAL: these are INTERCAL source files which are compiled together with the program specified by the user, and which are included automatically when certain conditions are met. Note that the mechanism could in theory be used with any other compiler.
The generic syntax for an OBJECT of "GLUE" is:
I CAN GLUE FILENAME [AND IF OPTIMISED FILENAME]
TO THE END OF THE PROGRAM
WHEN CONDITION [AND CONDITION]...
One or more CONDITIONs specify when this mechanism will be used, and the first (or only) FILENAME is the name of the library to glue to the program. If the optional "AND IF OPTIMISED" and the second FILENAME are provided, the latter will be used when looking for optimised objects; if not specified, and if the user has asked for optimisation, the compiler will try to guess a suitable optimised object by replacing the suffix with ".o.io"; see below for some examples.
The CONDITIONs look like:
COMPILER IS name
BASE IS number
BASE IS @
BASE IS NOT number
PROGRAM USES UNDEFINED LABEL number
PROGRAM USES UNDEFINED LABELS BETWEEN number AND number
If more than one condition is specified, they must all be true or the rule will be ignored; the value "@" for the base means "any base", and any "@" in the FILENAME will be replaced by the actual base.
For example:
I CAN GLUE syslib.i
TO THE END OF THE PROGRAM
WHEN COMPILER IS ick
AND BASE IS 2
AND PROGRAM USES UNDEFINED LABELS BETWEEN 1000 AND 1999
specifies that a program compiled with "ick" in base 2 which makes a reference to a label between 1000 and 1999 but does not define them will include "syslib.i". Since no optimised object has been specified, when the user asks for optimisation the compiler will try to locate a "syslib.o.io" pre-built object, and if found use it; if not found it will include "syslib.i" as in the case when no optimisation is required.
For a base other than 2, the library provided by C-INTERCAL contains the base twice, like syslib3.3i, syslib4.4i etc. To find it, we could say:
I CAN GLUE syslib@.@i
TO THE END OF THE PROGRAM
WHEN COMPILER IS ick
AND BASE IS @
AND BASE IS NOT 2
AND PROGRAM USES UNDEFINED LABELS BETWEEN 1000 AND 1999
However, the optimised syslib.o.io has been built to work in any base (when necessary it looks up the current base); the compiler would not find it, as it would be looking for syslib3.o.io etc. So we say:
I CAN GLUE syslib@.@i AND IF OPTIMISED syslib.o.io
TO THE END OF THE PROGRAM
WHEN COMPILER IS ick
AND BASE IS @
AND BASE IS NOT 2
AND PROGRAM USES UNDEFINED LABELS BETWEEN 1000 AND 1999
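The way a "GLUE" rule is evaluated can be sketched in Python as follows. Everything about the representation (dictionary keys, tuple-encoded conditions, the "exists" callback standing in for a file lookup) is invented for illustration. The label range is assumed to be inclusive, and falling back to the source library when an explicitly named optimised object is missing is also an assumption: the text only states this for the guessed name.

```python
def condition_holds(cond, compiler, base, undefined_labels):
    """Evaluate one CONDITION, encoded as a tuple."""
    kind = cond[0]
    if kind == "COMPILER IS":
        return compiler == cond[1]
    if kind == "BASE IS":
        return cond[1] == "@" or base == cond[1]
    if kind == "BASE IS NOT":
        return base != cond[1]
    if kind == "LABEL":
        return cond[1] in undefined_labels
    if kind == "LABELS BETWEEN":
        return any(cond[1] <= label <= cond[2] for label in undefined_labels)
    raise ValueError("unknown condition: %r" % (cond,))


def glue_file(rule, compiler, base, undefined_labels, optimised, exists):
    """Return the library a rule selects, or None if the rule is ignored."""
    if not all(condition_holds(c, compiler, base, undefined_labels)
               for c in rule["conditions"]):
        return None
    source = rule["file"].replace("@", str(base))
    if not optimised:
        return source
    wanted = rule.get("optimised")
    if wanted is None:
        # Guess: replace the suffix of the source name with ".o.io".
        wanted = source.rsplit(".", 1)[0] + ".o.io"
    else:
        wanted = wanted.replace("@", str(base))
    return wanted if exists(wanted) else source


CONDITIONS = [("COMPILER IS", "ick"), ("BASE IS", "@"), ("BASE IS NOT", 2),
              ("LABELS BETWEEN", 1000, 1999)]
GUESSING = {"file": "syslib@.@i", "conditions": CONDITIONS}
EXPLICIT = {"file": "syslib@.@i", "optimised": "syslib.o.io",
            "conditions": CONDITIONS}
installed = {"syslib.o.io"}.__contains__

print(glue_file(GUESSING, "ick", 3, {1009}, True, installed))  # syslib3.3i
print(glue_file(EXPLICIT, "ick", 3, {1009}, True, installed))  # syslib.o.io
```

The first call guesses syslib3.o.io, does not find it and falls back to the source library; the second finds the object named by "AND IF OPTIMISED".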
The default system.sickrc provided includes the rules to find syslib*.*i and syslib.o.io as described, and in addition has:
I CAN GLUE floatlib.i
TO THE END OF THE PROGRAM
WHEN COMPILER IS ick
AND BASE IS 2
AND PROGRAM USES UNDEFINED LABELS BETWEEN 5000 AND 5999
Together, all this configuration corresponds to the C-INTERCAL algorithm to decide when to include various library files, with two major differences: it is written in an easy-to-understand configuration syntax rather than being obfuscated inside a C program; and the programmer is responsible for obtaining these files, which are at present not provided by CLC-INTERCAL. To assist, the default system.sickrc also says:
I CAN SCAN /usr/share/ick*
I CAN SCAN /usr/local/share/ick*
These are the most likely places these libraries would be installed, however the configuration can be modified in the usual way if the administrator has installed things somewhere else.
Note that it is not necessary to obtain C-INTERCAL's system and floating-point libraries if the programmer accepts the use of the optimiser: in this case, the program will link to a Perl module provided by CLC-INTERCAL rather than build C-INTERCAL's libraries from source.
The INTERNET extension, when installed, defines three new VERBs, "BLURT" to specify the TCP and UDP port, "READ" to specify a list of IPv6 multicast groups to use when looking for other programs, and "THROW" for the default multicast hop limits. The included "INET.sickrc" contains:
I CAN BLURT 64928
I DON'T READ
I CAN READ ff02::1 THROWING 1
I DON'T THROW
I CAN THROW 0 TO 1
I CAN THROW 1 TO 2
I CAN THROW 10
And the rest of this section attempts to explain what that means.
The "BLURT" VERB specifies the UDP and TCP port numbers used for all network communications: this includes TCP connections to a theft server, UDP broadcasts and multicasts for node discovery, and UDP replies to node discovery requests. The OBJECT is a simple number between 1 and 65535, without the initial "#".
The "READ" VERB specifies an array of multicast groups used for node discovery: the program READs out to these groups whenever it runs a CASE, STEAL or SMUGGLE statement which needs to obtain a list of hosts. Additionally, the theft-server will join all these groups and wait for packets on them.
The "READ" array does not have priorities: the program reads out to all the groups listed, then waits for replies. The syntax of a single element is an IPv6 multicast group address in normal presentation format, optionally followed by the gerund "THROWING" and a hop limit: this indicates how far the program will try to throw the query: 0 means that it only stays on the local host and will reach the local theft server; 1 means that the packet will go out on locally-attached networks, but will not cross gateways, even if they have working multicast routing set up; and a number n greater than 1 means that they will attempt to cross gateways but they stop when they encounter the n-th one, or in other words they will cross at most n-1 gateways.
The "THROW" array specifies default hop limits for any multicast groups which do not provide one: this includes any elements of the "READ" array without a "THROWING" gerund, as well as any multicast addresses specified by the program using the CASE, STEAL or SMUGGLE statement, since there is at present no way to indicate a hop limit in an INTERCAL program. The syntax of each element of the "THROW" array is the hop limit as a number followed optionally by "TO" and a scope identifier: when sending a packet to scope S the program will find the first array element with "TO S" and use the corresponding hop limit; if no such element exists, it will locate the first one which does not include the "TO", and if none is found it will use the system default.
Therefore the included "INET.sickrc" means: use UDP and TCP port 64928; when searching for servers, use the "all nodes" multicast group ff02::1, setting the hop limit to 1 (locally-attached networks only); and when sending to a multicast group which does not specify a hop limit, use limit 0 (local host only) for packets with node-local scope, hop limit 1 (local networks only) for packets with link-local scope, and hop limit 10 for any other scopes.
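The lookup just described can be sketched in Python. The names are invented for the example, and it assumes that the scope identifier after "TO" is the 4-bit scope field of the IPv6 multicast address (the low half of its second byte), which agrees with the node-local and link-local values mentioned above but is not stated outright in the text.

```python
import ipaddress


def multicast_scope(group):
    """Return the scope field of an IPv6 multicast group address."""
    address = ipaddress.IPv6Address(group)
    if not address.is_multicast:
        raise ValueError("%s is not a multicast group" % group)
    return address.packed[1] & 0x0F


def hop_limit(throw, group, throwing=None, system_default=None):
    """Pick the hop limit for a group.

    throw is a list of (limit, scope) pairs in configuration order;
    scope is None for elements without "TO". throwing is the value of
    an explicit THROWING gerund, if any.
    """
    if throwing is not None:
        return throwing
    scope = multicast_scope(group)
    # First element with a matching "TO scope".
    for limit, to_scope in throw:
        if to_scope == scope:
            return limit
    # Otherwise the first element without "TO".
    for limit, to_scope in throw:
        if to_scope is None:
            return limit
    return system_default


# The THROW array from the included INET.sickrc.
INET_THROW = [(0, 1), (1, 2), (10, None)]

print(hop_limit(INET_THROW, "ff01::1"))   # 0, node-local scope
print(hop_limit(INET_THROW, "ff02::1"))   # 1, link-local scope
print(hop_limit(INET_THROW, "ff1e::42"))  # 10, any other scope
```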
The calculator, intercalc, adds two scalar VERBs to specify the default operation mode and the default compiler and options to use. The "ICALC.sickrc" provides system defaults:
I CAN OPERATE full
I CAN CALCULATE sick + 2
In general, the operation mode ("OPERATE") can be one of "full", "expr" or "oic" and the default compiler can be any installed compiler followed by any options one would want to pass: the default specifies CLC-INTERCAL and base 2, with no other options. Note that "CALCULATE" takes a list of object names separated by intersections, and is not an array.
The "Save settings" menu item (or backspark-c in the line mode interface) reads the current configuration out to the user's ".sickrc" file; this will contain any values previously found there, modified with the options selected while running the calculator. Options which were not changed in the ".sickrc" file will not be included, but can of course be added with a normal text editor if so desired.
Because the desk calculator is currently the only thing which uses any interfaces, the ICALC extension also adds a number of VERBs related to displaying things. It is possible that this may move to the Base package in the far future when sick also starts using interfaces. Note that not all this configuration applies to all possible interfaces: they will ignore what does not apply to them.
All the VERBs controlling output specify arrays: for each of these VERBs, and for each type of item, the first applicable entry will be used, and if there is no applicable entry the default value will be used. Note that there are no explicit priorities, so the order these things appear in the configuration is significant.
The "DRAW" VERB selects a font and size to use for drawing selected elements; obviously this is ignored by interfaces like Curses which rely on a terminal to display characters and have no control on what font this will use. The general format of the OBJECT is:
I CAN DRAW list-of-items IN font-name
optionally followed by a size:
AT font-size
optionally followed by a list of interfaces this configuration applies to:
WHEN USING list-of-interfaces
The list-of-items and list-of-interfaces are explained below; the font-name is any font name recognised by the system and font-size is a number.
A list-of-items is one or more of the following items, separated by intersections:
A list-of-interfaces is just a list of user interface modules which this configuration will apply to. If omitted, it defaults to "all the interfaces" which at the time of writing means "WHEN USING Curses + X". Note that the interface names have to be written exactly or they won't be recognised.
The "FRAME" VERB applies to interfaces which produce characters, as opposed to pixels; currently this means the Curses interface. The OBJECT is the word "WITH" followed by either "ASCII" or "LINE DRAWING", and then the optional list of interfaces it applies to. For example:
I CAN FRAME WITH LINE DRAWING
this controls whether Unicode line drawing characters will be used, which
requires that they are supported by the terminal; if the terminal does not
support them, they will be displayed as blanks, but the following will
revert to the old behaviour (before 1.-94.-2.1) which used normal ASCII
characters to approximate things:
I CAN FRAME WITH ASCII
The list of interfaces could be useful for example if a future interface
provides a functionality similar to Curses but more powerful, let's
say we'll be calling that new interface Excursion:
I CAN FRAME WITH ASCII WHEN USING Curses
I CAN FRAME WITH LINE DRAWING WHEN USING Excursion
The "POINT" VERB applies to interfaces in which the use of the mouse is optional: this currently means the Curses interface only. The OBJECT is either "WITH THE MOUSE" or "WITHOUT THE MOUSE", and determines whether the interface will grab the mouse and translate mouse click events to button presses, or will leave it to its default function in the environment it runs on. If the mouse is not supported, this option will have no effect.
The "PAINT" VERB specifies a colour, and optionally a background, for some items. The general form is:
I CAN PAINT list-of-items IN colour-name
optionally followed by a background colour:
ON colour-name
optionally followed by a list of interfaces this configuration applies to:
WHEN USING list-of-interfaces
The list-of-items and list-of-interfaces are as above;
the colour-name is currently one of "black", "white", "red",
"green", "blue", "yellow", "magenta" or "cyan". Future versions will
support more colours, depending on the interface's capabilities.
The remaining VERBs all have the same format:
I CAN verb list-of-items
optionally followed by a list of interfaces this configuration applies to:
WHEN USING list-of-interfaces
The list-of-items and list-of-interfaces are as above;
the verb is one of:
It is often necessary to specify several VERBs for each type of item, for example to make the current item display in 36 point Courier bold italic and green on a red background (we make no promise that the result is legible), and enabled keys the same but red on a black background:
I CAN PAINT CURRENT ITEM IN GREEN ON RED
I CAN PAINT ENABLED KEYS IN RED ON BLACK
I CAN DRAW CURRENT ITEM + ENABLED KEYS IN Courier AT 36
I CAN EMBOLDEN CURRENT ITEM + ENABLED KEYS
I CAN ITALICISE CURRENT ITEM + ENABLED KEYS
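Since the output VERBs are arrays in which the first applicable entry wins, the lookup for one item on one interface can be pictured with the Python sketch below. The tuple layout and function name are invented, the Curses-only entry at the top of the list is a hypothetical addition to show why order matters, and the real default values are not given in the text, so None stands in for them.

```python
def first_applicable(entries, item, interface, default=None):
    """Return the value of the first entry covering item and interface.

    Each entry is (items, interfaces, value); interfaces is None when
    there is no WHEN USING clause, meaning all interfaces.
    """
    for items, interfaces, value in entries:
        if item in items and (interfaces is None or interface in interfaces):
            return value
    return default


# PAINT entries in configuration order: (foreground, background).
PAINT = [
    # Hypothetical entry, restricted to Curses, placed first.
    (["ENABLED KEYS"], ["Curses"], ("yellow", None)),
    # The two PAINT lines from the example above.
    (["CURRENT ITEM"], None, ("GREEN", "RED")),
    (["ENABLED KEYS"], None, ("RED", "BLACK")),
]

print(first_applicable(PAINT, "ENABLED KEYS", "Curses"))  # ('yellow', None)
print(first_applicable(PAINT, "ENABLED KEYS", "X"))       # ('RED', 'BLACK')
print(first_applicable(PAINT, "CURRENT ITEM", "X"))       # ('GREEN', 'RED')
```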
The base system defines the verb "SPEAK" to determine which user interface will be preferred by the calculator (and, when it gets a proper development environment, by the compiler itself). The array requires priorities, and the OBJECTs are just names of an interface module. The base system just defines a single interface, "None", which corresponds to the batch mode of intercalc and sick, and has priority 65535:
I DON'T SPEAK
I CAN #65535 SPEAK None
Each user interface installed will add to this list indicating what interfaces it supports and a priority; the files "UI-Name.sickrc" all contain a single line adding one interface; the combined effect of installing all three interface modules will be:
I CAN #100 SPEAK X
I CAN #200 SPEAK Curses
I CAN #300 SPEAK Line
So X is the preferred one, if installed and running in a graphical environment, after that Curses, if installed and the terminal supports it, then Line if installed and the terminal supports it, and finally None if nothing else works.
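The selection can be sketched in Python as picking the usable interface with the lowest priority number. The function name is invented, and the set of usable interfaces stands in for whatever checks each interface module performs (graphical environment, terminal capabilities), which the text does not detail.

```python
def choose_interface(speak, usable):
    """Return the usable interface with the lowest priority number.

    speak is a list of (priority, name) pairs; usable is the set of
    interfaces that are installed and work in this environment.
    """
    for priority, name in sorted(speak):
        if name in usable:
            return name
    return None


# Combined effect of the base system plus all three interface modules.
DEFAULT_SPEAK = [(65535, "None"), (100, "X"), (200, "Curses"), (300, "Line")]

print(choose_interface(DEFAULT_SPEAK, {"X", "Curses", "Line", "None"}))  # X
print(choose_interface(DEFAULT_SPEAK, {"Curses", "Line", "None"}))  # Curses
print(choose_interface(DEFAULT_SPEAK, {"None"}))                    # None
```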
A system administrator may decide to install all available interface modules but, because of a sadistic streak, make "None" the preferred one. They install a file "/etc/sick/ui" containing:
I DON'T SPEAK
I CAN #1 SPEAK None
I CAN #5 SPEAK X
I CAN #4 SPEAK Curses
I CAN #2 SPEAK Line
A user who has read this documentation undoes the change by putting the following five lines in their ".sickrc" file:
I DON'T SPEAK
I CAN #9 SPEAK None
I CAN #8 SPEAK X
I CAN #7 SPEAK Curses
I CAN #6 SPEAK Line
An extension defines a module "Language::INTERCAL::NAME::Extend" containing a subroutine "add_rcdef" to extend the syntax of the configuration files. Note that most extensions will have such a module because it can also add opcodes, registers and splats and most extensions will need to do at least some of that.
The "add_rcdef" is called with three arguments: a code reference, the name of the extension (usually Name) and the name of the module processing this without the initial "Language::INTERCAL::" (this will usually be Rcfile). The subroutine is expected to call the code reference once for each VERB it wants to add, passing it seven arguments:
For example, the INET extension adds a scalar VERB to specify the TCP and UDP port, "BLURT", with code like:
$code->('INET', 'BLURT', \&_c_blurt, undef, 0, 0, 'Default INTERNET port');
The "_c_blurt" function just checks if it has been passed a number between 1 and 65535, and throws an exception if not; it returns the number itself, therefore there is no special PRINT code. The comment is used when saving the configuration which will look something like:
PLEASE NOTE: Default INTERNET port
I CAN BLURT 4242
For an example of arrays without priorities, the INET extension also defines another VERB, "READ" with code like:
$code->('READ', \&_c_read, \&_p_read, 1, 0, 'INTERNET multicast groups');
In this case, "_c_read" parses an IPv6 address, checks that it is a valid multicast group, and returns the 128-bit binary representation. Therefore it needs a function to convert it back to a string, which is provided by "_p_read". A saved configuration may look like:
PLEASE NOTE: INTERNET multicast groups
I DON'T READ
I CAN READ ff02::1
I CAN READ ff1e::42
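The relationship between a CHECK function and a PRINT function can be illustrated with a Python analogue of what "_c_read" and "_p_read" are described as doing. This is not the extension's Perl code: the function names are made up, the sketch relies on Python's standard "ipaddress" module, and it ignores the optional "THROWING" part of a "READ" element.

```python
import ipaddress


def check_read(text):
    """Parse an IPv6 multicast group into its 128-bit binary form."""
    address = ipaddress.IPv6Address(text)
    if not address.is_multicast:
        raise ValueError("%s is not a multicast group" % text)
    return address.packed          # 16 bytes


def print_read(value):
    """Convert the binary form back to presentation format."""
    return str(ipaddress.IPv6Address(value))


stored = check_read("ff02:0:0:0:0:0:0:1")
print(len(stored))         # 16
print(print_read(stored))  # ff02::1
```

The stored value is not a string, which is why a separate conversion is needed before the configuration can be saved.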
None of the extensions define a VERB which takes a prioritised array, but the mechanism is similar, just set the 5th argument to 1 instead of 0. The CHECK and PRINT functions do not deal with priorities but just with the OBJECT: the compiler will take care of parsing priorities and using them to sort the list. An example can be seen in the base system's source for Language::INTERCAL::Rcfile.