3. The dicod daemon.

The main component of GNU Dico is the dicod daemon. It is responsible for serving client requests and for coordinating the work of dictionary modules.

Dicod operates on a set of databases. Each database contains a set of headwords with corresponding articles. It can therefore be regarded as a dictionary in which articles supply definitions (or translations) for headwords.

Each database has a unique name – a string of characters that serves to identify this particular database in a set of available databases. Two more pieces of textual data are associated with a database. A database information string (or info, for short) supplies a short description of the database. It is a sentence, tersely describing the database, e.g. ‘English-German Dictionary’. A database description provides a full description of the dictionary, with author credits and copyright information. The length of this description is not limited.

Both pieces of information can be requested by the remote user. The command SHOW DB lists all available databases along with their descriptions:

 
SHOW DB
110 3 databases present
jargon "Jargon File (4.3.1, 29 Jun 2001)"
deu-eng "German-English Freedict dictionary"
en-pl-naut "English-Polish dictionary of nautical terms"
.
250 ok

Each line of output lists the name of a database and the corresponding description.
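A client can turn such a reply into a mapping from database names to descriptions. The following Python sketch assumes that the reply has exactly the shape shown above (a ‘110’ status line, one entry per line, and a line holding a single dot). The function name parse_show_db is hypothetical and is not part of GNU Dico.

```python
def parse_show_db(reply):
    """Map database names to descriptions from a SHOW DB reply.

    Assumes the layout shown in the example above: a "110" status line,
    one `name "description"` entry per line, and a terminating "." line.
    """
    databases = {}
    in_list = False
    for line in reply.splitlines():
        line = line.strip()
        if not in_list:
            # Skip everything up to and including the 110 status line.
            in_list = line.startswith("110 ")
            continue
        if line == ".":
            break
        name, _, description = line.partition(" ")
        databases[name] = description.strip().strip('"')
    return databases


sample = '''110 3 databases present
jargon "Jargon File (4.3.1, 29 Jun 2001)"
deu-eng "German-English Freedict dictionary"
en-pl-naut "English-Polish dictionary of nautical terms"
.
250 ok'''

databases = parse_show_db(sample)
```

Here databases maps each of the three names from the sample reply to its description with the surrounding quotes removed.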

The SHOW INFO command displays full information about a database, whose name is given as its argument:

 
SHOW INFO en-pl-naut
112 information for en-pl-naut
An English-Polish dictionary of nautical terms

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with no
Invariant Sections, no Front-Cover and Back-Cover Texts
.
250 ok

A definition for any given headword can be requested using the DEFINE command. It takes two arguments, the name of the database and the headword to look for in that database, e.g.:

 
DEFINE en-pl-naut sprit

If the headword is found in the database, its definition is displayed. Otherwise a diagnostic message is returned, stating that the headword was not found.

There are two operation modes: ‘daemon’ and ‘inetd’.

3.1 Daemon Operation Mode

The ‘daemon’ mode is enabled by the mode daemon statement in the configuration file (see mode statement). It is also the default mode. In daemon mode dicod listens for incoming requests on one or several interfaces. Unless the --foreground option is specified, it disconnects from the controlling terminal and switches to the background (becomes a daemon). When an incoming connection arrives, it forks a subprocess to handle it.

In this mode the following signals cause dicod to terminate: ‘SIGTERM’, ‘SIGQUIT’, and ‘SIGINT’. The ‘SIGHUP’ signal causes the program to restart. Restarting works only if both the program name and its configuration file name (if given using the ‘--config’ option) are absolute file names.

Upon receiving ‘SIGHUP’, dicod first verifies that the configuration file contains no fatal errors. To do that, the program executes a copy of itself with the ‘--lint’ option (see --lint) and analyzes its return value. Dicod restarts itself only if this check passes. This ensures that the daemon will not terminate due to unnoticed errors in its configuration file.

Upon receiving ‘SIGTERM’, ‘SIGQUIT’, or ‘SIGINT’, the program stops accepting incoming requests and sends the ‘SIGTERM’ signal to all active subprocesses. Then it waits a predefined amount of time for all processes to terminate (see shutdown-timeout). Any subprocesses that have not terminated by then are sent the ‘SIGKILL’ signal. Then the database modules are unloaded and dicod terminates.

Several command line options are provided that modify the behavior of dicod in this mode. These options are mainly designed for debugging and error-hunting purposes.

The ‘--foreground’ option instructs the server not to disconnect from the controlling terminal and to remain in the foreground. It is often used with the ‘--stderr’ option, which instructs dicod to send all diagnostics to the standard error output instead of syslog, which is used by default.

3.2 Inetd Operation Mode

In ‘inetd’ operation mode dicod receives requests from standard input and sends its replies to the standard output. This mode is enabled by the mode inetd statement (see mode statement) in the configuration file, or by the ‘--inetd’ command line option (see --inetd). This mode is usually used when invoking dicod from the ‘inetd.conf’ file, as in the example below:

 
dict  stream  tcp  nowait  nobody /usr/local/bin/dicod dicod --inetd

3.3 Configuration

Upon startup, dicod reads its settings and database definitions from the configuration file ‘dicod.conf’. By default it is located in $sysconfdir (i.e., in most cases ‘/usr/local/etc’ or ‘/etc’), but an alternative location may be specified using the ‘--config’ command line option (see --config).

If any errors are encountered in the configuration file, the program reports them on the standard error and exits with a non-zero status.

To test the configuration file without starting the server, use the ‘--lint’ (‘-t’) command line option. It causes dicod to check the configuration file and to exit with status 0 if no errors were detected, and with status 1 otherwise.
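A deployment script can rely on this exit status before restarting the server. The following Python sketch shows one way to do it. The function name config_is_valid and the injectable run parameter are hypothetical, and the ‘--config=FILE’ spelling assumes the usual GNU long-option form.

```python
import subprocess


def config_is_valid(dicod="dicod", config=None, run=subprocess.run):
    """Return True if `dicod --lint` exits with status 0.

    `dicod` is the program to execute and `config` an optional
    configuration file name.  `run` is injectable so that the function
    can be exercised on a machine where dicod is not installed.
    """
    argv = [dicod, "--lint"]
    if config is not None:
        # ASSUMPTION: the file name is attached with "=", GNU style.
        argv.append("--config=" + config)
    return run(argv).returncode == 0
```

A caller would typically invoke config_is_valid(config="/etc/dicod.conf") and proceed with the restart only when it returns True.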

Before parsing, the configuration file is preprocessed using m4 (see section Using Preprocessor to Improve the Configuration). To see the preprocessed configuration without actually parsing it, use the ‘-E’ command line option. To avoid preprocessing it, use the ‘--no-preprocessor’ option.

The rest of this section describes the configuration file syntax in detail. You can receive a concise summary of all configuration directives any time by running dicod --config-help.

3.3.1 Configuration File Syntax

A dicod configuration consists of statements and comments.

There are three classes of lexical tokens: keywords, values, and separators. Blanks, tabs, newlines and comments, collectively called white space, are ignored except as they serve to separate tokens. Some white space is required to separate otherwise adjacent keywords and values.

3.3.1.1 Comments

Comments may appear anywhere white space may appear in the configuration file. There are two kinds of comments: single-line and multi-line comments. Single-line comments start with ‘#’ or ‘//’ and continue to the end of the line:

 
# This is a comment
// This too is a comment

Multi-line or C-style comments start with the two characters ‘/*’ (slash, star) and continue until the first occurrence of ‘*/’ (star, slash).

Multi-line comments cannot be nested.

3.3.1.2 Pragmatic Comments

Pragmatic comments are similar to usual comments, except that they cause some changes in the way the configuration is parsed. Pragmatic comments begin with a ‘#’ sign and end with the next physical newline character. As of GNU Dico version 2.1, the following pragmatic comments are understood:

#include <file>
#include file

Include the contents of the file file. If file is an absolute file name, both forms are equivalent. Otherwise, the form with angle brackets searches for the file in the include search path, while the second one looks for it in the current working directory first, and, if not found there, in the include search path.

The default include search path is:

  1. prefix/share/dico/2.1/include
  2. prefix/share/dico/include

Where prefix is the installation prefix.

New directories can be added in front of it using the ‘-I’ (‘--include-dir’) command line option (see --include-dir).

#include_once <file>
#include_once file

Same as #include, except that, if the file has already been included, it will not be included again.

#line num
#line num "file"

This line causes dicod to believe, for purposes of error diagnostics, that the line number of the next source line is given by num and the current input file is named by file. If the latter is absent, the remembered file name does not change.

# num "file"

This is a special form of #line statement, understood for compatibility with the C preprocessor.

In fact, these statements provide rudimentary preprocessing features. For more sophisticated ways to modify the configuration before parsing, see Using Preprocessor to Improve the Configuration.

3.3.1.3 Statements

A simple statement consists of a keyword and a value separated by any amount of whitespace. A simple statement is terminated with a semicolon (‘;’), unless it contains a here-document (see below), in which case the semicolon is optional.

Examples of simple statements:

 
timing yes;
access-log-file /var/log/access_log;

A keyword begins with a letter and may contain letters, decimal digits, underscores (‘_’) and dashes (‘-’). Examples of keywords are: ‘group’, ‘identity-check’.

A value can be one of the following:

number

A number is a sequence of decimal digits.

boolean

A boolean value is one of the following: ‘yes’, ‘true’, ‘t’ or ‘1’, meaning true, and ‘no’, ‘false’, ‘nil’, ‘0’ meaning false.

unquoted string

An unquoted string may contain letters, digits, and any of the following characters: ‘_’, ‘-’, ‘.’, ‘/’, ‘:’.

quoted string

A quoted string is any sequence of characters enclosed in double-quotes (‘"’). A backslash appearing within a quoted string introduces an escape sequence, which is replaced with a single character according to the following rules:

Sequence   Replaced with
\a         Audible bell character (ASCII 7)
\b         Backspace character (ASCII 8)
\f         Form-feed character (ASCII 12)
\n         Newline character (ASCII 10)
\r         Carriage return character (ASCII 13)
\t         Horizontal tabulation character (ASCII 9)
\\         A single backslash (‘\’)
\"         A double-quote.

Table 3.1: Backslash escapes

In addition, the sequence ‘\newline’ is removed from the string. This makes it possible to split long strings over several physical lines, e.g.:

 
"a long string may be\
 split over several lines"

If the character following a backslash is not one of those specified above, the backslash is ignored and a warning is issued.

Two or more adjacent quoted strings are concatenated, which gives another way to split long strings over several lines to improve readability. The following fragment produces the same result as the example above:

 
"a long string may be"
" split over several lines"

here-document

A here-document is a special construct for introducing strings of text that contain embedded newlines.

The <<word construct instructs the parser to read all the following lines up to the line containing only word, with possible trailing blanks. Any lines thus read are concatenated together into a single string. For example:

 
<<EOT
A multiline
string
EOT

The body of a here-document is interpreted the same way as a double-quoted string, unless word is preceded by a backslash (e.g. ‘<<\EOT’) or enclosed in double-quotes, in which case the text is read as is, without interpretation of escape sequences.

If word is prefixed with - (a dash), then all leading tab characters are stripped from the input lines and from the line containing word. Furthermore, if - is followed by a single space, all leading whitespace is stripped from them. This makes it possible to indent here-documents in a natural fashion. For example:

 
<<- TEXT
    All leading whitespace will be
    ignored when reading these lines.
TEXT

It is important that the terminating delimiter be the only token on its line. The only exception to this rule is allowed if a here-document appears as the last element of a statement. In this case a semicolon can be placed on the same line with its terminating delimiter, as in:

 
help-text <<-EOT
        A sample help text.
EOT;

list

A list is a comma-separated list of values. Lists are delimited by parentheses. The following example shows a statement whose value is a list of strings:

 
capability (mime,auth);

In any case where a list is appropriate, a single value is allowed without being a member of a list: it is equivalent to a list with a single member. This means that, e.g. ‘capability mime;’ is equivalent to ‘capability (mime);’.
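The treatment of backslashes in quoted strings (Table 3.1) can be illustrated with a short Python sketch. It is not dicod's actual parser. The function name unescape is hypothetical, and keeping a lone trailing backslash unchanged is an assumption, since the text above does not cover that case.

```python
import warnings

# Escape sequences from Table 3.1, keyed by the character after the backslash.
ESCAPES = {
    "a": "\a", "b": "\b", "f": "\f", "n": "\n",
    "r": "\r", "t": "\t", "\\": "\\", '"': '"',
}


def unescape(body):
    """Process the text between the double quotes of a quoted string."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\" or i + 1 == len(body):
            # Ordinary character (ASSUMPTION: a trailing backslash is kept).
            out.append(ch)
            i += 1
            continue
        following = body[i + 1]
        if following == "\n":
            pass  # backslash-newline is removed from the string
        elif following in ESCAPES:
            out.append(ESCAPES[following])
        else:
            # Unknown escape: the backslash is ignored and a warning issued.
            warnings.warn("unknown escape sequence \\" + following)
            out.append(following)
        i += 2
    return "".join(out)
```

Applied to the two-line example given earlier, unescape joins both physical lines into the single string ‘a long string may be split over several lines’.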

A block statement introduces a logical group of other statements. It consists of a keyword, followed by an optional value, and a sequence of statements enclosed in curly braces, as shown in the example below:

 
load-module outline {
        command "outline";
}

The closing curly brace may be followed by a semicolon, although this is not required.

3.3.2 Server Settings

Server settings control how dicod is executed on the server machine.

Configuration: user string

Run with the privileges of this user. Dicod does not require root privileges, so it is recommended to always use this statement when running dicod in daemon mode. See section Daemon Operation Mode.

Example:

 
user nobody;

Configuration: group list

If user is given, dicod will drop all supplementary groups and switch to the principal group of that user. Sometimes, however, it may be necessary to retain one or more supplementary groups. For example, this might be necessary to access dictionary databases. The group statement retains the supplementary groups listed in list, e.g.:

 
user nobody;
group (man, dict);

This statement is ignored if the user statement is not present or if dicod is running in inetd mode. See section Inetd Operation Mode.

Configuration: mode enum

Sets server operation mode. The argument is one of:

daemon

Run in daemon mode. See section Daemon Operation Mode, for a detailed description.

inetd

Run in inetd mode. See section Inetd Operation Mode, for a detailed description.

This statement is overridden by the ‘--inetd’ command line option. See --inetd.

Configuration: listen list

Specify IP addresses and ports to listen on in daemon mode. By default, dicod will listen on port 2628 on all existing interfaces. Use the listen statement to restrict the list of interfaces to listen on, or to change the port number.

Elements of list can have the following forms:

host:port

Specifies an IPv4 socket to listen on. The host part is either a host name or an IP address in “dotted-quad” form. The port part is either a numeric port number or a symbolic service name which is found in the ‘/etc/services’ file.

Either of the two parts may be omitted. If host is omitted, it defaults to ‘0.0.0.0’, which means “listen on all interfaces”. If port is omitted, it defaults to 2628. In this case the colon may be omitted, too.

Examples:

 
listen localhost:2628;
listen 127.0.0.1;
listen :2628;

filename

Specifies the name of a UNIX socket to listen on.
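The defaulting rules for both element forms can be sketched in Python as follows. This only illustrates the rules stated above and is not dicod's implementation. The function name parse_listen_element is hypothetical, and recognizing a UNIX socket by a leading slash is an assumption.

```python
import socket

DEFAULT_HOST = "0.0.0.0"   # "listen on all interfaces"
DEFAULT_PORT = 2628


def parse_listen_element(element):
    """Split one element of a `listen` list into its parts.

    Returns ("unix", path) or ("inet", host, port).
    ASSUMPTION: an element starting with "/" names a UNIX socket; the
    manual does not say how dicod itself tells the two forms apart.
    """
    if element.startswith("/"):
        return ("unix", element)
    host, _, port = element.partition(":")
    if not port:
        port_number = DEFAULT_PORT
    elif port.isdigit():
        port_number = int(port)
    else:
        # Symbolic service name, looked up in the services database.
        port_number = socket.getservbyname(port, "tcp")
    return ("inet", host or DEFAULT_HOST, port_number)
```

With these rules, ‘127.0.0.1’ and ‘:2628’ resolve to port 2628 on the loopback address and on all interfaces, respectively.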

The following statement instructs dicod to listen on the address ‘10.10.10.1’, port 2628 and on the UNIX socket ‘/var/run/dict’:

 
listen (10.10.10.1, /var/run/dict);

Configuration: pidfile string

Store the PID of the master process in this file. The default is ‘localstatedir/run/dicod.pid’. Notice that the permissions of this default directory may be insufficient for dicod to write there after switching to the user's privileges (see user statement). One solution is to create a subdirectory with the same owner as given by the user statement and to point the PID file there:

 
pidfile /var/run/dict/dicod.pid;

Another solution is to make PID directory group-writable and to add the owner group to the group statement (see group statement).

Configuration: max-children number

Sets maximum number of sub-processes that can run simultaneously. This is equivalent to the number of clients that can simultaneously use the server. The default is 64 sub-processes.

Configuration: inactivity-timeout number

Set the inactivity timeout to number seconds. The server disconnects automatically if the remote client has not sent any command within this time. Setting the timeout to 0 disables the inactivity timeout (this is the default).

Using this statement along with max-children makes it possible to control the server load.

Configuration: shutdown-timeout number

When the master server is shutting down, wait this number of seconds for all children to terminate. Default is 5 seconds.

Configuration: identity-check boolean

Enable identification check using AUTH protocol (RFC 1413). The received user name or UID can be shown in access log using %l format (see section Access Log).

Configuration: ident-keyfile string

Use encryption keys from the named file to decrypt AUTH replies encrypted using DES.

Configuration: ident-timeout number

Set timeout for AUTH input/output operation to number of seconds. Default timeout is 3 seconds.

3.3.3 Authentication

The server may be configured to request authentication in order to make some databases or some additional information available to the user. Another possible use of authentication is to minimize resource utilization on the server machine.

Authentication setup is simple: first, you define a user authentication database, then you enable it by declaring auth server capability (see section Server Capabilities):

 
capability auth;

The user authentication database keeps, for each user name, the corresponding plain text password and, optionally, the names of the groups this user belongs to. Notice that, due to the specifics of the DICT authentication scheme (see section The AUTH Command), user passwords are stored in plain text. Special care must therefore be taken to protect the contents of your authentication database from compromise.

The database is defined using the user-db block statement:

Configuration: user-db url

Declare user authentication database.

Dico's authentication is designed so that various authentication database formats may easily be added. A database is identified by its URL, or Universal Resource Locator. It consists of the following parts (square brackets denoting optional ones):

 
type://[[user[:password]@]host]/path[params]

type

A database type, or format. See below for the list of available database formats.

user

User name necessary to access the database.

password

User password necessary to access the database.

host

Domain name or IP address of a machine running the database.

path

A path to the database. The exact meaning of this element depends on the database protocol. It is described in detail when discussing particular database protocols.

params

A list of protocol-dependent parameters. Each parameter is of the form keyword=name. Multiple parameters are separated with semicolons.

If the underlying mechanism requires some additional configuration data that cannot be supplied using URL, these are passed to it using the following statement:

user-db conf: options string

The argument is treated as an opaque string and passed to the authentication ‘open’ procedure verbatim. Its exact meaning depends on the type of the database.

The URL defines how the database is accessed. Another important point is where to get user data from. This is specified by the following two sub-statements:

user-db conf: password-resource arg

Database resource returning user password.

user-db conf: group-resource arg

Database resource returning user groups.

The exact semantics of a database resource depend on the type of database being used. For flat text databases, resource means the name of a text file that contains these data; for SQL databases, resource is an SQL query, etc. Below we will discuss the URLs and resources used by each database type.

To summarize, the definition of an authentication database is:

 
# Define user database for authentication.
user-db url {
  # Additional configuration options.
  options string;
  
  # Name of a password resource.
  password-resource resource;

  # Name of a resource returning user group information.
  group-resource resource;
}

3.3.3.1 Text Authentication Database

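A text authentication database consists of one or two flat text files: a password file, which contains user passwords, and a group file, which contains user groups. The latter is optional. Both files have the same format: each record consists of a key (the user name) in the first field, followed by the corresponding value, as the example below shows.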

Record keys in a password file must be unique, i.e. no two records may contain the same first field. Group file may contain multiple records with the same key. For example:

 
$ grep smith pass
smith guessme
$ grep smith group
smith user
smith timing
smith tester

This means that user ‘smith’ has password ‘guessme’ and is a member of three groups: ‘user’, ‘timing’ and ‘tester’.

A URL of a text database begins with ‘text’ and contains only the path element, which gives the name of the directory where the database files reside. The name of a password file is given by the password-resource statement. The name of a group file is given by the group-resource statement.

For example, if user passwords are kept in file ‘passwd’ and user groups are kept in file ‘group’, and both files reside in the ‘/var/db/dico’ directory, then the appropriate database configuration will be:

 
user-db text:///var/db/dico {
  password-resource passwd;
  group-resource group;
}
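To make the difference between the two files concrete, here is a Python sketch that loads them into dictionaries. It assumes that the two fields of a record are separated by whitespace and that lines without two fields can be skipped. Neither assumption is confirmed by the text above, and the function names are hypothetical.

```python
def load_records(lines):
    """Collect `key value` records into a dict of key -> list of values.

    ASSUMPTION: fields are separated by whitespace; lines that do not
    contain two fields (for example, empty lines) are skipped.
    """
    records = {}
    for line in lines:
        fields = line.split(None, 1)
        if len(fields) != 2:
            continue
        key, value = fields
        records.setdefault(key, []).append(value.strip())
    return records


def load_passwords(lines):
    """Load a password file, in which every key must be unique."""
    passwords = {}
    for user, values in load_records(lines).items():
        if len(values) > 1:
            raise ValueError("duplicate password record for " + repr(user))
        passwords[user] = values[0]
    return passwords


passwords = load_passwords(["smith guessme"])
groups = load_records(["smith user", "smith timing", "smith tester"])
```

For the records shown in the grep example above, passwords holds one entry for ‘smith’, while groups maps ‘smith’ to a list of three group names.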

3.3.3.2 LDAP Databases

To configure an LDAP user database, first load the ‘ldap’ module (see section LDAP module):

 
load-module ldap {
	command "ldap";
}

The URL of the database is: ‘ldap://host[:port]’, where host is the host name or IP address of the LDAP server, and the optional port specifies the port number it is listening on (by default, port 389 is assumed).

The password-resource statement specifies the name of an attribute containing the password, and the group-resource supplies the name of the attribute with group name.

Additional configuration data are supplied in the options statement, whose argument is a whitespace-separated list of assignments:

base=base

Sets base DN.

binddn=dn

Sets the DN to bind as.

passwd=string

Sets the password.

tls=bool

When set to ‘yes’, enables the use of TLS encryption.

debug=number

Sets OpenLDAP debug level.

user-filter=filter

An LDAP filter that selects the objects describing the given user. Any occurrence of ‘$user’ in filter is replaced with the actual user name obtained during authentication. Variable expansion occurs in much the same way as in the shell. In particular, the variable is expanded only if it is not immediately followed by an alphanumeric character. For example, expansion occurs in:

 
(uid=$user)

and

 
(uid=$user.1)

But it does not occur in

 
(uid=$users)

If it is necessary to expand the variable in such a context, enclose its name in curly braces:

 
(uid=${user}s)

group-filter=filter

An LDAP filter that selects the user groups. The filter is expanded as in user-filter.
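The expansion rule for ‘$user’ described under user-filter can be expressed as a single regular expression. The Python sketch below is an illustration of that rule, not the code used by the ldap module. The name expand_user is hypothetical, and escaping of special characters in the user name is deliberately left out.

```python
import re

# "${user}" always matches; "$user" matches only when it is not
# immediately followed by an alphanumeric character.
_USER_REF = re.compile(r"\$\{user\}|\$user(?![A-Za-z0-9])")


def expand_user(filter_text, user):
    """Replace references to the `user` variable in an LDAP filter."""
    # A function is used as the replacement so that backslashes in the
    # user name are inserted literally.
    return _USER_REF.sub(lambda match: user, filter_text)
```

For the user name ‘smith’, this turns ‘(uid=$user.1)’ into ‘(uid=smith.1)’ and ‘(uid=${user}s)’ into ‘(uid=smiths)’, while ‘(uid=$users)’ is left unchanged.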

The following example shows an LDAP user database configured for base DN ‘example.com’ which uses ‘posixAccount’ and ‘posixGroup’ objects from ‘nis.schema’:

 
user-db "ldap://localhost" {
  password-resource userPassword;
  group-resource cn;
  options "user-filter=(uid=$user) "
          "group-filter=(&(objectClass=posixGroup)(memberuid=$user)) "
          "base=dc=example,dc=com";
}

A note on password usage is in order here. Most authentication methods require the passwords to be stored in the database in plain text form. The use of encrypted passwords (e.g. MD5 or SHA1) is possible only with ‘LOGIN’ and ‘PLAIN’ GSASL authentication methods.

3.3.4 Access Control Lists

Access control lists, or ACLs for short, are lists of permissions that can be applied to certain dicod objects. They can be used to control who can connect to the dictionary server and what resources are offered to whom.

An ACL is defined using acl block statement:

 
acl name {
  definitions
}

The name parameter specifies a unique name for that ACL. This name will be used by other configuration statements (see section Security Settings, and see section Database Visibility) to refer to that ACL.

The part between the curly braces (denoted by definitions above) is a list of access statements. There are two types of such statements:

ACL: allow user-group sub-acl host-list

Allow access to resource.

ACL: deny user-group sub-acl host-list

Deny access to resource.

All parts of an access statement are optional, but at least one of them must be present.

The user-group part specifies which users match this entry. Allowed values are the following:

all

All users.

authenticated

Only authenticated users.

group group-list

Authenticated users who are members of at least one of the groups listed in group-list.

The sub-acl part, if present, branches to another ACL. The syntax of this part is:

 
acl name

where name is the name of a previously defined ACL.

Finally, the host-list part matches client addresses. It consists of a from keyword followed by a list of address specifiers. Allowed address specifiers are:

addr

Matches if the client IP equals addr. The latter may be given either as an IP address or as a host name, in which case it will be resolved and the first of its IP addresses will be used.

addr/netlen

Matches if the first netlen bits of the client IP address equal addr. The network mask length, netlen, must be an integer in the range from 0 to 32. The address part, addr, is as described above.

addr/netmask

Matches if the result of a logical AND between the client IP address and netmask equals addr. The network mask must be specified in “dotted quad” form, e.g. ‘255.255.255.224’.

filename

Matches if connection was received from a UNIX socket filename, which must be given as an absolute file name.
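The three IP-based specifiers reduce to the same bitwise comparison. The following Python sketch shows that arithmetic for addresses given in numeric form. Host name resolution and UNIX socket names are left out, and the function names are hypothetical.

```python
import ipaddress


def ip_to_int(address):
    """Convert a dotted-quad IPv4 address to a 32-bit integer."""
    return int(ipaddress.IPv4Address(address))


def address_matches(client, specifier):
    """Check a client address against addr, addr/netlen or addr/netmask."""
    addr, slash, mask = specifier.partition("/")
    if not slash:
        # Plain address: all 32 bits must be equal.
        netmask = 0xFFFFFFFF
    elif "." in mask:
        # Dotted-quad network mask.
        netmask = ip_to_int(mask)
    else:
        netlen = int(mask)
        if not 0 <= netlen <= 32:
            raise ValueError("netlen must be in the range 0 to 32")
        # Keep the first netlen bits, clear the rest.
        netmask = (0xFFFFFFFF << (32 - netlen)) & 0xFFFFFFFF
    return (ip_to_int(client) & netmask) == ip_to_int(addr)
```

For instance, the client ‘10.0.0.33’ matches ‘10.0.0.32/255.255.255.224’, because the mask clears the five low-order bits of the address.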

To summarize, the syntax of an access statement is:

 
allow|deny [all|authenticated|group group-list]
           [acl name] [from addr-list]

where square brackets denote optional parts and vertical bar means ‘one of’.

When an ACL is applied to a particular object, its entries are tried in turn until one of them matches, or the end of the list is reached. If a matched entry is found, its command verb, allow or deny, defines the result of ACL match. If the end of list is reached, the result is ‘allow’, unless explicitly specified otherwise.

For example, the following statement defines an ACL named ‘common’ that allows access for any user connected via the local UNIX socket ‘/tmp/dicod.sock’ or coming from the local network ‘192.168.10.0/24’. Authenticated users are allowed, provided that they are also allowed by another ACL, ‘my-nets’ (which must have been defined before this definition). Users coming from the network ‘10.10.0.0/24’ are allowed if they authenticate themselves and are members of the groups ‘dicod’ or ‘users’. Access is denied for anybody else:

 
acl common {
    allow all from ("/tmp/dicod.sock", "192.168.10.0/24");
    allow authenticated acl "my-nets";
    allow group ("dicod", "users") from "10.10.0.0/24";
    deny all;
}
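The evaluation order described above (entries are tried in turn, the first match decides, and the end of the list means ‘allow’) can be modelled in a few lines of Python. This is a simplified sketch and not dicod's implementation: sub-ACLs and UNIX sockets are omitted, and the Entry class and function names are hypothetical.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional, Sequence


@dataclass
class Entry:
    verb: str                      # "allow" or "deny"
    who: str = "all"               # "all", "authenticated" or "group"
    groups: Sequence[str] = ()     # used when who == "group"
    nets: Sequence[str] = ()       # empty means "any address"


def entry_matches(entry, client_ip, user_groups):
    """user_groups is None for a client that has not authenticated."""
    if entry.who == "authenticated" and user_groups is None:
        return False
    if entry.who == "group":
        if user_groups is None or not set(entry.groups) & set(user_groups):
            return False
    if entry.nets:
        address = ip_address(client_ip)
        return any(address in ip_network(net) for net in entry.nets)
    return True


def acl_allows(acl, client_ip, user_groups: Optional[Sequence[str]] = None):
    """Try the entries in turn; the first matching entry decides."""
    for entry in acl:
        if entry_matches(entry, client_ip, user_groups):
            return entry.verb == "allow"
    return True  # end of list reached: the result is "allow"


# A reduced version of the "common" ACL shown above.
common = [
    Entry("allow", nets=["192.168.10.0/24"]),
    Entry("allow", who="group", groups=["dicod", "users"], nets=["10.10.0.0/24"]),
    Entry("deny"),
]
```

With this model, an anonymous client at 10.10.0.5 falls through to the final deny entry, whereas the same client authenticated as a member of ‘users’ is allowed by the second entry.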

See section Security Settings, for information on how to control daemon security settings.

See section Database Visibility, for a detailed description on how to use ACLs to control access to databases.

3.3.5 Security Settings

This subsection describes configuration settings that control access to various resources served by dicod.

Configuration: connection-acl acl-name

Use ACL acl-name to control incoming connections. The ACL itself must be defined before this statement. Using user-group (see previous subsection) in this ACL makes no sense, because authentication is performed after connection is established.

 
acl incoming-conn {
   allow from 213.130.0.0/19;
   deny any;
}

connection-acl incoming-conn;

Configuration: show-sys-info acl-name

This statement controls whether to show system information in reply to SHOW SERVER command (see section SHOW SERVER). The information will be shown only if ACL acl-name allows it.

The system information shown includes the following data: name of the package and its version, name of the system where it was built and the kernel version thereof, host name, total operational time of the daemon, number of subprocesses executed so far and average usage frequency. For example:

 
dicod (dico 2.1) on Linux 2.6.24.4,
dict.example.net up 110+04:42:58, 19647044 forks (6867.9/hour)

3.3.6 Logging and Debugging

The directives described in this subsection provide basic logging capabilities.

Configuration: log-tag string

Prefix syslog messages with this string. By default, the program name is used.

Configuration: log-facility string

Set syslog facility to use. Allowed values are: ‘user’, ‘daemon’, ‘auth’, ‘authpriv’, ‘mail’, ‘cron’, ‘local0’ through ‘local7’ (case-insensitive), or a facility number.

Configuration: log-print-severity boolean

Prefix diagnostic messages with a string identifying their severity.

Configuration: transcript boolean

Log session transcript. The lines received from the client are prefixed with ‘C:’, those sent in reply are marked with ‘S:’. Here is an excerpt from the transcript output:

 
S: 220 Trurl.gnu.org.ua dicod (dico 1.99.90) <mime.xversion>
<1645.1212874507@Trurl.gnu.org.ua>
C: client ``Kdict''
S: 250 ok
C: show db
S: 110 16 databases present
S: afr-deu ``Afrikaans-German Freedict dictionary''
S: afr-eng ``Afrikaans-English FreeDict Dictionary''
[...]
S: .
S: 250 ok

This option produces lots of output and can significantly slow down the server. Use it only if you are debugging dicod or some remote client. Never use it in a production environment.

3.3.7 Access Log

GNU Dico provides a feature similar to Apache's CustomLog, which keeps a log of MATCH and DEFINE requests. To enable this feature, specify the name of the log file using the following directive:

Configuration: access-log-file string

Set access log file name.

 
access-log-file /var/log/dico/access.log;

The format of log file entries is specified using access-log-format directive:

Configuration: access-log-format string

Set format string for access log file.

Its argument can contain literal characters, which are copied into the log file verbatim, and format specifiers, i.e. special sequences which begin with ‘%’ and are replaced in the log file as shown in the table below.

%%

The percent sign.

%a

Remote IP-address.

%A

Local IP-address.

%B

Size of response in bytes.

%b

Size of response in bytes in CLF format, i.e. a ‘-’ rather than a ‘0’ when no bytes are sent.

%C

Remote client (from the CLIENT command, see section The CLIENT Command).

%D

The time taken to serve the request, in microseconds.

%d

Request command verb in abbreviated form, suitable for use in URLs, i.e. ‘d’ for DEFINE, and ‘m’ for MATCH. See section DICT URL.

%h

Remote host.

%H

Request command verb (DEFINE or MATCH).

%l

Remote logname (from identd, if supplied). This will return a dash unless identity-check is set to true. See identity-check.

%m

The search strategy.

%p

The canonical port of the server serving the request.

%P

The PID of the child that serviced the request.

%q

The database from the request.

%r

Full request.

%{n}R

The nth token from the request (n is 0-based).

%s

Reply status. For multiple replies, the form ‘%s’ returns the status of the first reply, while ‘%>s’ returns that of the last reply.

%t

Time the request was received in the standard Apache format, e.g.:

 
[04/Jun/2008:11:05:22 +0300]

%{format}t

The time, in the form given by format, which should be a valid strftime format. See section Time and Date Formats, for a detailed description.

The standard ‘%t’ format is equivalent to

 
[%d/%b/%Y:%H:%M:%S %z]

%T

The time taken to serve the request, in seconds.

%u

Remote user from AUTH command.

%v

The host name of the server serving the request. See hostname directive.

%V

Actual host name of the server (in case it was overridden in configuration).

%W

The word from the request.

For reference, here is the list of format specifiers whose meaning differs from that in Apache: ‘%C’, ‘%H’, ‘%m’, ‘%q’. The following format specifiers are unique to dicod: ‘%d’, ‘%{n}R’, ‘%V’, ‘%W’.

Absence of the access-log-format directive is equivalent to the following statement:

 
access-log-format "%h %l %u %t \"%r\" %>s %b";

It was chosen so as to be compatible with Apache access logs and be easily parsable by existing log analyzing tools, such as webalizer.

Extending this format string with the client name produces a log format similar to the Apache ‘combined log’ format:

 
access-log-format "%h %l %u %t \"%r\" %>s %b \"\" \"%C\"";
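Because the default format mirrors Apache access logs, its lines can be split with an ordinary regular expression. The following is a minimal Python sketch, not part of GNU Dico. The sample line is invented for illustration, the ‘%b’ field is captured as an opaque token, and the pattern assumes that the request contains no double quotes.

```python
import re
from datetime import datetime

# Pattern for the default format: %h %l %u %t "%r" %>s %b
LOG_RE = re.compile(
    r'(?P<host>\S+) (?P<logname>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)

def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    m = LOG_RE.match(line)
    if m is None:
        return None
    rec = m.groupdict()
    # The standard %t format is [%d/%b/%Y:%H:%M:%S %z]; the brackets
    # are already stripped by the pattern above.
    rec["time"] = datetime.strptime(rec["time"], "%d/%b/%Y:%H:%M:%S %z")
    rec["status"] = int(rec["status"])
    return rec

# Hypothetical log line, made up for this example.
sample = '192.168.10.5 - - [04/Jun/2008:11:05:22 +0300] "DEFINE * word" 250 1024'
record = parse_line(sample)
```

Parsing the month name with ‘%b’ depends on the current locale, so the sketch assumes the default C locale.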

3.3.8 General Settings

The settings in this subsection configure basic behavior of the DICT daemon.

Configuration: initial-banner-text string

Display the string in the textual part of the initial server reply.

When a connection is established, the server sends the client an initial reply that looks like this:

 
220 Trurl.gnu.org.ua <auth.mime> <520.1212912026@Trurl.gnu.org.ua>

See section Initial Reply, for a detailed description of its parts.

The part of this reply between the host name and the first angle bracket is modifiable and can contain arbitrary text. You can use initial-banner-text to insert additional information there. Note that string may not contain newlines. For example:

 
initial-banner-text "Please authenticate yourself,";

This statement produces the following initial reply (split over two lines for readability):

 
220 Trurl.gnu.org.ua Please authenticate yourself,
  <auth.mime> <520.1212912026@Trurl.gnu.org.ua>

Configuration: hostname string

Set the hostname. By default, the server determines it automatically. If, however, it makes a wrong guess, you can fix it using this directive.

The server hostname is used, among other places, in the initial reply after the ‘220’ code (see above) and may also be displayed in the access log file using the ‘%v’ escape (see section Access Log).

Configuration: server-info string

Set server description to be shown in reply to SHOW SERVER (see section SHOW SERVER) command.

The first line of the reply, after the usual ‘114’ response line, shows the name of the host where the server is running. If the settings of show-sys-info (see section show-sys-info) allow, some additional information about the system is printed.

The lines that follow are taken from the server-info directive. It is common to specify string using “here-document” syntax (see here-document), e.g.:

 
server-info <<EOT
Welcome to the FOO dictionary service.

Contact <dict@foo.org> if you have questions or
suggestions.
EOT;

Configuration: help-text string

Set the text to be displayed in reply to the HELP command.

The default reply to HELP command displays a list of commands understood by the server with a short description of each.

You can use help-text directive to append arbitrary text to that output, provided that you begin string with a plus sign, e.g.:

 
help-text <<-EOT
  +
  The commands beginning with an X are extensions.
EOT;

If string begins with any character, except ‘+’, it will replace the default help output. For example:

 
help-text <<-EOT
  There is no help.
  See RFC 2229 for detailed information.
EOT;

Configuration: default-strategy string

Set the name of the default matching strategy (see section The MATCH Command). By default, Levenshtein matching is used, which is equivalent to

 
default-strategy lev;
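For readers unfamiliar with the term, the Levenshtein distance between two words is the minimum number of single-character insertions, deletions and substitutions needed to turn one into the other. The Python sketch below computes it with the textbook dynamic-programming method. It illustrates the concept only and is not taken from the dicod sources, whose implementation may differ.

```python
def levenshtein(a, b):
    """Return the edit distance between strings a and b."""
    # prev[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # substitution or match
        prev = cur
    return prev[-1]
```

With this definition, ‘kitten’ and ‘sitting’ are at distance 3.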

3.3.9 Server Capabilities

Capabilities are certain server features that can be enabled or disabled at the system administrator's will.

Configuration: capability list

Request additional capabilities from list.

The argument to capability directive must contain names of existing dicod capabilities. These are listed in the following table:

auth

The AUTH command is supported. See section Authentication.

mime

The OPTION MIME command is supported. Notice that RFC 2229 requires all servers to support that command, so you should always specify this capability.

xversion

The XVERSION command is supported. It is a GNU extension that displays the dicod implementation and version number. See section XVERSION.

xlev

The XLEV command is supported. This command sets and queries the maximum Levenshtein distance for the lev matching strategy. See section strategy. See section XLEV.

Capabilities set using the capability directive are displayed in the initial server reply (see initial reply), and appropriate entries are added to the HELP command output.

3.3.10 Database Modules and Handlers

A database module is an external piece of software designed to handle a particular format of dictionary databases. This piece of software is built as a shared library so that dicod can load it at run time.

A handler is an instance of a database module loaded by dicod and configured for a specific database or a set of databases.

Database handlers are defined using the following block statement:

Configuration: load-module string

Create an instance of a database module. The argument specifies a unique name which will be used by subsequent parts of the configuration to refer to this handler. The load-module statement is a block statement. The only sub-statement allowed within it is the command statement:

load-module config: command string

Set the command line for this handler. It is similar to shell's command line: it consists of a name of database module, optionally followed by a whitespace-separated list of its arguments. Just as in shell, the name of the module specifies the disk file which should be loaded. Arguments are passed to the module initialization function (see dico_init).

For example:

 
load-module dict {
  command "dictorg dbdir=/var/dicodb";
}

This statement defines a handler named ‘dict’, which loads the module ‘dictorg’ and passes its initialization function a single argument, ‘dbdir=/var/dicodb’. If the module name is not an absolute file name, as in this example, the loadable module will be searched for in the module load path.

A module load path is an internal list of directories which dicod scans in order to find a loadable file name specified in command statement of a load-module block. By default the order of search is as follows:

  1. Optional prefix search directories specified by the prepend-load-path directive (see below).
  2. GNU Dico module directory: ‘$prefix/lib/dico’.
  3. Additional search directories specified by the module-load-path directive (see below).
  4. The value of the environment variable LTDL_LIBRARY_PATH.
  5. The system dependent library search path (e.g. on Linux it is set by the contents of the file ‘/etc/ld.so.conf’ and the value of the environment variable LD_LIBRARY_PATH).

The value of LTDL_LIBRARY_PATH and LD_LIBRARY_PATH must be a colon-separated list of absolute directories, for example, ‘/usr/lib/mypkg:/lib/foo’.

In any of these directories, dicod first attempts to find and load the given filename. If this fails, it tries to append the following suffixes to it:

  1. the libtool archive suffix ‘.la’
  2. the suffix used for native dynamic libraries on the host platform, e.g., ‘.so’, ‘.sl’, etc.

Configuration: module-load-path list

This directive adds the directories listed in its argument to the module load path. Example:

 
module-load-path (/usr/lib/dico,/usr/local/dico/lib);

Configuration: prepend-load-path list

Same as module-load-path, but adds directories to the beginning of the module load path.
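The module search described in this subsection can be sketched in Python as follows. This is an illustration of the documented order, not dicod code. It covers only relative module names, leaves out the system dependent library path (step 5), and uses function names and a ‘.so’ default suffix that are assumptions made for the example.

```python
import posixpath

def build_load_path(prepend, module_dir, additional, ltdl_library_path=""):
    """Assemble the module load path in the documented order (steps 1-4)."""
    path = list(prepend)                 # 1. prepend-load-path
    path.append(module_dir)              # 2. GNU Dico module directory
    path.extend(additional)              # 3. module-load-path
    # 4. LTDL_LIBRARY_PATH, a colon-separated list of directories
    path.extend(d for d in ltdl_library_path.split(":") if d)
    return path

def candidates(module, load_path, native_suffix=".so"):
    """Yield the file names tried for a relative module name, in order."""
    for directory in load_path:
        base = posixpath.join(directory, module)
        yield base                    # the name as given
        yield base + ".la"            # libtool archive suffix
        yield base + native_suffix    # native dynamic library suffix
```

For instance, `list(candidates("dictorg", ["/usr/lib/dico"]))` lists ‘/usr/lib/dico/dictorg’, then the same name with ‘.la’, then with ‘.so’.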

3.3.11 Databases

Dictionary databases are defined using database block statement.

Configuration: database { statements }

Define a dictionary database. At least two sub-statements must be defined for each database: name and handler.

Database: name string

Set the name of this database (a single word). This name will be used to identify this database in DICT commands.

Database: handler string

Specify the name of a handler for this database and any arguments for it. This handler must be previously defined using load-module statement (see section Database Modules and Handlers).

For example, the following fragment defines a database named ‘en-de’, which is handled by the ‘dictorg’ handler. The handler is passed one argument, database=en-de:

 
database {
        name "en-de";
        handler "dictorg database=en-de";
}

More directives are available to fine-tune the database.

Database: description string

Supply a short description, to be shown in reply to SHOW DB command. The string may not contain new-lines.

Use this statement if the database itself does not supply a description, or if its description is malformed.

In any case, if description directive is specified, its value takes precedence over description string retrieved from the database itself.

See section SHOW DB, for a description of SHOW DB command.

Database: info string

Supply a full description of the database. This description is shown in reply to SHOW INFO (see section SHOW INFO) command. The string is usually a multi-line text, so it is common to use here-document syntax (see here-document), e.g.:

 
info <<- EOT
   This is a foo-bar dictionary.
   Copyright (C) 2008 foo-bar dict group.
   Distributed under the terms of GNU Free
   Documentation license.
EOT;

Use this statement if the database itself does not supply a full description, or if its full description is malformed.

As with description, the value of info takes precedence over info strings retrieved from the database.

The following two directives control the content type and transfer encoding used when formatting replies from this database if OPTION MIME (see section OPTION MIME) is in effect:

Database: content-type string

Set the content type of the reply. E.g.:

 
database {
   name "foo";
   handler "dictorg";
   content-type "text/html";
   ...
}

Database: content-transfer-encoding enum

Set transfer encoding to use when sending MIME replies for this database. Allowed values for enum are:

base64

Use BASE64 encoding.

quoted-printable

Use quoted-printable encoding.

3.3.11.1 Database Visibility

A property called database visibility is associated with each dictionary database. It determines whether the database appears in the output of SHOW DB command, and takes part in dictionary searches.

By default, all databases are publicly visible. You can, however, restrict their visibility globally as well as on a per-database basis. This is done using visibility ACLs.

In general, the visibility of a database is controlled by two access control lists: global visibility ACL and database visibility ACL. The latter takes precedence over the former.

Both ACLs are defined using visibility-acl statement:

Configuration: visibility-acl acl-name

Set name of an ACL controlling database visibility. If used in global scope, this statement sets global visibility ACL. If used within a database block, it sets visibility ACL for that particular database.

Consider the following example:

 
acl glob-vis {
  allow authenticated;
  deny all;
}  

acl local-nets {
  allow from (192.168.10.0/24, /tmp/dicod.sock);
}

visibility-acl glob-vis;

database {
  name "terms";
  visibility-acl local-nets;
}

In this configuration, the ‘terms’ database is visible to everybody coming from the ‘192.168.10.0/24’ network and from the UNIX socket ‘/tmp/dicod.sock’, without authentication. It is not visible to users coming from elsewhere, unless they authenticate themselves.
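The outcome described for this example can be modelled with a small Python sketch. It is an interpretation rather than dicod's actual ACL engine. In particular, it assumes that a database ACL with no matching entry yields no decision, so that the global ACL is consulted next; this is inferred from the result stated above. All function names are hypothetical.

```python
import ipaddress

LOCAL_NET = ipaddress.ip_network("192.168.10.0/24")
LOCAL_SOCKET = "/tmp/dicod.sock"

def local_nets(source, authenticated):
    """Model of 'acl local-nets': allow listed sources, else no decision."""
    if source == LOCAL_SOCKET:
        return True
    try:
        if ipaddress.ip_address(source) in LOCAL_NET:
            return True
    except ValueError:
        pass  # not an IP address, e.g. some other socket name
    return None

def glob_vis(source, authenticated):
    """Model of 'acl glob-vis': allow authenticated, deny all."""
    return bool(authenticated)

def is_visible(source, authenticated, database_acl=None, global_acl=None):
    """Database ACL takes precedence; the global ACL is the fallback."""
    for acl in (database_acl, global_acl):
        if acl is not None:
            verdict = acl(source, authenticated)
            if verdict is not None:
                return verdict
    return True  # with no ACL decision, databases are publicly visible
```

Under these assumptions, `is_visible("192.168.10.7", False, local_nets, glob_vis)` is true, while the same call for ‘10.0.0.1’ is false until the user authenticates.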

3.3.12 Strategies and Default Searches

Default search is a MATCH request with ‘*’ or ‘!’ as database argument (see section The MATCH Command). The former means search in all available databases, the latter means search in all databases until a match is found.

Default searches may be quite expensive and may cause considerable strain on the server. For example, the command MATCH * prefix "" returns all entries from all available databases, which would consume a lot of resources both on the server and on the client side.

To minimize harmful effects from such potentially dangerous requests, Dico allows you to limit the use of certain strategies in default searches.

Configuration: strategy name { statements }

Restrict the use of strategy name in default searches.

The statements define conditions that the search word of a MATCH command must satisfy for the request to be denied. The following statements are defined:

Configuration: deny-all bool

Unconditionally deny this strategy in default searches.

Configuration: deny-word list

Deny this strategy if the search word matches one of the words from list.

Configuration: deny-length-lt number

Deny if length of the search word is less than number.

Configuration: deny-length-le number

Deny if length of the search word is less than or equal to number.

Configuration: deny-length-gt number

Deny if length of the search word is greater than number.

Configuration: deny-length-ge number

Deny if length of the search word is greater than or equal to number.

Configuration: deny-length-eq number

Deny if length of the search word is equal to number.

Configuration: deny-length-ne number

Deny if length of the search word is not equal to number.

For example, the following statement denies the use of ‘prefix’ strategy in default searches if its argument is an empty string:

 
strategy prefix {
  deny-length-eq 0;
}

If the dicod daemon is configured this way, it will always return a ‘552’ reply on commands MATCH * prefix "" or MATCH ! prefix "". However, use of empty prefix on a concrete database, as in MATCH eng-deu prefix "", will still be allowed.
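The decision logic behind these restrictions can be sketched in Python. This is a model for illustration only. It assumes that a request is denied as soon as any one of the configured statements matches and that word length is counted in characters; neither detail is stated above. The restrictions are represented as a plain dictionary keyed by statement name.

```python
import operator

# Length tests keyed by the statement names listed above.
LENGTH_TESTS = {
    "deny-length-lt": operator.lt,
    "deny-length-le": operator.le,
    "deny-length-gt": operator.gt,
    "deny-length-ge": operator.ge,
    "deny-length-eq": operator.eq,
    "deny-length-ne": operator.ne,
}

def is_denied(database, word, restrictions):
    """Return True if MATCH database <strategy> word should be refused."""
    if database not in ("*", "!"):
        return False  # restrictions apply to default searches only
    if restrictions.get("deny-all"):
        return True
    if word in restrictions.get("deny-word", ()):
        return True
    for name, test in LENGTH_TESTS.items():
        if name in restrictions and test(len(word), restrictions[name]):
            return True
    return False

# Equivalent of: strategy prefix { deny-length-eq 0; }
prefix_rules = {"deny-length-eq": 0}
```

With `prefix_rules`, an empty word is refused for the ‘*’ and ‘!’ databases but accepted for ‘eng-deu’.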

3.3.13 Tuning

While tuning your server, it is often necessary to get timing information which shows how much time is spent serving certain requests. This can be achieved using the timing configuration directive:

Configuration: timing boolean

Provide timing information after successful completion of an operation. This information is displayed after the following requests: MATCH, DEFINE, and QUIT. It consists of the following parts:

 
[d/m/c = nd/nm/nc RTr UTu STs]

where:

nd

Number of processed define requests. It is ‘0’ after a MATCH request.

nm

Number of processed match requests. It is ‘0’ after a DEFINE request.

nc

Number of comparisons made. This value may be inaccurate if the underlying database module is not able to count comparisons.

RT

Real time spent serving the request.

UT

Time in user space spent serving the request.

ST

Time in kernel space spent serving the request.

An example of a server reply with timing information follows:

 
250 Command complete [d/m/c = 0/63/107265 2.293r 1.120u 0.010s]

You can also add timing information to your access log files, see %T.
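When collecting such figures from many replies, it may help to extract them mechanically. The Python sketch below parses the timing suffix in the layout shown above. The function name and the keys of the returned dictionary are arbitrary choices made for this example.

```python
import re

# Matches "[d/m/c = nd/nm/nc RTr UTu STs]" anywhere in a reply line.
TIMING_RE = re.compile(
    r"\[d/m/c = (?P<nd>\d+)/(?P<nm>\d+)/(?P<nc>\d+) "
    r"(?P<real>[\d.]+)r (?P<user>[\d.]+)u (?P<sys>[\d.]+)s\]"
)

def parse_timing(reply):
    """Return the timing fields of a reply line, or None if absent."""
    m = TIMING_RE.search(reply)
    if m is None:
        return None
    return {
        "nd": int(m.group("nd")),        # define requests
        "nm": int(m.group("nm")),        # match requests
        "nc": int(m.group("nc")),        # comparisons
        "real": float(m.group("real")),  # real time
        "user": float(m.group("user")),  # user space time
        "sys": float(m.group("sys")),    # kernel space time
    }

timing = parse_timing(
    "250 Command complete [d/m/c = 0/63/107265 2.293r 1.120u 0.010s]"
)
```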

3.3.14 Command Aliases

Aliases allow a string to be substituted for a word when it is used as the first word of a command. The daemon maintains a list of aliases that are created using the alias configuration file statement:

Configuration: alias word command

Create a new alias.

Aliases facilitate manual interaction with the server, as they let you create abbreviations for frequently typed commands. For example, the following alias creates a new command d which is equivalent to DEFINE *:

 
alias d DEFINE "*";

Aliases may be recursive, i.e. the first word of command may refer to another alias. For example:

 
alias d DEFINE;
alias da d "*";

This configuration will produce the following expansion:

 
da word ⇒ DEFINE * word

To prevent endless loops, recursive expansion is stopped if the first word of the replacement text is identical to an alias expanded earlier.
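The expansion rule, including the loop guard, can be sketched in Python as follows. This is a model of the behaviour described above, not the daemon's code. It assumes that commands are already split into tokens and that alias names are compared exactly, with no case folding.

```python
def expand_aliases(tokens, aliases):
    """Expand the first word of a tokenized command using aliases.

    aliases maps an alias name to its replacement token list.
    """
    expanded = set()  # aliases already expanded for this command
    tokens = list(tokens)
    while tokens and tokens[0] in aliases and tokens[0] not in expanded:
        name = tokens[0]
        expanded.add(name)
        # Substitute the replacement text for the first word only.
        tokens = list(aliases[name]) + tokens[1:]
    return tokens

# Equivalent of:  alias d DEFINE;  alias da d "*";
aliases = {"d": ["DEFINE"], "da": ["d", "*"]}
result = expand_aliases(["da", "word"], aliases)
```

Here `result` is `["DEFINE", "*", "word"]`. With two aliases that refer to each other, the loop ends as soon as the first word names an alias that was already expanded.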

3.3.15 Using Preprocessor to Improve the Configuration.

Before parsing the configuration file, dicod preprocesses it. The built-in preprocessor handles only file inclusion and #line statements (see section Pragmatic Comments), while the rest of the traditional preprocessing facilities, such as macro expansion, are supported via m4, which is used as an external preprocessor.

A detailed description of m4 facilities lies far beyond the scope of this document. You will find a complete user manual at http://www.gnu.org/software/m4/manual. For the rest of this subsection we assume the reader is sufficiently acquainted with the m4 macro processor.

The external preprocessor is invoked with the ‘-s’ flag, instructing it to include line synchronization information in its output. This information is then used by the parser to display meaningful diagnostics. An initial set of macro definitions is supplied by the ‘pp-setup’ file, located in the ‘$prefix/share/dico/version/include’ directory (where version means the version of the GNU Dico package).

The default ‘pp-setup’ file renames all m4 built-in macro names so they all start with the prefix ‘m4_’. This is similar to the GNU m4 ‘--prefix-builtins’ option, but has the advantage that it works with non-GNU m4 implementations as well.

As an example of how the preprocessor can improve dicod configuration, consider the following fragment taken from one of the installations of GNU Dico. This installation offers quite a few Freedict dictionaries. The database definition for each of them is almost the same, except for the dictionary name and, for the several databases that lack one, a description entry. To avoid repeating the same text over and over, we define the following macro:

 
# defdb(NAME[, DESCR])
# Produce a standard definition for a database NAME.
# If DESCR is given, use it as a description.
m4_define(`defdb', `
database {
        name "$1";
        handler "dictorg database=$1";m4_dnl
m4_ifelse(`$2',,,`
        description "$2";')
}
')

It takes two arguments. The first one, NAME, defines the dictionary name visible in the output of the SHOW DB command. The optional second argument may be used to supply a description string for databases that lack one.

Given this macro, the database definitions look like:

 
defdb(eng-swa)
defdb(swa-eng)
defdb(afr-eng, Afrikaans-English Dictionary)
defdb(eng-afr, English-Afrikaans Dictionary)
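To make the effect of the macro concrete, the following Python function renders approximately the text that a defdb call expands to. It is an illustration written for this purpose, not part of GNU Dico, and it does not reproduce the exact whitespace of the m4 output.

```python
def defdb(name, descr=None):
    """Render roughly what the m4 macro defdb(NAME[, DESCR]) produces."""
    lines = [
        "database {",
        f'        name "{name}";',
        f'        handler "dictorg database={name}";',
    ]
    if descr:
        # Emitted only when a second argument is given, mirroring
        # the m4_ifelse test in the macro.
        lines.append(f'        description "{descr}";')
    lines.append("}")
    return "\n".join(lines)

print(defdb("afr-eng", "Afrikaans-English Dictionary"))
```

Calling `defdb("eng-swa")` yields the same block without the description line.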

3.4 Dicod Exit Codes

Apart from issuing a descriptive error message, dicod attempts to indicate the reason of its termination by its error code. As usual, zero exit code indicates normal termination. The table below summarizes all possible error codes. For each error code, it indicates its decimal value and its symbolic name from ‘include/sysexits.h’ (if available).

0
EX_OK

Program terminated correctly.

2

Only child instances of dicod exit with this code. It indicates that the child did not receive any ‘DICT’ command within a time out interval (see inactivity-timeout).

64
EX_USAGE

The program was invoked incorrectly, e.g. an invalid option was given, or an option was given an erroneous argument.

67
EX_NOUSER

Dicod cannot switch to the privileges of the user it is configured to run as (see user statement).

69
EX_UNAVAILABLE

Server exited due to some error not otherwise described in this table.

70
EX_SOFTWARE

Some internal software error occurred.

71
EX_OSERR

Some system error occurred, e.g. the program ran out of memory or file descriptors, ‘fork’ failed, etc.

78
EX_CONFIG

An error in the configuration file was detected.
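A supervising script may want to turn these codes into readable messages. The Python sketch below encodes the table above as a dictionary. The names `DICOD_EXIT_CODES` and `describe_exit` are hypothetical and chosen for this example.

```python
# Exit codes from the table above: value -> (symbolic name, meaning).
DICOD_EXIT_CODES = {
    0: ("EX_OK", "program terminated correctly"),
    2: (None, "child received no DICT command within the timeout interval"),
    64: ("EX_USAGE", "program was invoked incorrectly"),
    67: ("EX_NOUSER", "cannot switch to the configured user's privileges"),
    69: ("EX_UNAVAILABLE", "error not otherwise described in the table"),
    70: ("EX_SOFTWARE", "internal software error"),
    71: ("EX_OSERR", "system error"),
    78: ("EX_CONFIG", "error in the configuration file"),
}

def describe_exit(code):
    """Return a short human-readable description of a dicod exit code."""
    if code not in DICOD_EXIT_CODES:
        return f"{code}: not listed in the table"
    name, meaning = DICOD_EXIT_CODES[code]
    label = f"{code} ({name})" if name else str(code)
    return f"{label}: {meaning}"
```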

3.5 Dicod Invocation

This section summarizes dicod command line options.

--config=file

Read this configuration file instead of the default ‘$sysconfdir/dicod.conf’. See section Configuration.

-f
--foreground

Operate in foreground. See section Daemon Operation Mode.

--stderr

Output diagnostics to stderr. See section --stderr.

--syslog

After successful startup, output any diagnostic to syslog. This is the default.

-E

Preprocess configuration file and exit. See section Using Preprocessor to Improve the Configuration.

--preprocessor=prog

Use prog as a preprocessor for configuration file. The default preprocessor command line is m4 -s, unless overridden while configuring the package (see section Default Preprocessor).

See section Using Preprocessor to Improve the Configuration.

--no-preprocessor

Do not use external preprocessor. See section Using Preprocessor to Improve the Configuration.

-I dir
--include-dir=dir

Add the directory dir to the list of directories to be searched for preprocessor include files. See section Using Preprocessor to Improve the Configuration.

-s
--single-process

In daemon mode, process connections in the main process, without starting subprocesses for each connection (see section Daemon Operation Mode). This means that the daemon is able to serve only one client at a time. The ‘--single-process’ option is provided for debugging purposes only. Never use it in a production environment.

-T
--transcript

Enable session transcript. This instructs dicod to log all commands it receives and all responses it sends during the session. Transcript is logged via the default logging channel (see section Logging and Debugging). If logging via syslog, the ‘debug’ priority is used.

See also Session Transcript, for a description of the similar mode in dico, the client program.

--no-transcript

Disable transcript mode. This is the default. Use this option if you wish to temporarily disable transcript mode, enabled in the configuration file (see section transcript).

-i
--inetd

Run in inetd mode. See section Inetd Operation Mode.

-x
--debug=level

Set debug verbosity level. The level argument is an integer ranging from ‘0’ (no debugging) to ‘100’ (maximum debugging information).

--source-info

Include source line information in the debugging output.

--trace-grammar

Trace parsing of the config file. The option is provided for debugging purposes.

--trace-lex

Trace config file lexer. The option is provided for debugging purposes.

--config-help

Show configuration file summary. See section Configuration.

-t
--lint

Check configuration file syntax and exit with code ‘0’ if it is OK, or with ‘78’ if there are errors. See section Configuration.

-h
--help

Display a short command line option summary and exit.

--usage

List all available command line options and exit.

--version

Print program version and exit.
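As an illustration of combining these options in a script, the Python sketch below runs the ‘--lint’ check on a configuration file and interprets the documented exit codes ‘0’ and ‘78’. It assumes that dicod is found on the PATH. The function name and the `run` parameter, which exists only so that the command runner can be replaced in tests, are inventions of this example.

```python
import subprocess

def lint_config(config_file, run=subprocess.run):
    """Check a dicod configuration file using --lint and --config."""
    result = run(["dicod", "--lint", f"--config={config_file}"])
    if result.returncode == 0:
        return "ok"
    if result.returncode == 78:
        return "config-error"  # 78: configuration file has errors
    return f"unexpected exit code {result.returncode}"
```

A deployment script could call `lint_config` on a new configuration file and install it only when the result is ‘ok’.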