Dico
GNU Dictionary Server
Sergey Poznyakoff
The main component of GNU Dico is the dicod daemon. It is
responsible for serving client requests and for coordinating the work
of dictionary modules.
Dicod operates on a set of databases. Each database
contains a set of headwords with corresponding articles,
therefore it can be regarded as a dictionary, in which articles supply
definitions (or translations) for headwords.
Each database has a unique name – a string of characters that serves to identify this particular database in a set of available databases. Two more pieces of textual data are associated with a database. A database information string (or info, for short) supplies a short description of the database. It is a sentence tersely describing the database, e.g. ‘English-German Dictionary’. A database description provides a full description of the dictionary, with author credits and copyright information. The length of this description is not limited.
Both pieces of information can be requested by the remote user. The
command SHOW DB lists all available databases along with their
descriptions:
SHOW DB
110 3 databases present
jargon "Jargon File (4.3.1, 29 Jun 2001)"
deu-eng "German-English Freedict dictionary"
en-pl-naut "English-Polish dictionary of nautical terms"
.
250 ok
Each line of output lists the name of a database and the corresponding description.
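As an illustration of the reply layout described above, here is a minimal Python sketch that turns a SHOW DB reply into a mapping from database names to descriptions. The function name `parse_show_db` is hypothetical, and the sketch assumes that the list starts after the ‘110’ status line and ends at a line consisting of a single dot, as in the sample reply.

```python
import shlex


def parse_show_db(lines):
    """Return {name: description} from the lines of a SHOW DB reply."""
    databases = {}
    in_list = False
    for raw in lines:
        line = raw.strip()
        if not in_list:
            # The list starts right after the "110 n databases present" line.
            in_list = line.startswith("110 ")
            continue
        if line == ".":
            # A lone dot terminates the list.
            break
        # Each entry is a name followed by a double-quoted description.
        name, description = shlex.split(line)
        databases[name] = description
    return databases


reply = [
    '110 3 databases present',
    'jargon "Jargon File (4.3.1, 29 Jun 2001)"',
    'deu-eng "German-English Freedict dictionary"',
    'en-pl-naut "English-Polish dictionary of nautical terms"',
    '.',
    '250 ok',
]
databases = parse_show_db(reply)
```

A real client would read these lines from the network connection; the sketch only covers the parsing step.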
The SHOW INFO command displays full information about a
database, whose name is given as its argument:
SHOW INFO en-pl-naut
112 information for en-pl-naut
An English-Polish dictionary of nautical terms

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with no
Invariant Sections, no Front-Cover and Back-Cover Texts
.
250 ok
A definition for any given headword can be requested using the
DEFINE command. It takes two arguments, the name of the
database and the headword to look for in that database, e.g.:
DEFINE en-pl-naut sprit
If the headword is found in the database, its definition is displayed; otherwise a diagnostic message is returned, reporting that the headword was not found.
There are two operation modes: ‘daemon’ and ‘inetd’.
The ‘daemon’ mode is enabled by the mode daemon statement in
the configuration file (see mode statement). It is also the
default mode. In daemon mode dicod listens for incoming
requests on one or several interfaces. Unless the
--foreground option is specified, it disconnects from the
controlling terminal and switches to background (becomes a
daemon). When an incoming connection arrives, it forks a
subprocess for handling it.
In this mode the following signals cause dicod to
terminate: ‘SIGTERM’, ‘SIGQUIT’, and ‘SIGINT’. The
‘SIGHUP’ signal causes the program to restart. This works only
if both the program name and its configuration file name (if given
using the ‘--config’ option) are absolute file names.
Upon receiving ‘SIGHUP’, dicod first verifies that the
configuration file does not contain fatal errors. To do that, the
program executes a copy of itself with the ‘--lint’ option
(see --lint) and analyzes its return value. Only if this check
passes does dicod restart itself. This ensures that the daemon
will not terminate due to unnoticed errors in its configuration file.
Upon receiving ‘SIGTERM’, ‘SIGQUIT’, or ‘SIGINT’, the
program stops accepting incoming requests and sends the ‘SIGTERM’
signal to all active subprocesses. Then it waits a predefined amount
of time for all processes to terminate (see shutdown-timeout).
Any subprocesses that do not terminate after this time are sent
the ‘SIGKILL’ signal. Then the database modules are unloaded and
dicod terminates.
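The shutdown sequence described above (send ‘SIGTERM’, wait for a fixed time, then send ‘SIGKILL’ to the stragglers) is a common pattern. The following Python sketch illustrates it using the standard subprocess module. It is not dicod's actual code, which is written in C, and the function name `shutdown_children` is hypothetical.

```python
import subprocess
import sys
import time


def shutdown_children(children, timeout=5.0):
    """Terminate child processes, escalating to a kill after `timeout` seconds."""
    # Step 1: ask every running child to terminate (SIGTERM on POSIX).
    for child in children:
        if child.poll() is None:
            child.terminate()
    # Step 2: wait for all children, sharing a single deadline.
    deadline = time.monotonic() + timeout
    for child in children:
        remaining = max(0.0, deadline - time.monotonic())
        try:
            child.wait(timeout=remaining)
        except subprocess.TimeoutExpired:
            # Step 3: the child ignored the request, so kill it (SIGKILL on POSIX).
            child.kill()
            child.wait()
    return [child.returncode for child in children]


if __name__ == "__main__":
    # Two children that would otherwise sleep for half a minute.
    kids = [
        subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
        for _ in range(2)
    ]
    print(shutdown_children(kids, timeout=5.0))
```

The default of 5 seconds mirrors the documented default of the shutdown timeout.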
Several command line options are provided that modify the behavior
of dicod in this mode. These options are mainly designed
for debugging and error-hunting purposes.
The ‘--foreground’ option instructs the server to not
disconnect from the controlling terminal and to remain in the
foreground. It is often used with the ‘--stderr’ option,
which instructs dicod to output all diagnostics to the
standard error output, instead of syslog, which is used by default.
In ‘inetd’ operation mode dicod receives requests
from standard input and sends its replies to the standard output.
This mode is enabled by the mode inetd statement (see mode statement) in the configuration file, or by the ‘--inetd’ command
line option (see --inetd). This mode is usually used when
invoking dicod from the ‘inetd.conf’ file, as in the example
below:
dict stream tcp nowait nobody /usr/local/bin/dicod --inetd
Upon startup, dicod reads its settings and database
definitions from a configuration file ‘dicod.conf’. By
default it is located in $sysconfdir (i.e., in most cases
‘/usr/local/etc’, or ‘/etc’), but an alternative location
may be specified using the ‘--config’ command line option
(see --config).
If any errors are encountered in the configuration file, the program reports them on the standard error and exits with a non-zero status.
To test the configuration file without starting the server use
the ‘--lint’ (‘-t’) command line option. It causes
dicod to check the configuration file and to exit with status 0
if no errors were detected, and with status 1 otherwise.
Before parsing, the configuration file is preprocessed using
m4 (see section Using Preprocessor to Improve the Configuration). To see the preprocessed
configuration without actually parsing it, use the ‘-E’ command
line option. To avoid preprocessing it, use the
‘--no-preprocessor’ option.
The rest of this section describes the configuration file syntax in
detail. You can receive a concise summary of all configuration
directives any time by running dicod --config-help.
A dicod configuration consists of statements and comments.
There are three classes of lexical tokens: keywords, values, and separators. Blanks, tabs, newlines and comments, collectively called white space, are ignored except as they serve to separate tokens. Some white space is required to separate otherwise adjacent keywords and values.
Comments may appear anywhere where white space may appear in the configuration file. There are two kinds of comments: single-line and multi-line comments. Single-line comments start with ‘#’ or ‘//’ and continue to the end of the line:
# This is a comment
// This too is a comment
Multi-line or C-style comments start with the two characters ‘/*’ (slash, star) and continue until the first occurrence of ‘*/’ (star, slash).
Multi-line comments cannot be nested.
Pragmatic comments are similar to usual comments, except that they cause some changes in the way the configuration is parsed. Pragmatic comments begin with a ‘#’ sign and end with the next physical newline character. As of GNU Dico version 2.1, the following pragmatic comments are understood:
#include <file>
#include file
Include the contents of the file file. If file is an absolute file name, both forms are equivalent. Otherwise, the form with angle brackets searches for the file in the include search path, while the second one looks for it in the current working directory first, and, if not found there, in the include search path.
The default include search path is:
Where prefix is the installation prefix.
New directories can be added in front of it using the ‘-I’ (‘--include-dir’) command line option (see --include-dir).
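The lookup order for the two forms of #include can be sketched in Python as follows. The function name `resolve_include` and its parameters are hypothetical; the sketch only mirrors the rules stated above and makes no claim about how dicod implements them.

```python
import os


def resolve_include(name, angle_brackets, search_path, cwd):
    """Return the file an #include would pick up, or None if it is not found.

    name           -- the file name given in the #include line
    angle_brackets -- True for '#include <file>', False for '#include file'
    search_path    -- list of include directories, in search order
    cwd            -- the current working directory
    """
    if os.path.isabs(name):
        # For absolute names both forms behave identically.
        return name if os.path.exists(name) else None
    # The bracketed form consults only the include search path; the plain
    # form tries the current working directory first.
    directories = list(search_path) if angle_brackets else [cwd, *search_path]
    for directory in directories:
        candidate = os.path.join(directory, name)
        if os.path.exists(candidate):
            return candidate
    return None
```

Directories given with ‘-I’ would simply be placed at the front of `search_path`.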
#include_once <file>
#include_once file
Same as #include, except that, if the file has already been included, it will not be included again.
#line num
#line num "file"
This line causes dicod to believe, for purposes of error diagnostics, that the line number of the next source line is given by num and the current input file is named by file. If the latter is absent, the remembered file name does not change.
# num "file"
This is a special form of the #line statement, understood for compatibility with the C preprocessor.
In fact, these statements provide rudimentary preprocessing features. For more sophisticated ways to modify configuration before parsing, see Using Preprocessor to Improve the Configuration.
A simple statement consists of a keyword and value separated by any amount of whitespace. A simple statement is terminated with a semicolon (‘;’), unless it contains a here-document (see below), in which case the semicolon is optional.
Examples of simple statements:
timing yes;
access-log-file /var/log/access_log;
A keyword begins with a letter and may contain letters, decimal digits, underscores (‘_’) and dashes (‘-’). Examples of keywords are: ‘group’, ‘identity-check’.
A value can be one of the following:
A number is a sequence of decimal digits.
A boolean value is one of the following: ‘yes’, ‘true’, ‘t’ or ‘1’, meaning true, and ‘no’, ‘false’, ‘nil’, ‘0’ meaning false.
An unquoted string may contain letters, digits, and any of the following characters: ‘_’, ‘-’, ‘.’, ‘/’, ‘:’.
A quoted string is any sequence of characters enclosed in double-quotes (‘"’). A backslash appearing within a quoted string introduces an escape sequence, which is replaced with a single character according to the following rules:
| Sequence | Replaced with |
| \a | Audible bell character (ASCII 7) |
| \b | Backspace character (ASCII 8) |
| \f | Form-feed character (ASCII 12) |
| \n | Newline character (ASCII 10) |
| \r | Carriage return character (ASCII 13) |
| \t | Horizontal tabulation character (ASCII 9) |
| \\ | A single backslash (‘\’) |
| \" | A double-quote. |
Table 3.1: Backslash escapes
In addition, the sequence ‘\newline’ is removed from the string. This allows splitting long strings over several physical lines, e.g.:
"a long string may be\
 split over several lines"
If the character following a backslash is not one of those specified above, the backslash is ignored and a warning is issued.
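The escape rules above can be condensed into a short Python sketch. The function name `unescape` is hypothetical, and the treatment of a lone backslash at the very end of the string (it is kept) is an assumption, since the text does not cover that case.

```python
import warnings

# Escape sequences from Table 3.1.
ESCAPES = {
    "a": "\a", "b": "\b", "f": "\f", "n": "\n",
    "r": "\r", "t": "\t", "\\": "\\", '"': '"',
}


def unescape(body):
    """Interpret backslash escapes in the body of a quoted string."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\" or i + 1 == len(body):
            # Ordinary character (or a trailing backslash: kept, by assumption).
            out.append(ch)
            i += 1
            continue
        nxt = body[i + 1]
        if nxt == "\n":
            pass  # backslash-newline is removed from the string
        elif nxt in ESCAPES:
            out.append(ESCAPES[nxt])
        else:
            # Unknown escape: the backslash is ignored and a warning is issued.
            warnings.warn(f"unknown escape sequence \\{nxt}")
            out.append(nxt)
        i += 2
    return "".join(out)
```

The function receives the text between the double quotes, so `unescape("a long string may be\\\n split over several lines")` yields a single line of text.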
Two or more adjacent quoted strings are concatenated, which gives another way to split long strings over several lines to improve readability. The following fragment produces the same result as the example above:
"a long string may be"
" split over several lines"
A here-document is a special construct that allows introducing strings of text containing embedded newlines.
The <<word construct instructs the parser to read all
the following lines up to the line containing only word, with
possible trailing blanks. Any lines thus read are concatenated
together into a single string. For example:
<<EOT
A multiline
string
EOT
The body of a here-document is interpreted the same way as a double-quoted string, unless word is preceded by a backslash (e.g. ‘<<\EOT’) or enclosed in double-quotes, in which case the text is read as is, without interpretation of escape sequences.
If word is prefixed with - (a dash), then all leading
tab characters are stripped from the input lines and from the line containing
word. Furthermore, if - is followed by a single space,
all leading whitespace is stripped from them. This allows you to indent
here-documents in a natural fashion. For example:
<<- TEXT
All leading whitespace will be
ignored when reading these lines.
TEXT
It is important that the terminating delimiter be the only token on its line. The only exception to this rule is allowed if a here-document appears as the last element of a statement. In this case a semicolon can be placed on the same line with its terminating delimiter, as in:
help-text <<-EOT
A sample help text.
EOT;
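The here-document rules can be sketched in Python as shown below. The function name `read_heredoc` is hypothetical. Two points are assumptions rather than documented behavior: the body keeps the newline that ends each line, and a ‘-’ prefix comes before a backslash or quote when both are used.

```python
def read_heredoc(intro, lines):
    """Read a here-document.

    intro -- the text that follows '<<' on the opening line, e.g. '-EOT'
    lines -- the following input lines, each ending in a newline
    Returns (body, raw); raw is True when escape sequences must not be
    interpreted in the body.
    """
    strip = None
    if intro.startswith("- "):
        strip, intro = " \t", intro[2:]   # '<<- WORD': strip all leading whitespace
    elif intro.startswith("-"):
        strip, intro = "\t", intro[1:]    # '<<-WORD': strip leading tabs only
    raw = intro.startswith("\\") or intro.startswith('"')
    word = intro.lstrip("\\").strip('"')
    body = []
    for line in lines:
        if strip:
            line = line.lstrip(strip)
        # The delimiter stands alone on its line, with optional trailing
        # blanks; a semicolon may follow it at the end of a statement.
        if line.rstrip(" \t\n") in (word, word + ";"):
            return "".join(body), raw
        body.append(line)
    raise ValueError(f"unterminated here-document, expected {word!r}")
```

When `raw` is False, the returned body would still have to go through escape processing, as for a double-quoted string.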
A list is a comma-separated list of values. Lists are delimited by parentheses. The following example shows a statement whose value is a list of strings:
capability (mime,auth);
In any case where a list is appropriate, a single value is allowed without being a member of a list: it is equivalent to a list with a single member. This means that, e.g. ‘capability mime;’ is equivalent to ‘capability (mime);’.
A block statement introduces a logical group of other statements. It consists of a keyword, followed by an optional value, and a sequence of statements enclosed in curly braces, as shown in the example below:
load-module outline {
command "outline";
}
The closing curly brace may be followed by a semicolon, although this is not required.
Server settings control how dicod is executed on the
server machine.
Run with the privileges of this user. Dicod does not
require root privileges, so it is recommended to always use this
statement when running dicod in daemon mode.
See section Daemon Operation Mode.
Example:
user nobody;
If user is given, dicod will drop all supplementary
groups and switch to the principal group of that user. Sometimes,
however, it may be necessary to retain one or more supplementary
groups. For example, this might be necessary to access dictionary
databases. The group statement retains the supplementary
groups listed in list, e.g.:
user nobody;
group (man, dict);
This statement is ignored if the user statement is not present or
if dicod is running in inetd mode. See section Inetd Operation Mode.
Sets server operation mode. The argument is one of:
Run in daemon mode. See section Daemon Operation Mode, for a detailed description.
Run in inetd mode. See section Inetd Operation Mode, for a detailed description.
This statement is overridden by the ‘--inetd’ command line option. See –inetd.
Specify IP addresses and ports to listen on in daemon mode.
By default, dicod will listen on port 2628 on all existing
interfaces. Use the listen statement to restrict the list of
interfaces to listen on, or to change the port number.
Elements of list can have the following form:
Specifies an IPv4 socket to listen on. The host part is either a host name or an IP in “dotted-quad” form. The port part is either a numeric port number or a symbolic service name which is found in ‘/etc/services’ file.
Either of the two parts may be omitted. If host is omitted, it defaults to ‘0.0.0.0’, which means “listen on all interfaces”. If port is omitted, it defaults to 2628. In this case the colon may be omitted, too.
Examples:
listen localhost:2628;
listen 127.0.0.1;
listen :2628;
Specifies the name of a UNIX socket to listen on.
The following statement instructs dicod to listen on
the address ‘10.10.10.1’, port 2628 and on the UNIX
socket ‘/var/run/dict’:
listen (10.10.10.1, /var/run/dict);
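The defaulting rules for listen elements can be illustrated with a small Python sketch. The function name `parse_listen` is hypothetical, and treating any element that begins with a slash as a UNIX socket name is an assumption made for this example.

```python
import socket

DEFAULT_HOST = "0.0.0.0"   # listen on all interfaces
DEFAULT_PORT = 2628        # the standard DICT port


def parse_listen(spec):
    """Split one element of a listen list into its parts."""
    if spec.startswith("/"):
        # Assumption: an absolute file name denotes a UNIX socket.
        return ("unix", spec)
    host, _, port = spec.partition(":")
    if not port:
        port_number = DEFAULT_PORT
    elif port.isdigit():
        port_number = int(port)
    else:
        # A symbolic service name, looked up in the services database.
        port_number = socket.getservbyname(port, "tcp")
    return ("inet", host or DEFAULT_HOST, port_number)
```

With this sketch, the elements of ‘listen (10.10.10.1, /var/run/dict);’ come out as an IPv4 socket on port 2628 and a UNIX socket.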
Store PID of the master process in this file.
Default is ‘localstatedir/run/dicod.pid’.
Notice that the permissions of this default directory
may be insufficient for dicod to write there after switching
to the user's privileges (see user statement). One solution to this is
to create a subdirectory with the same owner as given by the user
statement and to point the PID file there:
pidfile /var/run/dict/dicod.pid;
Another solution is to make the PID directory group-writable and
to add the owner group to the group statement (see group statement).
Sets maximum number of sub-processes that can run simultaneously. This is equivalent to the number of clients that can simultaneously use the server. The default is 64 sub-processes.
Set inactivity timeout to the number of seconds. The server will disconnect automatically if the remote client has not sent any command within this number of seconds. Setting timeout to 0 disables inactivity timeout (the default).
Using this statement along with max-children allows you to control
the server load.
When the master server is shutting down, wait this number of seconds for all children to terminate. Default is 5 seconds.
Enable identification check using AUTH protocol
(RFC 1413). The received user name or UID can
be shown in access log using %l format (see section Access Log).
Use encryption keys from the named file to decrypt AUTH replies encrypted using DES.
Set timeout for AUTH input/output operation to number of seconds. Default timeout is 3 seconds.
The server may be configured to request authentication in order to make some databases or some additional information available to the user. Another possible use of authentication is to minimize resource utilization on the server machine.
Authentication setup is simple: first, you define a user
authentication database, then you enable it by declaring auth
server capability (see section Server Capabilities):
capability auth;
User authentication database keeps, for each user name, the corresponding plain text password, and, optionally, names of the groups this user belongs to. Notice, that due to the specifics of DICT authentication scheme (see section The AUTH Command), user passwords are stored in plain text, therefore special care must be taken to protect the contents of your authentication database from compromise.
The database is defined using user-db block statement:
Declare user authentication database.
Dico's authentication is designed so that various authentication database formats may easily be added. A database is identified by its URL, or Uniform Resource Locator. It consists of the following parts (square brackets denoting optional ones):
type://[[user[:password]@]host]/path[params]
type
A database type, or format. See below for the list of available database formats.
user
User name necessary to access the database.
password
User password necessary to access the database.
host
Domain name or IP address of a machine running the database.
path
A path to the database. The exact meaning of this element depends on the database protocol. It is described in detail when discussing particular database protocols.
params
A list of protocol-dependent parameters. Each parameter is of the
form keyword=name, multiple parameters are separated
with semicolons.
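To make the URL layout more concrete, the following Python sketch splits such a URL into its parts using the standard urllib.parse module. The function name `parse_db_url` is hypothetical. The text does not say how params are set off from path; the sketch assumes that they start at the first semicolon.

```python
from urllib.parse import urlsplit


def parse_db_url(url):
    """Split an authentication database URL into its documented parts."""
    parts = urlsplit(url)
    # Assumption: parameters follow the path and begin at the first ';'.
    path, _, params = parts.path.partition(";")
    return {
        "type": parts.scheme,
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "path": path,
        "params": dict(p.split("=", 1) for p in params.split(";") if p),
    }
```

For ‘text:///var/db/dico’ this gives the type ‘text’, the path ‘/var/db/dico’, and no user, password or host.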
If the underlying mechanism requires some additional configuration data that cannot be supplied using URL, these are passed to it using the following statement:
The argument is treated as an opaque string and passed to the authentication ‘open’ procedure verbatim. Its exact meaning depends on the type of the database.
The URL defines how the database is accessed. Another important point is where to get user data from. This is specified by the following two sub-statements:
Database resource returning user password.
Database resource returning user groups.
The exact semantics of database resource depends on the type of database being used. For flat text databases, resource means the name of a text file that contains these data, for SQL databases, resource is an SQL query, etc. Below we will discuss URLs and resources used by each database type.
To summarize, the definition of an authentication database is:
# Define user database for authentication.
user-db url {
  # Additional configuration options.
  options string;
  # Name of a password resource.
  password-resource resource;
  # Name of a resource returning user group information.
  group-resource resource;
}
A text authentication database consists of one or two flat text files — a password file, which contains user passwords, and a group file, which contains user groups. The latter is optional. Both files have the same format:
Record keys in a password file must be unique, i.e. no two records may contain the same first field. Group file may contain multiple records with the same key. For example:
$ grep smith pass
smith guessme
$ grep smith group
smith user
smith timing
smith tester
This means that user ‘smith’ has password ‘guessme’ and is a member of three groups: ‘user’, ‘timing’ and ‘tester’.
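The uniqueness rule for the password file and the multiple-record rule for the group file can be illustrated with a short Python sketch. It assumes, based only on the grep output above, that each record consists of a key and a value separated by white space; the function names are hypothetical.

```python
from collections import defaultdict


def load_passwords(lines):
    """Map each user name to its password; keys must be unique."""
    passwords = {}
    for line in lines:
        fields = line.split(None, 1)
        if len(fields) != 2:
            continue  # skip empty or incomplete lines
        user, password = fields[0], fields[1].strip()
        if user in passwords:
            raise ValueError(f"duplicate password record for {user!r}")
        passwords[user] = password
    return passwords


def load_groups(lines):
    """Map each user name to the list of its groups; keys may repeat."""
    groups = defaultdict(list)
    for line in lines:
        fields = line.split(None, 1)
        if len(fields) == 2:
            groups[fields[0]].append(fields[1].strip())
    return dict(groups)


passwords = load_passwords(["smith guessme\n"])
groups = load_groups(["smith user\n", "smith timing\n", "smith tester\n"])
```

Here `passwords` maps ‘smith’ to ‘guessme’, and `groups` maps ‘smith’ to the three groups from the example.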
A URL of a text database begins with ‘text’ and
contains only path element, which gives the name of the
directory where the database files reside. The name of a password
file is given by the password-resource statement. The name of a
group file is given by the group-resource statement.
For example, if user passwords are kept in file ‘passwd’ and user groups are kept in file ‘group’, and both files reside in the ‘/var/db/dico’ directory, then the appropriate database configuration will be:
user-db text:///var/db/dico {
password-resource passwd;
group-resource group;
}
To configure an LDAP user database, you first need to load the ‘ldap’ module (see section LDAP module):
load-module ldap {
command "ldap";
}
The URL of the database is: ‘ldap://host[:port]’, where host is the host name or IP address of the LDAP server, and the optional port specifies the port number it is listening on (by default, port 389 is assumed).
The password-resource statement specifies the name of an
attribute containing the password, and the group-resource
supplies the name of the attribute with group name.
Additional configuration data are supplied in the options
statement, whose argument is a whitespace-separated list of
assignments:
Sets base DN.
Sets the DN to bind as.
Sets the password.
When set to ‘yes’, enables the use of TLS encryption.
Sets OpenLDAP debug level.
An LDAP filter that selects the objects describing the given user. Any occurrence of ‘$user’ in the filter is replaced with the actual user name obtained during authentication. Variable expansion occurs in much the same way as in the shell. In particular, the variable is expanded only if it is not immediately followed by an alphanumeric character. For example, it is expanded in:
(uid=$user)
and
(uid=$user.1)
But it is not expanded in
(uid=$users)
If it is necessary to expand the variable in such a context, enclose its name in curly braces:
(uid=${user}s)
An LDAP filter that selects the user groups. The
filter is expanded as in user-filter.
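The expansion rules for ‘$user’ happen to coincide with those of Python's string.Template, which makes for a compact illustration. This is only an analogy; the LDAP module does not use Python, and the function name `expand_filter` is hypothetical.

```python
from string import Template


def expand_filter(filter_text, user):
    """Expand $user and ${user} in an LDAP filter, shell-style."""
    # safe_substitute leaves unknown names such as $users untouched.
    return Template(filter_text).safe_substitute(user=user)


expanded = [
    expand_filter("(uid=$user)", "smith"),
    expand_filter("(uid=$user.1)", "smith"),
    expand_filter("(uid=$users)", "smith"),
    expand_filter("(uid=${user}s)", "smith"),
]
```

Note that the sketch does not escape characters that are special in LDAP filters; code handling untrusted user names would need to do that.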
The following example shows an LDAP user database configured for base DN ‘example.com’ which uses ‘posixAccount’ and ‘posixGroup’ objects from ‘nis.schema’:
user-db "ldap://localhost" {
password-resource userPassword;
group-resource cn;
options "user-filter=(uid=$user) "
"group-filter=(&(objectClass=posixGroup)(memberuid=$user)) "
"base=dc=example,dc=com";
}
A note on password usage is in order here. Most authentication methods require the passwords to be stored in the database in plain text form. The use of encrypted passwords (e.g. MD5 or SHA1) is possible only with ‘LOGIN’ and ‘PLAIN’ GSASL authentication methods.
Access control lists, or ACLs for short, are lists of
permissions that can be applied to certain dicod objects.
They can be used to control who can connect to the dictionary server
and what resources are offered to whom.
An ACL is defined using acl block statement:
acl name {
definitions
}
The name parameter specifies a unique name for that ACL. This name will be used by other configuration statements (see section Security Settings, and see section Database Visibility) to refer to that ACL.
A part between the curly braces (denoted by definitions above), is a list of access statements. There are two types of such statements:
Allow access to resource.
Deny access to resource.
All parts of an access statement are optional, but at least one of them must be present.
The user-group part specifies which users match this entry. Allowed values are the following:
all
All users.
authenticated
Only authenticated users.
group group-list
Authenticated users who are members of at least one of the groups listed in group-list.
The sub-acl part, if present, allows branching to another ACL. The syntax of this group is:
acl name
where name is the name of a previously defined ACL.
Finally, the host-list group allows matching client addresses.
It consists of a from keyword followed by a list of
address specifiers. Allowed address specifiers are:
Matches if the client IP equals addr. The latter may be given either as an IP address or as a host name, in which case it will be resolved and the first of its IP addresses will be used.
Matches if the first netlen bits of the client IP address are equal to addr. The network mask length, netlen, must be an integer in the range from 0 to 32. The address part, addr, is as described above.
The specifier matches if the result of a logical AND between the client IP address and netmask equals addr. The network mask must be specified in “dotted quad” form, e.g. ‘255.255.255.224’.
Matches if connection was received from a UNIX socket filename, which must be given as an absolute file name.
To summarize, the syntax of an access statement is:
allow|deny [all|authenticated|group group-list]
[acl name] [from addr-list]
where square brackets denote optional parts and vertical bar means ‘one of’.
When an ACL is applied to a particular object, its entries
are tried in turn until one of them matches, or the end of the list is
reached. If a matched entry is found, its command verb, allow
or deny, defines the result of ACL match. If the end
of list is reached, the result is ‘allow’, unless explicitly
specified otherwise.
For example, the following statement defines an ACL named ‘common’, that allows access for any user connected via local UNIX socket ‘/tmp/dicod.sock’ or coming from a local network ‘192.168.10.0/24’. Any authenticated users are allowed, provided that they are allowed by another ACL ‘my-nets’ (which should have been defined before this definition). Users coming from the network ‘10.10.0.0/24’ are allowed if they authenticate themselves and are members of groups ‘dicod’ or ‘users’. Access is denied for anybody else:
acl common {
allow all from ("/tmp/dicod.sock", "192.168.10.0/24");
allow authenticated acl "my-nets";
allow group ("dicod", "users") from "10.10.0.0/24";
deny all;
}
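The matching procedure can be sketched in Python with the standard ipaddress module. This is a simplified model, not dicod's implementation: all class and function names are hypothetical, host names are not resolved, and a sub-ACL is assumed to match when evaluating it yields ‘allow’. The contents of ‘my-nets’ are invented for the example.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Client:
    address: str                   # client IP or UNIX socket file name
    authenticated: bool = False
    groups: frozenset = frozenset()


@dataclass
class Entry:
    verb: str                      # "allow" or "deny"
    users: object = None           # None, "all", "authenticated" or a set of groups
    sub_acl: str = None            # name of another ACL, if any
    sources: tuple = ()            # address specifiers of the 'from' part


def source_matches(spec, address):
    if spec.startswith("/") or address.startswith("/"):
        return spec == address     # UNIX sockets match by file name
    network = ipaddress.ip_network(spec, strict=False)
    return ipaddress.ip_address(address) in network


def entry_matches(entry, client, acls):
    if entry.users == "authenticated" and not client.authenticated:
        return False
    if isinstance(entry.users, (set, frozenset)):
        if not (client.authenticated and entry.users & client.groups):
            return False
    if entry.sub_acl is not None and not evaluate(acls[entry.sub_acl], client, acls):
        return False
    if entry.sources and not any(source_matches(s, client.address) for s in entry.sources):
        return False
    return True


def evaluate(entries, client, acls):
    """Return True if the ACL allows the client."""
    for entry in entries:
        if entry_matches(entry, client, acls):
            return entry.verb == "allow"
    return True                    # end of list reached: allow


acls = {
    # Hypothetical contents of 'my-nets'.
    "my-nets": [Entry("allow", sources=("172.16.0.0/12",)), Entry("deny", "all")],
}
acls["common"] = [
    Entry("allow", "all", sources=("/tmp/dicod.sock", "192.168.10.0/24")),
    Entry("allow", "authenticated", sub_acl="my-nets"),
    Entry("allow", frozenset({"dicod", "users"}), sources=("10.10.0.0/24",)),
    Entry("deny", "all"),
]
```

For instance, `evaluate(acls["common"], Client("192.168.10.7"), acls)` allows the client, while an unauthenticated client from ‘10.10.0.5’ falls through to the final ‘deny all’.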
See section Security Settings, for information on how to control daemon security settings.
See section Database Visibility, for a detailed description on how to use ACLs to control access to databases.
This subsection describes configuration settings that control access
to various resources served by dicod.
Use ACL acl-name to control incoming connections. The ACL itself must be defined before this statement. Using user-group (see previous subsection) in this ACL makes no sense, because authentication is performed after connection is established.
acl incoming-conn {
allow from 213.130.0.0/19;
  deny all;
}
connection-acl incoming-conn;
This statement controls whether to show system information in reply
to SHOW SERVER command (see section SHOW SERVER). The
information will be shown only if ACL acl-name allows it.
The system information shown includes the following data: name of the package and its version, name of the system where it was built and the kernel version thereof, host name, total operational time of the daemon, number of subprocesses executed so far and average usage frequency. For example:
dicod (dico 2.1) on Linux 2.6.24.4, dict.example.net up 110+04:42:58, 19647044 forks (6867.9/hour)
The directives described in this subsection provide basic logging capabilities.
Prefix syslog messages with this string. By default, the program name is used.
Set syslog facility to use. Allowed values are: ‘user’, ‘daemon’, ‘auth’, ‘authpriv’, ‘mail’, ‘cron’, ‘local0’ through ‘local7’ (case-insensitive), or a facility number.
Prefix diagnostics messages with a string identifying their severity.
Log session transcript. The lines received from client are prefixed with ‘C:’, those sent in reply are marked with ‘S:’. Here is an excerpt from the transcript output:
S: 220 Trurl.gnu.org.ua dicod (dico 1.99.90) <mime.xversion> <1645.1212874507@Trurl.gnu.org.ua>
C: client ``Kdict''
S: 250 ok
C: show db
S: 110 16 databases present
S: afr-deu ``Afrikaans-German Freedict dictionary''
S: afr-eng ``Afrikaans-English FreeDict Dictionary''
[...]
S: .
S: 250 ok
This option produces lots of output and can significantly slow down
the server. Use it only if you are debugging dicod or
some remote client. Never use it in a production environment.
GNU Dico provides a feature similar to Apache's CustomLog, which
allows you to keep a log of MATCH and DEFINE requests. To
enable this feature, specify the name of the log file using the
following directive:
Set access log file name.
access-log-file /var/log/dico/access.log;
The format of log file entries is specified using
access-log-format directive:
Set format string for access log file.
Its argument can contain literal characters, which are copied into the log file verbatim, and format specifiers, i.e. special sequences which begin with ‘%’ and are replaced in the log file as shown in the table below.
The percent sign.
Remote IP-address.
Local IP-address.
Size of response in bytes.
Size of response in bytes in CLF format, i.e. a ‘-’ rather than a ‘0’ when no bytes are sent.
Remote client (from the CLIENT command, see section The CLIENT Command).
The time taken to serve the request, in microseconds.
Request command verb in abbreviated form, suitable for use in
URLs, i.e. ‘d’ for DEFINE, and ‘m’ for
MATCH. See section DICT URL.
Remote host.
Request command verb (DEFINE or MATCH).
Remote logname (from identd, if supplied). This will return a
dash unless identity-check is set to true.
See identity-check.
The search strategy.
The canonical port of the server serving the request.
The PID of the child that serviced the request.
The database from the request.
Full request.
The nth token from the request (n is 0-based).
Reply status. For multiple replies, the form ‘%s’ returns the status of the first reply, while ‘%>s’ returns that of the last reply.
Time the request was received in the standard Apache format, e.g.:
[04/Jun/2008:11:05:22 +0300]
The time, in the form given by format, which should be a valid
strftime format. See section Time and Date Formats, for a detailed
description.
The standard ‘%t’ format is equivalent to
[%d/%b/%Y:%H:%M:%S %z]
The time taken to serve the request, in seconds.
Remote user from AUTH command.
The host name of the server serving the request. See hostname directive.
Actual host name of the server (in case it was overridden in configuration).
The word from the request.
For reference, here is the list of format specifiers that
have a different meaning than in Apache: ‘%C’, ‘%H’, ‘%m’,
‘%q’. The following format specifiers are unique to dicod:
‘%d’, ‘%{n}R’, ‘%V’, ‘%W’.
Absence of the access-log-format directive is equivalent to
the following statement:
access-log-format "%h %l %u %t \"%r\" %>s %b";
It was chosen so as to be compatible with Apache access logs and
be easily parsable by existing log analyzing tools, such as
webalizer.
Extending this format string with the client name produces a log format similar to Apache ‘combined log’:
access-log-format "%h %l %u %t \"%r\" %>s %b \"\" \"%C\"";
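Because the default format follows the Apache common log layout, entries written with it can be picked apart with a regular expression. The Python sketch below shows one way to do that; the sample line is made up for illustration and does not come from a real log, and the function name `parse_log_line` is hypothetical.

```python
import re
from datetime import datetime

# Matches lines written with the default format: %h %l %u %t "%r" %>s %b
LOG_LINE = re.compile(
    r'(?P<host>\S+) (?P<logname>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>.*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)$'
)


def parse_log_line(line):
    """Parse one access log line; return a dict, or None if it does not match."""
    match = LOG_LINE.match(line.rstrip("\n"))
    if match is None:
        return None
    record = match.groupdict()
    # %t uses the strftime format [%d/%b/%Y:%H:%M:%S %z].
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    # %b prints '-' instead of 0 when no bytes were sent.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


# A hypothetical log entry, for illustration only.
sample = '192.0.2.10 - - [04/Jun/2008:11:05:22 +0300] "DEFINE en-pl-naut sprit" 250 117'
record = parse_log_line(sample)
```

Parsing the month name with ‘%b’ relies on the process running in an English or C locale.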
The settings in this subsection configure basic behavior of the DICT daemon.
Display the string in the textual part of the initial server reply.
When a connection is established, the server sends an initial reply to the client, which looks like the example below:
220 Trurl.gnu.org.ua <auth.mime> <520.1212912026@Trurl.gnu.org.ua>
See section Initial Reply, for a detailed description of its parts.
The part of this reply between the host name and the first angle
bracket is modifiable and can contain arbitrary text. You can use
initial-banner-text to add any additional information
there. Note that string may not contain newlines. For
example:
initial-banner-text "Please authenticate yourself,";
This statement produces the following initial reply (split over two lines for readability):
220 Trurl.gnu.org.ua Please authenticate yourself,
<auth.mime> <520.1212912026@Trurl.gnu.org.ua>
Set the hostname. By default, the server determines it automatically. If, however, it makes a wrong guess, you can fix it using this directive.
The server hostname is used, among other places, in the initial reply after the ‘220’ code (see above) and may also be displayed in the access log file using the ‘%v’ escape (see section Access Log).
Set server description to be shown in reply to SHOW SERVER
(see section SHOW SERVER) command.
The first line of the reply, after the usual ‘114’ response line,
shows the name of host where the server is running. If the settings
of show-sys-info (see section show-sys-info) allow, some
additional information about the system is printed.
The lines that follow are taken from the server-info
directive. It is common to specify string using
“here-document” syntax (see here-document), e.g.:
server-info <<EOT
Welcome to the FOO dictionary service.

Contact <dict@foo.org> if you have questions or
suggestions.
EOT;
Set the text to be displayed in reply to the HELP command.
The default reply to HELP command displays a list of commands understood by the server with a short description of each.
You can use help-text directive to append arbitrary
text to that output, provided that you begin string with a
plus sign, e.g.:
help-text <<-EOT
  +
  The commands beginning with an X are extensions.
  EOT;
|
If string begins with any character, except ‘+’, it will replace the default help output. For example:
help-text <<-EOT
  There is no help. See RFC 2229 for detailed information.
  EOT;
|
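The append-or-replace rule for help-text can be summarized in a few lines of Python. This is only a sketch of the rule as described above: the function name effective_help is hypothetical, and the exact way dicod joins the default output with the appended text (for instance, its handling of whitespace after the plus sign) is an assumption.

```python
def effective_help(default_help, help_text=None):
    """Return the HELP output for a given help-text setting (illustrative)."""
    if help_text is None:
        # No help-text directive: the built-in command list is shown.
        return default_help
    if help_text.startswith("+"):
        # Leading plus sign: append the rest to the default output.
        return default_help + help_text[1:]
    # Any other first character: replace the default output entirely.
    return help_text

print(effective_help("DEFINE database word",
                     "+\nThe commands beginning with an X are extensions."))
```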
Set the name of the default matching strategy (see section The MATCH Command). By default, Levenshtein matching is used, which is equivalent to
default-strategy lev; |
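For readers unfamiliar with the measure behind the lev strategy, the following Python sketch computes the classic Levenshtein (edit) distance between two words. It illustrates the general technique only: it does not reproduce Dico's implementation, and the function name levenshtein is chosen here for illustration.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))      # distances from "" to prefixes of b
    for i, ca in enumerate(a, 1):
        cur = [i]                       # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,            # delete ca
                cur[j - 1] + 1,         # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        prev = cur
    return prev[-1]

print(levenshtein("kitten", "sitting"))
```

A matching strategy built on this measure would report headwords whose distance from the search word does not exceed some maximum value.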
Capabilities are certain server features that can be enabled or disabled at the system administrator's will.
Request additional capabilities from list.
The argument to capability directive must contain names
of existing dicod capabilities. These are listed in the
following table:
The AUTH command is supported. See section Authentication.
The OPTION MIME command is supported. Notice that
RFC 2229 requires all servers to support that command, so
you should always specify this capability.
The XVERSION command is supported. It is a GNU extension that
displays the dicod implementation and version number.
See section XVERSION.
The XLEV command is supported. This command lets the client set and
query the maximum Levenshtein distance for the lev matching strategy.
See section strategy. See section XLEV.
Capabilities set using the capability directive are
displayed in the initial server reply (see initial reply), and
appropriate entries are added to the HELP command output.
A database module is an external piece of software designed to
handle a particular format of dictionary databases. This piece of
software is built as a shared library so that dicod can load it
at run time.
A handler is an instance of a database module loaded by
dicod and configured for a specific database or a set of
databases.
Database handlers are defined using the following block statement:
Create an instance of a database module. The argument specifies a unique name
which will be used by subsequent parts of the configuration to refer to this
handler. The load-module statement is a block statement. The only
sub-statement allowed within it is the command statement:
Set the command line for this handler. It is similar to a shell command line: it consists of the name of a database module, optionally followed by a whitespace-separated list of its arguments. Just as in the shell, the name of the module specifies the disk file which should be loaded. Arguments are passed to the module initialization function (see dico_init).
For example:
load-module dict {
command "dictorg dbdir=/var/dicodb";
}
|
This statement defines a handler named ‘dict’, which loads the module ‘dictorg’ and passes its initialization function a single argument, ‘dbdir=/var/dicodb’. If the module name is not an absolute file name, as in this example, the loadable module is searched for in the module load path.
A module load path is an internal list of directories which
dicod scans in order to find a loadable file name specified
in command statement of a load-module block. By default the
order of search is as follows:
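1. Directories specified by the prepend-load-path directive (see below).
2. Directories specified by the module-load-path directive (see below).
3. Directories listed in LTDL_LIBRARY_PATH.
4. Directories listed in LD_LIBRARY_PATH.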
The values of LTDL_LIBRARY_PATH and LD_LIBRARY_PATH must be
colon-separated lists of absolute directory names, for example,
‘/usr/lib/mypkg:/lib/foo’.
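The search order can be sketched in Python as follows. This is an illustration under stated assumptions, not dicod's actual code: the function names are hypothetical, only the four sources named above are modelled, and any built-in default directory is left out.

```python
import os

def module_search_dirs(prepend_load_path, module_load_path, environ=None):
    """Build the list of directories in the order they would be scanned."""
    environ = os.environ if environ is None else environ
    # Directives come first, prepend-load-path before module-load-path.
    dirs = list(prepend_load_path) + list(module_load_path)
    # Then the colon-separated values of the two environment variables.
    for var in ("LTDL_LIBRARY_PATH", "LD_LIBRARY_PATH"):
        dirs.extend(d for d in environ.get(var, "").split(":") if d)
    return dirs

def find_module(name, dirs):
    """Return the first existing file called name in dirs, or None."""
    if os.path.isabs(name):
        return name if os.path.exists(name) else None
    for d in dirs:
        path = os.path.join(d, name)
        if os.path.exists(path):
            return path
    return None

print(module_search_dirs(["/usr/local/dico/lib"], ["/usr/lib/dico"],
                         {"LTDL_LIBRARY_PATH": "/usr/lib/mypkg:/lib/foo"}))
```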
In any of these directories, dicod first attempts to find and
load the given filename. If this fails, it tries to append the
following suffixes to it:
This directive adds the directories listed in its argument to the module load path. Example:
module-load-path (/usr/lib/dico,/usr/local/dico/lib); |
Same as module-load-path, but adds directories to the beginning
of the module load path. Example:
Dictionary databases are defined using database block
statement.
Define a dictionary database. At least two sub-statements must be
defined for each database: name and handler.
Set the name of this database (a single word). This name will be used to identify this database in DICT commands.
Specify the name of a handler for this database and any arguments for
it. This handler must be previously defined using load-module
statement (see section Database Modules and Handlers).
For example, the following fragment defines a database named
‘en-de’, which is handled by the ‘dictorg’ handler. The handler
is passed one argument, database=en-de:
database {
name "en-de";
handler "dictorg database=en-de";
}
|
More directives are available to fine-tune the database.
Supply a short description, to be shown in reply to the SHOW DB
command. The string may not contain newlines.
Use this statement if the database itself does not supply a description, or if its description is malformed.
In any case, if description directive is specified, its value
takes precedence over description string retrieved from the database
itself.
See section SHOW DB, for a description of SHOW DB command.
Supply a full description of the database. This description is shown
in reply to SHOW INFO (see section SHOW INFO) command. The
string is usually a multi-line text, so it is common to use
here-document syntax (see here-document), e.g.:
info <<- EOT
  This is a foo-bar dictionary.
  Copyright (C) 2008 foo-bar dict group.
  Distributed under the terms of GNU Free Documentation license.
  EOT;
|
Use this statement if the database itself does not supply a full description, or if its full description is malformed.
As with description, the value of info takes precedence
over info strings retrieved from the database.
The following two directives control the content type and transfer
encoding used when formatting replies from this database if
OPTION MIME (see section OPTION MIME) is in effect:
Set the content type of the reply. E.g.:
database {
name "foo";
handler "dictorg";
content-type "text/html";
...
}
|
Set transfer encoding to use when sending MIME replies for this database. Allowed values for enum are:
Use BASE64 encoding.
Use quoted-printable encoding.
A property called database visibility is associated with each
dictionary database. It determines whether the database appears in
the output of SHOW DB command, and takes part in dictionary
searches.
By default, all databases are publicly visible. You can, however, restrict their visibility globally as well as on a per-database basis. This can be achieved using visibility ACLs.
In general, the visibility of a database is controlled by two access control lists: global visibility ACL and database visibility ACL. The latter takes precedence over the former.
Both ACLs are defined using visibility-acl statement:
Set name of an ACL controlling database visibility. If used
in global scope, this statement sets global visibility ACL.
If used within a database block, it sets visibility
ACL for that particular database.
Consider the following example:
acl glob-vis {
allow authenticated;
deny all;
}
acl local-nets {
allow from (192.168.10.0/24, /tmp/dicod.sock);
}
visibility-acl glob-vis;
database {
name "terms";
visibility-acl local-nets;
}
|
In this configuration, the ‘terms’ database is visible to everybody coming from the ‘192.168.10.0/24’ network and from the UNIX socket ‘/tmp/dicod.sock’, without authorization. It is not visible to users coming from elsewhere, unless they authenticate themselves.
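The interplay of the two ACLs in this example can be modelled with a short Python sketch. It is an illustration only: all names are hypothetical, an ACL is represented as a list of (allow, predicate) pairs, and the assumption that a database ACL with no matching rule falls through to the global ACL is inferred from the behaviour described above rather than taken from the dicod sources.

```python
import ipaddress

def from_sources(*sources):
    """Predicate: client comes from one of the networks or socket paths."""
    def pred(client):
        for src in sources:
            if src.startswith("/"):            # UNIX socket path
                if client["addr"] == src:
                    return True
                continue
            try:
                ip = ipaddress.ip_address(client["addr"])
            except ValueError:
                continue                       # client is not an IP peer
            if ip in ipaddress.ip_network(src):
                return True
        return False
    return pred

def authenticated(client):
    return client["authenticated"]

def everyone(client):
    return True

def evaluate(acl, client):
    """Return True/False from the first matching rule, None if none match."""
    for allow, pred in acl:
        if pred(client):
            return allow
    return None

def is_visible(client, global_acl=None, db_acl=None):
    # The database ACL is consulted first; the global ACL is the fallback.
    for acl in (db_acl, global_acl):
        if acl is not None:
            verdict = evaluate(acl, client)
            if verdict is not None:
                return verdict
    return True                                # publicly visible by default

glob_vis = [(True, authenticated), (False, everyone)]
local_nets = [(True, from_sources("192.168.10.0/24", "/tmp/dicod.sock"))]

client = {"addr": "192.168.10.7", "authenticated": False}
print(is_visible(client, glob_vis, local_nets))
```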
A default search is a MATCH request with ‘*’ or
‘!’ as the database argument (see section The MATCH Command). The former means
search in all available databases, the latter means search in all
databases until a match is found.
Default searches may be quite expensive and may cause considerable
strain on the server. For example, the command MATCH * prefix
"" returns all entries from all available databases, which would
consume a lot of resources both on the server and on the client side.
To minimize harmful effects from such potentially dangerous requests, Dico lets you limit the use of certain strategies in default searches.
Restrict the use of strategy name in default searches.
The statements define conditions that the search word (the last
argument of a MATCH command) must meet for the request to be denied. The
following statements are defined:
Unconditionally deny this strategy in default searches.
Deny this strategy if the search word matches one of the words from list.
Deny if length of the search word is less than number.
Deny if length of the search word is less than or equal to number.
Deny if length of the search word is greater than number.
Deny if length of the search word is greater than or equal to number.
Deny if length of the search word is equal to number.
Deny if length of the search word is not equal to number.
For example, the following statement denies the use of ‘prefix’ strategy in default searches if its argument is an empty string:
strategy prefix {
deny-length-eq 0;
}
|
If the dicod daemon is configured this way, it will always return
a ‘552’ reply to the commands MATCH * prefix "" or MATCH
! prefix "". However, the use of an empty prefix on a specific database, as
in MATCH eng-deu prefix "", will still be allowed.
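The way such conditions restrict default searches can be sketched in Python. The sketch below is not dicod's implementation: the function name and the dictionary keys (all, words, length and the comparison names) are hypothetical stand-ins for the statements described above.

```python
import operator

# Hypothetical names for the six length comparisons.
LENGTH_TESTS = {
    "lt": operator.lt, "le": operator.le,
    "gt": operator.gt, "ge": operator.ge,
    "eq": operator.eq, "ne": operator.ne,
}

def deny_default_search(database, word, rules):
    """Return True if a MATCH request should be denied under rules."""
    if database not in ("*", "!"):
        return False            # restrictions apply to default searches only
    if rules.get("all"):
        return True             # strategy is denied unconditionally
    if word in rules.get("words", ()):
        return True             # search word is on the deny list
    for name, number in rules.get("length", {}).items():
        if LENGTH_TESTS[name](len(word), number):
            return True         # a length condition holds
    return False

prefix_rules = {"length": {"eq": 0}}   # mirrors the example above
print(deny_default_search("*", "", prefix_rules))
print(deny_default_search("eng-deu", "", prefix_rules))
```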
While tuning your server, it is often necessary to get timing
information which shows how much time is spent serving certain
requests. This can be achieved using timing configuration
directive:
Provide timing information after successful completion of an
operation. This information is displayed after the following
requests: MATCH, DEFINE, and QUIT. It consists
of the following parts:
[d/m/c = nd/nm/nc RTr UTu STs] |
where:
nd: Number of processed define requests. It is ‘0’ after a MATCH request.
nm: Number of processed match requests. It is ‘0’ after a DEFINE request.
nc: Number of comparisons made. This value may be inaccurate if the underlying database module is not able to count comparisons.
RT: Real time spent serving the request.
UT: Time in user space spent serving the request.
ST: Time in kernel space spent serving the request.
An example of a server reply with timing information follows:
250 Command complete [d/m/c = 0/63/107265 2.293r 1.120u 0.010s] |
You can also add timing information to your access log files, see %T.
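When collecting such figures from many replies, it may help to extract them programmatically. The following Python sketch parses the timing suffix shown in the example reply; the function name parse_timing and the dictionary keys are assumptions made for illustration, and the pattern is based solely on the format given above.

```python
import re

# Matches "[d/m/c = nd/nm/nc RTr UTu STs]" as it appears in a reply.
TIMING_RE = re.compile(
    r"\[d/m/c = (?P<nd>\d+)/(?P<nm>\d+)/(?P<nc>\d+) "
    r"(?P<rt>\d+(?:\.\d+)?)r (?P<ut>\d+(?:\.\d+)?)u (?P<st>\d+(?:\.\d+)?)s\]"
)

def parse_timing(reply):
    """Return the timing fields of a reply line, or None if absent."""
    m = TIMING_RE.search(reply)
    if m is None:
        return None
    return {
        "defines": int(m.group("nd")),
        "matches": int(m.group("nm")),
        "comparisons": int(m.group("nc")),
        "real": float(m.group("rt")),
        "user": float(m.group("ut")),
        "system": float(m.group("st")),
    }

line = "250 Command complete [d/m/c = 0/63/107265 2.293r 1.120u 0.010s]"
print(parse_timing(line))
```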
Aliases allow a string to be substituted for a word when it is used
as the first word of a command. The daemon maintains a list of
aliases that are created using the alias configuration file
statement:
Create a new alias.
Aliases facilitate manual interaction with the server,
as they let you create abbreviations for frequently typed
commands. For example, the following alias creates a new command
d which is equivalent to DEFINE *:
alias d DEFINE "*"; |
Aliases may be recursive, i.e. the first word of command may refer to another alias. For example:
alias d DEFINE;
alias da d "*";
|
This configuration will produce the following expansion:
da word ⇒ DEFINE * word |
To prevent endless loops, recursive expansion is stopped if the first word of the replacement text is identical to an alias expanded earlier.
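The expansion rule, including the loop guard, can be sketched in Python. This is a minimal model rather than dicod's code: the function name expand_aliases and the representation of aliases as a dictionary of word lists are assumptions, and command names are compared case-sensitively for simplicity.

```python
import shlex

def expand_aliases(command, aliases):
    """Expand the first word of command repeatedly using aliases."""
    words = shlex.split(command)
    expanded = set()                 # aliases already expanded
    while words and words[0] in aliases and words[0] not in expanded:
        name = words[0]
        expanded.add(name)
        # Substitute the replacement text for the first word.
        words = list(aliases[name]) + words[1:]
    return words

aliases = {"d": ["DEFINE"], "da": ["d", "*"]}
print(expand_aliases("da word", aliases))
```

Expansion stops as soon as the first word is either not an alias or an alias that was already expanded, so a pair of aliases that refer to each other cannot loop forever.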
Before parsing the configuration file, dicod preprocesses
it. The built-in preprocessor handles only file inclusion
and #line statements (see section Pragmatic Comments), while the
rest of traditional preprocessing facilities, such as macro expansion,
is supported via m4, which is used as an external preprocessor.
The detailed description of m4 facilities lies far beyond
the scope of this document. You will find a complete user manual in
http://www.gnu.org/software/m4/manual.
For the rest of this subsection we assume the reader is sufficiently
acquainted with m4 macro processor.
The external preprocessor is invoked with the ‘-s’ flag, instructing it to include line synchronization information in its output. This information is then used by the parser to display meaningful diagnostics. An initial set of macro definitions is supplied by the ‘pp-setup’ file, located in the ‘$prefix/share/dico/version/include’ directory (where version means the version of the GNU Dico package).
The default ‘pp-setup’ file renames all m4 built-in
macro names so they all start with the prefix ‘m4_’. This
is similar to the GNU m4 ‘--prefix-builtins’ option, but has the
advantage that it works with non-GNU m4 implementations as
well.
As an example of how the use of preprocessor may improve
dicod configuration, consider the following fragment taken
from one of the installations of GNU Dico. This installation offers quite
a few Freedict dictionaries. The database definition for each of them
is almost the same, except for the dictionary name and an optional
description entry for the several databases that lack one. To avoid
repeating the same text over again, we define the following macro:
# defdb(NAME[, DESCR])
# Produce a standard definition for a database NAME.
# If DESCR is given, use it as a description.
m4_define(`defdb', `
database {
        name "$1";
        handler "dictorg database=$1";m4_dnl
m4_ifelse(`$2',,,`
        description "$2";')
}
')
|
It takes two arguments. The first one, NAME, defines the dictionary
name visible in the output of the SHOW DB command. The optional second
argument may be used to supply a description string for the databases
that lack one.
Given this macro, the database definitions look like:
defdb(eng-swa)
defdb(swa-eng)
defdb(afr-eng, Afrikaans-English Dictionary)
defdb(eng-afr, English-Afrikaans Dictionary)
|
Apart from issuing a descriptive error message, dicod
attempts to indicate the reason for its termination by its exit code.
As usual, zero exit code indicates normal termination. The table
below summarizes all possible error codes. For each error code, it
indicates its decimal value and its symbolic name from
‘include/sysexits.h’ (if available).
Program terminated correctly.
Only child instances of dicod exit with this code. It
indicates that the child did not receive any ‘DICT’ command
within a time out interval (see inactivity-timeout).
The program was invoked incorrectly, e.g. an invalid option was given, or an option was given an erroneous argument.
Dicod cannot switch to the privileges of the user it is
configured to run as (see user statement).
Server exited due to some error not otherwise described in this table.
Some internal software error occurred.
Some system error occurred, e.g. the program ran out of memory or file descriptors, ‘fork’ failed, etc.
An error in the configuration file was detected.
This section summarizes dicod command line options.
Read this configuration file instead of the default ‘$sysconfdir/dicod.conf’. See section Configuration.
Operate in foreground. See section Daemon Operation Mode.
Output diagnostics to stderr. See section --stderr.
After successful startup, output any diagnostic to syslog. This is the default.
Preprocess configuration file and exit. See section Using Preprocessor to Improve the Configuration.
Use prog as a preprocessor for configuration file. The default
preprocessor command line is m4 -s, unless overridden while
configuring the package (see section Default Preprocessor).
See section Using Preprocessor to Improve the Configuration.
Do not use external preprocessor. See section Using Preprocessor to Improve the Configuration.
Add the directory dir to the list of directories to be searched for preprocessor include files. See section Using Preprocessor to Improve the Configuration.
In daemon mode, process connections in the main process, without starting subprocesses for each connection (see section Daemon Operation Mode). This means that the daemon is able to serve only one client at a time. The ‘--single-process’ option is provided for debugging purposes only. Never use it in a production environment.
Enable session transcript. This instructs dicod to log
all commands it receives and all responses it sends during the
session. Transcript is logged via the default logging channel
(see section Logging and Debugging). If logging via syslog, the
‘debug’ priority is used.
See also Session Transcript, for a description of the similar
mode in dico, the client program.
Disable transcript mode. This is the default. Use this option if you wish to temporarily disable transcript mode, enabled in the configuration file (see section transcript).
Run in inetd mode. See section Inetd Operation Mode.
Set debug verbosity level. The level argument is an integer ranging from ‘0’ (no debugging) to ‘100’ (maximum debugging information).
Include source line information in the debugging output.
Trace parsing of the config file. The option is provided for debugging purposes.
Trace config file lexer. The option is provided for debugging purposes.
Show configuration file summary. See section Configuration.
Check configuration file syntax and exit with code ‘0’ if it is OK, or with ‘78’ if there are errors. See section Configuration.
Display a short command line option summary and exit.
List all available command line options and exit.
Print program version and exit.
Verbatim copying and distribution of this entire article is permitted in any medium, provided this notice is preserved.