DICOD.CONF(5) GNU Dico Reference DICOD.CONF(5)
dicod.conf - GNU dictionary server configuration file.
The file /etc/dicod.conf contains configuration settings and database
definitions for the GNU dictionary server dicod(8). The server reads
this file once, upon startup, and uses the settings until it is shut down or
the HUP signal is delivered, in which case previous configuration settings are
discarded and the file is re-read.
This manpage is a short description of the dicod.conf configuration file.
For a detailed discussion, including examples and usage recommendations, refer
to the GNU Dico Manual available in texinfo format. If the info
reader and GNU Dico documentation are properly installed on your
system, the command
info dico
should give you access to the complete manual.
You can also view the manual using the info mode in
emacs(1), or find it in various formats online at
http://www.gnu.org.ua/software/dico/manual
If any discrepancies occur between this manpage and the GNU
Dico Manual, the latter shall be considered the authoritative source.
There are three classes of lexical tokens: words, quoted strings, and
separators. Blanks, tabs, newlines and comments, collectively called white
space, are ignored except as they serve to separate tokens. Some white
space is required to separate otherwise adjacent keywords and values.
A word is a sequence of letters, digits, and any of the following
characters: _, -, ., /, @, *,
:, [, ].
A quoted string is any sequence of characters enclosed in double-quotes
("). A backslash appearing within a quoted string introduces an
escape sequence, which is replaced with a single character according to
the following rules:
Sequence   Expansion               ASCII (octal)
\\         \                       134
\"         "                       042
\a         audible bell            007
\b         backspace               010
\f         form-feed               014
\n         new line                012
\r         carriage return         015
\t         horizontal tabulation   011
\v         vertical tabulation     013
In addition, the sequence \newline is removed from
the string. This allows long strings to be split over several physical lines,
e.g.:
"a long string may be\
split over several lines"
If the character following a backslash is not one of those
specified above, the backslash is ignored and a warning is issued.
Two or more adjacent quoted strings are concatenated, which gives
another way to split long strings over several lines to improve readability.
The following fragment produces the same result as the example above:
"a long string may be"
" split over several lines"
A here-document is a special construct that allows introducing
strings of text containing embedded newlines.
The <<word construct instructs the parser to
read all the following lines up to the line containing only word,
with possible trailing blanks. Any lines thus read are concatenated together
into a single string. For example:
<<EOT
A multiline
string
EOT
The body of a here-document is interpreted the same way as a
double-quoted string, unless word is preceded by a backslash (e.g.
<<\EOT) or enclosed in double-quotes, in which case the text is
read as is, without interpretation of escape sequences.
If word is prefixed with - (a dash), then all
leading tab characters are stripped from input lines and the line containing
word. Furthermore, if - is followed by a single space, all
leading whitespace is stripped from them. This allows here-documents to be
indented in a natural fashion. For example:
<<- TEXT
The leading whitespace will be
ignored when reading these lines.
TEXT
It is important that the terminating delimiter be the only token
on its line. The only exception to this rule is allowed if a here-document
appears as the last element of a statement. In this case a semicolon can be
placed on the same line with its terminating delimiter, as in:
help-text <<-EOT
A sample help text.
EOT;
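As a further illustration, the following sketch uses a quoted delimiter, so
that the body is read as is. The server-info statement is described later
in this manpage; the text itself is invented for the example:

# With a quoted delimiter, the backslash below is not treated as an escape.
server-info <<"EOT"
Sequences such as \n are kept verbatim in this text.
EOT;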
The usual comment styles are supported:
C style: /* */
C++ style: // to end of line
Unix style: # to end of line
Pragmatic comments are similar to the usual single-line
comments, except that they cause some changes in the way the configuration
is parsed. Pragmatic comments begin with a # sign and end with the
next physical newline character.
- #include <FILE>
- #include FILE
- Include the contents of the file FILE. Both forms are equivalent.
The FILE must be an absolute file name.
- #include_once <FILE>
- #include_once FILE
- Same as #include, except that, if the FILE has already been
included, it will not be included again.
- #line num
- #line num "FILE"
- This line causes the parser to believe, for purposes of error diagnostics,
that the line number of the next source line is given by num and
the current input file is named by FILE. If the latter is absent,
the remembered file name does not change.
- # num "FILE"
- This is a special form of the #line statement, understood for
compatibility with the C preprocessor.
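A possible use of the include pragmas is sketched below. The file names are
hypothetical; only the requirement that they be absolute comes from the
description above:

// Hypothetical file names, shown for illustration only.
#include </etc/dico/databases.conf>
#include_once /etc/dico/acl.conf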
A simple statement consists of a keyword and value separated by any
amount of whitespace. Some statements take more than one value. A simple
statement is terminated with a semicolon (;).
The following is a simple statement:
pidfile /var/run/dicod.pid;
See below for a list of valid simple statements.
A value can be one of the following:
- number
- A number is a sequence of decimal digits.
- boolean
- A boolean value is one of the following: yes, true, t
or 1, meaning true, and no, false, nil, or
0, meaning false.
- word
- quoted string
- list
- A comma-separated list of values, enclosed in parentheses.
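The value types listed above might appear in statements such as the
following. All keywords used here are described later in this manpage; the
particular values are arbitrary examples:

max-children 16;            # number
timing yes;                 # boolean
log-facility local1;        # word
log-tag "dico server";      # quoted string
capability (auth, mime);    # list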
A block statement introduces a logical group of statements. It consists of a
keyword, followed by an optional value, called a tag, and a sequence of
statements enclosed in curly braces, as shown in the example below:
acl global {
allow all from 198.51.100.0/24;
deny all;
}
The closing curly brace may be followed by a semicolon, although this is not
required.
- user NAME
- Run with the privileges of this user. The argument is either a user name,
or UID prefixed with a plus sign.
- group LIST
- If the user statement is present, dicod will drop all
supplementary groups and switch to the principal group of that user.
Sometimes, however, it may be necessary to retain one or more
supplementary groups. For example, this might be necessary to access
dictionary databases. The group statement retains the supplementary
groups listed in LIST. Each group can be specified either by its
name or by its GID number, prefixed with +, e.g.:
user nobody;
group (man, dict, +88);
This statement is ignored if no user statement is present or if
dicod is running in inetd mode.
- mode daemon|inetd
- Sets server operation mode.
- listen LIST
- Specify the IP addresses and ports to listen on in daemon mode. By
default, dicod will listen on port 2628 on all existing interfaces.
Elements of LIST can have the following forms:
- HOST:PORT
- Specifies an IP (version 4 or 6) socket to listen on. The HOST part
is either an IPv4 address in dotted-quad notation, or an IPv6 address in
square brackets, or a host name. In the latter case, dicod will
listen on all IP addresses corresponding to its A and AAAA
DNS records.
The PORT part is either a numeric port number or a
symbolic service name from the /etc/services file.
Either of the two parts may be omitted. If HOST is
omitted, the server will listen on all interfaces. If PORT is
omitted, the default port 2628 will be used.
- inet://HOST:PORT,
inet4://HOST:PORT
- Listen on IPv4 socket. HOST is either an IP address or a host name.
In the latter case, dicod will start listening on all IP addresses
from the A records for this host.
Either HOST or PORT (but not both) can be
omitted. Missing HOST defaults to IPv4 addresses on all available
network interfaces, and missing PORT defaults to 2628.
- inet6://HOST:PORT
- Listen on IPv6 socket. HOST is either an IPv6 address in square
brackets, or a host name. In the latter case, dicod will start
listening on all IP addresses from the AAAA records for this host.
Either HOST or PORT (but not both) can be
omitted. Missing HOST defaults to IPv6 addresses on all available
network interfaces, and missing PORT defaults to 2628.
- FILENAME, unix://FILENAME
- Specifies the name of a UNIX socket to listen on. FILENAME must be
an absolute file name of the socket.
- pidfile STRING
- Store PID of the master process in this file. Default is
/var/run/dicod.pid.
- max-children NUMBER
- Sets maximum number of subprocesses that can run simultaneously. This is
equivalent to the number of clients that can simultaneously use the
server. The default is 64.
- inactivity-timeout NUMBER
- Sets inactivity timeout to the NUMBER of seconds. The server
disconnects automatically if the remote client has not sent any command
within this number of seconds. Setting timeout to 0 disables inactivity
timeout (the default).
This statement along with max-children allows you to
control the server load.
- shutdown-timeout NUMBER
- When the master server is shutting down, wait this number of seconds for
all children to terminate. Default is 5 seconds.
- identity-check BOOLEAN
- Enable identification check using AUTH protocol (RFC 1413). The received
user name or UID can be shown in access log using the %l conversion
(see below).
- ident-keyfile STRING
- Use encryption keys from the named file to decrypt AUTH replies encrypted
using DES.
- ident-timeout NUMBER
- Set timeout for AUTH input/output operation to NUMBER of seconds. Default
timeout is 3 seconds.
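Taken together, the statements above could be combined roughly as follows.
This is only a sketch: the addresses, the socket name, the group dict and
all numeric values are assumptions chosen for illustration:

# Illustrative values only; adjust to the local setup.
user nobody;
group (dict);
mode daemon;
listen (198.51.100.1:2628, inet6://[2001:db8::1]:2628, /var/run/dicod.sock);
pidfile /var/run/dicod.pid;
max-children 32;
inactivity-timeout 300;
shutdown-timeout 5;
identity-check yes;
ident-timeout 3;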
The authentication database is defined as:
user-db URL {
# Additional configuration options.
options STRING;
# Name of the password resource.
password-resource RESOURCE;
# Name of the resource returning user group information.
group-resource RESOURCE;
}
The URL consists of the following parts (square brackets
denoting optional ones):
TYPE://[[USER[:PASSWORD]@]HOST]/PATH[PARAMS]
where:
- TYPE
- Database type. Two types are supported: text and ldap.
- USER
- User name, if necessary to access the database.
- PASSWORD
- User password, if necessary to access the database.
- HOST
- Domain name or IP address of a machine running the database.
- PATH
- A path to the database. The exact meaning of this element depends on the
database protocol. See the texinfo documentation.
- PARAMS
- A list of protocol-dependent parameters. Each parameter is of the form
KEYWORD=NAME; multiple parameters are separated with
semicolons.
The following statements can appear within the user-db
block:
- options STRING
- Pass additional options to the underlying mechanism. The argument is
treated as an opaque string and passed to the authentication open
procedure verbatim. Its exact meaning depends on the type of the
database.
- password-resource ARG
- A database resource which returns the user's password.
- group-resource ARG
- A database resource which returns the list of groups this user is member
of.
The exact semantics of the database resource depend on the
type of database being used. For flat text databases, it is the name of a
text file that contains these data; for LDAP databases, the resource is the
filter string, etc. Please refer to the GNU Dico Manual, subsection
4.3.3 Authentication for a detailed discussion.
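A flat-text authentication database might be declared along these lines. The
directory and the resource file names are assumptions made for the example;
consult the GNU Dico Manual for the exact meaning of each part:

# Hypothetical path and resource names for a text database.
user-db text:///var/db/dico/udb {
    password-resource passwd;
    group-resource group;
}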
The SASL authentication is available if the server was compiled with GNU SASL.
It is configured using the following statement:
sasl {
# Disable SASL mechanisms listed in MECH.
disable-mechanism MECH;
# Enable SASL mechanisms listed in MECH.
enable-mechanism MECH;
# Set service name for GSSAPI and Kerberos.
service NAME;
# Set realm name for GSSAPI and Kerberos.
realm NAME;
# Define groups for anonymous users.
anon-group GROUPS;
}
- disable-mechanism MECH
- Disable SASL mechanisms listed in MECH, which is a list of
names.
- enable-mechanism MECH
- Enable SASL mechanisms listed in MECH, which is a list of
names.
- service NAME
- Sets the service name for GSSAPI and Kerberos mechanisms.
- realm NAME
- Sets the realm name.
- anon-group LIST
- Declares the list of user groups considered anonymous.
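A sasl block could look like the sketch below. The mechanism names are
assumed to be among those offered by the local GNU SASL build, and the
service, realm and group names are invented for the example:

sasl {
    # PLAIN and LOGIN are example mechanism names.
    disable-mechanism (PLAIN, LOGIN);
    service dico;
    realm "example.org";
    # "guest" is a hypothetical group name.
    anon-group (guest);
}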
Define an ACL:
acl NAME {
DEFINITION...
}
The parameter NAME assigns a unique name to that ACL. This
name will be used by other configuration statements to refer to that ACL
(see SECURITY SETTINGS, and Database Visibility).
Each DEFINITION is:
allow|deny [all|authenticated|group GROUPLIST] [acl NAME] [from ADDRLIST]
A definition starting with allow allows access to the
resource, and the one starting with deny denies it.
The next part controls what users have access to the resource:
- all
- All users (the default).
- authenticated
- Only authenticated users.
- group GROUPLIST
- Authenticated users which are members of at least one of the groups listed
in GROUPLIST.
The acl part refers to an already defined ACL.
The from keyword declares that the client IP must be within
the ADDRLIST in order for the definition to apply. Elements of
ADDRLIST are:
- any
- Matches any client address.
- IP address
- Matches if the request comes from the given IP (both IPv4 and IPv6 are
allowed).
- ADDR/NETLEN
- Matches if the first NETLEN bits of the client IP address equal
ADDR. The network mask length, NETLEN, must be an integer number
between 0 and 32 for IPv4, and between 0 and 128 for IPv6. The address
part, ADDR, is as described above.
- ADDR/NETMASK
- The specifier matches if the result of logical AND between the client IP
address and NETMASK equals ADDR. The network mask must be
specified in an IP address (either IPv4 or IPv6) notation.
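The forms of DEFINITION described above can be combined in a single ACL, as
in the following sketch. The ACL name, the group editors and all addresses
are hypothetical, and writing ADDRLIST as a parenthesized list is assumed to
follow the general list syntax:

acl editors-only {
    # Members of the hypothetical group "editors", from two networks.
    allow group (editors) from (198.51.100.0/24, 2001:db8::/32);
    # Any authenticated user connecting from a single address.
    allow authenticated from 203.0.113.7;
    # The ADDR/NETMASK form.
    allow all from 192.0.2.0/255.255.255.0;
    deny all;
}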
- connection-acl NAME
- Use ACL NAME to control incoming connections. The ACL itself must
be defined before this statement. Using the group clause in this
ACL makes no sense, because the authentication itself is performed only
after the connection has been established.
- show-sys-info NAME
- Controls whether to show system information in reply to SHOW
SERVER command. The information will be shown only if ACL
NAME allows it.
- visibility-acl NAME
- Sets name of the ACL that controls visibility of all databases.
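For instance, a single ACL might be reused by all three statements. The ACL
name mynet is an arbitrary choice; note that the ACL is defined before the
statements that refer to it:

acl mynet {
    allow all from 198.51.100.0/24;
    deny all;
}
connection-acl mynet;
show-sys-info mynet;
visibility-acl mynet;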
- log-tag STRING
- Prefix syslog messages with this string. By default, the program name is
used.
- log-facility STRING
- Sets the syslog facility to use. Allowed values are: user,
daemon, auth, authpriv, mail, cron,
local0 through local7 (case-insensitive), or a decimal
facility number.
- log-print-severity BOOLEAN
- Prefix diagnostic messages with a string identifying their severity.
- transcript BOOLEAN
- Controls the transcript of user sessions.
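A typical logging setup might resemble the following sketch; the tag and the
chosen facility are arbitrary examples:

# Example values only.
log-tag "dicod-main";
log-facility daemon;
log-print-severity yes;
transcript no;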
GNU Dico provides a feature similar to Apache's CustomLog, which keeps a
log of MATCH and DEFINE requests.
- access-log-file STRING
- Sets access log file name.
- access-log-format STRING
- Defines the format string. Its argument can contain literal characters,
which are copied into the log file verbatim, and format
specifiers, i.e. special sequences beginning with %, which
are replaced in the log file as shown in the table below:
- %%
- The percent sign.
- %a
- Remote IP address.
- %A
- Local IP address.
- %B
- Size of response in bytes.
- %b
- Size of response in bytes in CLF format, i.e. a dash rather than a
0 when no bytes are sent.
- %C
- Remote client (from the CLIENT command).
- %D
- The time taken to serve the request, in microseconds.
- %d
- Request command verb in abbreviated form, suitable for use in URLs, i.e.
d for DEFINE, and m for MATCH.
- %h
- Remote host.
- %H
- Request command verb (DEFINE or MATCH).
- %l
- Remote logname (from identd(1), if supplied). This will return a
dash unless identity-check statement is set to true.
- %m
- The search strategy.
- %p
- The canonical port of the server serving the request.
- %P
- The PID of the child that served the request.
- %q
- The database from the request.
- %r
- Full request.
- %{N}R
- The Nth token from the request (N is 0-based).
- %s
- Reply status. For multiple replies, the form %s returns the status
of the first reply, while %>s returns that of the last
reply.
- %t
- Time the request was received in the standard Apache format, e.g.:
[04/Jun/2008:11:05:22 +0300]
- %{FORMAT}t
- The time, in the form given by FORMAT, which should be a valid
strftime(3) format string. The standard %t format is
equivalent to
[%d/%b/%Y:%H:%M:%S %z]
- %T
- The time taken to serve the request, in seconds.
- %u
- Remote user from AUTH command.
- %v
- The host name of the server serving the request.
- %V
- Actual host name of the server (in case it was overridden in
configuration).
- %W
- The word from the request.
The absence of access-log-format statement is equivalent to
the following:
access-log-format "%h %l %u %t \"%r\" %>s %b";
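A custom format can be built from the specifiers listed above. The sketch
below assumes a hypothetical log file location and records the command verb,
database, strategy, search word, final status, reply size and serving time:

# The file name is an assumption; any writable location may be used.
access-log-file "/var/log/dico/access.log";
access-log-format "%h %u %{%Y-%m-%d %H:%M:%S}t %H %q %m \"%W\" %>s %b %D";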
- initial-banner-text TEXT
- Display TEXT in the textual part of the initial server reply.
- hostname STRING
- Sets the hostname. By default it is determined automatically.
The server hostname is used, among others, in the initial
reply after the 220 code and may also be displayed in the access log
file using the %v escape (see ACCESS LOG).
- server-info TEXT
- Sets the server description to be shown in reply to the SHOW
SERVER command.
It is common for TEXT to use the here-document
syntax, e.g.:
server-info <<EOT
Welcome to the FOO dictionary service.
Contact <dict@foo.example.org> if you have questions or
suggestions.
EOT;
- help-text TEXT
- Sets the text to be displayed in reply to the HELP command.
The default reply displays a list of commands understood by
the server with a short description of each.
If TEXT begins with a plus sign, it will be appended to
the default reply.
- default-strategy NAME
- Sets the name of the default matching strategy. By
default, Levenshtein matching is used, which is equivalent to
default-strategy lev;
- capability LIST
- Requests additional capabilities from the LIST.
Capabilities are certain server features that can be enabled or
disabled at the system administrator's will. The following capabilities are
defined:
- auth
- The AUTH command is supported. See the section
AUTHENTICATION, for its configuration.
- mime
- The OPTION MIME command is supported. Notice that RFC 2229
requires all servers to support that command, so you should always specify
this capability.
- xversion
- The XVERSION command is supported. It is a GNU extension that
displays the dicod implementation and version number.
- xlev
- The XLEV command is supported. This command allows the remote party
to set and query maximal Levenshtein distance for the lev matching
strategy.
The capabilities set using this directive are displayed in the
initial server reply, and their descriptions are added to the HELP
command output (unless specified otherwise by the help-text
statement).
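The general settings described in this section might be combined as shown
below. The host name and the banner text are invented for the example:

# Hypothetical host name and banner.
hostname "dict.example.org";
initial-banner-text "example dictionary service";
default-strategy lev;
capability (auth, mime, xversion, xlev);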
A database module is an external piece of software designed to handle a
particular format of dictionary databases. This piece of software is built as
a shared library that dicod loads at run time.
A handler is an instance of the database module loaded by
dicod and configured for a specific database or a set of
databases.
Database handlers are defined using the following block
statement:
load-module NAME {
command CMD;
}
The load-module statement creates an instance of a database
module. The NAME argument specifies a unique name which will be used
by subsequent parts of the configuration to refer to this handler. The
command line for this handler is supplied with the command statement.
It must begin with the name of the module (without the library suffix) and
can contain any additional arguments. If the module name is not an absolute
file name, the module will be searched in the module load path.
For example:
load-module dict {
command "dictorg dbdir=/var/dicodb";
}
A simplified form of this statement:
load-module NAME;
is equivalent to:
load-module NAME {
command NAME;
}
A module load path is an internal list of directories which
dicod scans in order to find a loadable file name specified in the
command statement. By default the search order is as follows:
- 1.
- Optional prefix search directories specified in the
prepend-load-path statement (see below);
- 2.
- GNU Dico module directory /usr/local/lib/dico;
- 3.
- Additional search directories specified in the module-load-path
statement (see below);
- 4.
- The value of the environment variable LTDL_LIBRARY_PATH;
- 5.
- The system dependent library search path (e.g. on GNU/Linux it is defined
by the file /etc/ld.so.conf and the environment variable
LD_LIBRARY_PATH).
The value of LTDL_LIBRARY_PATH and LD_LIBRARY_PATH
must be a colon-separated list of absolute directory names.
In each of these directories, dicod first attempts to find
and load the given filename. If this fails, it tries to append the following
suffixes to it:
- 1.
- the libtool archive suffix .la;
- 2.
- the suffix used for native dynamic libraries on the host platform, e.g.,
.so, .sl, etc.
- module-load-path LIST
- Add directories from LIST to the end of the module load path.
- prepend-load-path LIST
- Add directories from LIST to the beginning of the module load
path.
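As an illustration, the load path can be extended at both ends before a
module is loaded. The directories named here are hypothetical; the
load-module block repeats the example given earlier:

# Hypothetical directories, searched before and after the default one.
prepend-load-path (/opt/dico/modules);
module-load-path (/usr/lib/dico, /usr/local/share/dico/modules);
load-module dict {
    command "dictorg dbdir=/var/dicodb";
}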
database {
name WORD;
description STRING;
info TEXT;
languages-from LANGLIST;
languages-to LANGLIST;
handler NAME;
visibility-acl NAME;
mime-headers TEXT;
}
- name STRING
- Sets the name of this database (a single word). This name will be used to
identify this database in DICT commands.
- handler STRING
- Specifies the handler name for this database and optional arguments for
it. This handler must be previously defined using the load-module
statement (see above).
- description STRING
- Supplies a short description, to be shown in reply to the SHOW DB
command. The STRING may not contain newlines.
- info STRING
- Defines a full description of the database. This description is shown in
reply to the SHOW INFO command. It is usually a multi-line text, so
it is common to use here-document syntax.
- content-type STRING
- Sets the content type of the reply (for use in MIME headers).
- content-transfer-encoding VALUE
- Sets transfer encoding to use when sending MIME replies for this database.
VALUE is one of: base64, quoted-printable.
- visibility-acl NAME
- Sets the name of the ACL that controls visibility of this database.
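A complete database definition might look like the sketch below. It reuses
the handler dict from the load-module example above. The database name, the
descriptions and the database=example argument are assumptions, since the
arguments accepted by a handler depend on the module:

load-module dict {
    command "dictorg dbdir=/var/dicodb";
}
database {
    name "example";
    # The handler argument is module-specific and assumed here.
    handler "dict database=example";
    description "Example dictionary";
    info <<- EOT
        A hypothetical dictionary used to
        illustrate the database statement.
        EOT;
}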
A default search is a MATCH request with * or ! as
the database argument. The former means search in all available databases, and
the latter means search in all databases until a match is found.
Default searches can be quite expensive and can cause
considerable strain on the server. For example, the command MATCH *
prefix "" returns all entries from all available databases,
which would consume a lot of resources both on the server and on the client
side.
To minimize harmful effects from such potentially dangerous
requests, the following statement makes it possible to limit the use of
certain strategies in default searches:
strategy NAME {
deny-all BOOL;
deny-word CONDLIST;
deny-length-lt NUMBER;
deny-length-le NUMBER;
deny-length-gt NUMBER;
deny-length-ge NUMBER;
deny-length-eq NUMBER;
deny-length-ne NUMBER;
}
- deny-all BOOL
- Unconditionally deny the use of this strategy in default searches.
- deny-word LIST
- Deny this strategy if the search word matches one of the words from
LIST.
- deny-length-lt NUMBER
- Deny if length of the search word is less than NUMBER.
- deny-length-le NUMBER
- Deny if length of the search word is less than or equal to
NUMBER.
- deny-length-gt NUMBER
- Deny if length of the search word is greater than NUMBER.
- deny-length-ge NUMBER
- Deny if length of the search word is greater than or equal to
NUMBER.
- deny-length-eq NUMBER
- Deny if length of the search word is equal to NUMBER.
- deny-length-ne NUMBER
- Deny if length of the search word is not equal to NUMBER.
For example, the following statement denies the use of
prefix strategy in default searches if its argument is an empty
string:
strategy prefix {
deny-length-eq 0;
}
While tuning your server, it is often necessary to get timing information which
shows how much time is spent serving certain requests. This can be achieved
using the following configuration directive:
- timing BOOLEAN
- Provide timing information after successful completion of an
operation.
This information is displayed after replies to the following
requests: MATCH, DEFINE, and QUIT. The format is:
[d/m/c = ND/NM/NC RTr UTu STs]
where:
- ND
- Number of processed define requests.
- NM
- Number of processed match requests.
- NC
- Number of comparisons made. This value may be inaccurate if the underlying
database module does not provide such information.
- RT
- Real time spent serving the request.
- UT
- Time in user space spent serving the request.
- ST
- Time in kernel space spent serving the request.
You can also add timing information to your access log files. See
the %T conversion in section ACCESS LOG.
Aliases allow a string to be substituted for a word when it is used as the first
word of a command. The daemon maintains a list of aliases that are created
using the alias configuration file statement:
- alias WORD COMMAND
- Creates a new alias.
Aliases may be recursive, i.e. the first word of COMMAND
may refer to another alias. To prevent endless loops, recursive expansion is
stopped if the first word of the replacement text is identical to an alias
expanded earlier.
Aliases are useful to facilitate manual interaction with the
server, as they allow the administrator to create abbreviations for some
frequently typed commands. For example, the following alias creates a new
command d which is equivalent to DEFINE *:
alias d DEFINE "*";
dicod(8), RFC 2229.
Complete GNU Dico manual: run info dico or use
emacs(1) info mode to read it.
Online copies of GNU Dico documentation in various formats
can be found at:
http://www.gnu.org.ua/software/dico/manual
Report bugs to <bug-dico@gnu.org.ua>.
Copyright © 2008-2018 Sergey Poznyakoff
License GPLv3+: GNU GPL version 3 or later
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it. There is NO
WARRANTY, to the extent permitted by law.