DETOXRC(5)              FreeBSD File Formats Manual              DETOXRC(5)
detox allows for configuration of its sequences through config files. This
document describes how these files work.
When setting up a new set of rules, the safe and wipeup filters must always
be run after a translating filter (or series thereof), such as the utf_8 or
the uncgi filters. Otherwise, illegal characters may be introduced into the
filename.
The format of this configuration file is C-like and loosely based on named's
configuration files. Each statement is terminated by a semicolon, and
modifiers on a particular statement are generally contained within braces.
sequence "name" {...};
- Defines a sequence of filters to run a filename through. "name"
specifies how the user will refer to the particular sequence during
runtime. Quotes around the sequence name are generally optional, but
should be used if the sequence name does not start with a letter.
There is a special sequence, named "default", which is the default sequence
used by detox. This can be overridden through the command line option -s or
the environment variable DETOX_SEQUENCE.
Sequence names are case sensitive and unique throughout all sequences; that
is, if a system-wide file defines normal_seq and a user has a sequence with
the same name in their .detoxrc, the user's normal_seq will take precedence.
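The lookup rules above can be sketched in Python. This is an illustration, not detox's code: the function pick_sequence and its arguments are hypothetical, sequences are modelled as plain dictionaries, and the assumption that -s wins over DETOX_SEQUENCE when both are set is not stated above, which only says that either can override the default.

```python
def pick_sequence(system_seqs, user_seqs, cli_name=None, environ=None):
    """Return (name, filters) for the sequence that would be used.

    Hypothetical helper. Each argument maps a case-sensitive sequence
    name to a list of filter names.
    """
    environ = environ or {}
    # A user's definition replaces a system-wide one with the same name.
    merged = {**system_seqs, **user_seqs}
    # Assumption: -s beats DETOX_SEQUENCE, which beats "default".
    name = cli_name or environ.get("DETOX_SEQUENCE") or "default"
    return name, merged[name]


system = {"default": ["safe", "wipeup"], "normal_seq": ["safe"]}
user = {"normal_seq": ["utf_8", "safe", "wipeup"]}

print(pick_sequence(system, user))
print(pick_sequence(system, user, environ={"DETOX_SEQUENCE": "normal_seq"}))
```

Because dictionary keys are compared exactly, "Normal_Seq" and "normal_seq" would be two different sequences in this model, matching the case-sensitivity rule.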
iso8859_1 {filename "/path/to/filename";};
- This translates ISO 8859-1 (aka Latin-1) characters into lower ASCII
equivalents. The output is not necessarily safe, and should also be run
through the
safe filter.
Under normal circumstances, the filename syntax is not needed.
Detox looks in several locations for a file called
iso8859_1.tbl, which is a set of rules defining
how an ISO 8859-1 character should be translated.
In the event this table doesn't exist, you have two options. You can download
or create your own and tell detox where it is using the filename syntax shown
above, or you can let detox fall back on its internal tables. The internal
tables produce the same translations as the stock translation tables.
You can chain together multiple iso8859_1 translations, as
long as the default value of all but the last one is set to nothing.
This is explained in
detox.tbl(5).
This filter is mutually exclusive with the
utf_8 filter.
utf_8 {filename "/path/to/filename";};
- This translates Unicode characters, encoded as UTF-8, into safe
equivalents.
This operates in a manner similar to iso8859_1, except that it looks for a
translation table called unicode.tbl.
The default internal translation for Unicode characters only covers the
first 256 code points of Unicode, that is, the Basic Latin and Latin-1
Supplement blocks.
uncgi;
- This translates CGI-escaped strings into their ASCII equivalents. The
output is not necessarily safe, and could contain ISO 8859-1 characters or
potentially UTF-8 characters.
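To show why the output of CGI unescaping still needs the later filters, the following Python sketch decodes %XX escapes with the standard library's urllib.parse.unquote. It is a stand-in for the uncgi filter, not detox's implementation; the function name uncgi_sketch and the choice of Latin-1 as the default decoding are assumptions made for the example.

```python
from urllib.parse import unquote


def uncgi_sketch(name: str, encoding: str = "latin-1") -> str:
    """Decode %XX escapes in a filename (illustrative stand-in only)."""
    return unquote(name, encoding=encoding)


# The decoded names contain a space, an ampersand and a Latin-1 letter,
# none of which are safe yet.
print(uncgi_sketch("my%20file%26name.txt"))  # my file&name.txt
print(uncgi_sketch("caf%E9.txt"))            # café.txt
```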
safe {filename "/path/to/filename";};
- This could also be called "safe for UNIX-like operating
systems". It translates characters that are difficult to work with in
UNIX environments into characters that are not.
In earlier versions this filter was entirely internal. Starting with 1.2.0,
it is controlled by a translation table. In the absence of the translation
table, the previous internal code is used for the translation. Also, prior
to 1.2.0, the safe filter removed leading dashes to avoid the hassle of
dealing with a filename in the format -filename. This is now handled
exclusively by the wipeup filter.
See the SAFE section for more
details on what this filter translates by default.
wipeup {remove_trailing;};
- This wipes up any excessive characters. For instance, multiple underscores
or dashes will be converted into a single underscore or dash. Any series
of dashes and underscores (e.g. "_-_") will be converted into a single dash.
The remove_trailing option removes a dash or underscore that immediately
precedes or follows a period.
See the WIPEUP section for
more details on what this filter translates.
max_length {length value;};
- This trims a filename down to the length specified (or less). It is aware
of extensions and attempts to preserve anything following the last period
in a filename.
For instance, given a max length of 12, and a filename of
"this_is_my_file.txt", the filter would output
"this_is_.txt".
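The trimming rule can be sketched in a few lines of Python. The function max_length_sketch is hypothetical and only reproduces the documented example; how detox treats names without a period, or extensions longer than the limit, is not described here, so the plain truncation used for those cases is an assumption.

```python
def max_length_sketch(name: str, length: int) -> str:
    """Trim name to at most `length` characters, keeping the extension."""
    if len(name) <= length:
        return name
    dot = name.rfind(".")
    # Assumption: with no usable extension, or one that would not fit,
    # fall back to plain truncation.
    if dot <= 0 or len(name) - dot >= length:
        return name[:length]
    ext = name[dot:]
    # Keep the extension and cut the base name to fill what is left.
    return name[: length - len(ext)] + ext


print(max_length_sketch("this_is_my_file.txt", 12))  # this_is_.txt
```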
lower;
- This translates uppercase characters into lowercase characters.
- Anything after a # on a line is treated as a comment and ignored.
sequence default {
    uncgi;
    iso8859_1 {
        filename "iso8859_1.tbl";
    };
#   utf_8 {
#       filename "unicode.tbl";
#   };
    safe {
        filename "safe.tbl";
    };
    wipeup {
        remove_trailing;
    };
#   max_length {
#       length 128;
#   };
};
SAFE
The following characters are translated by the stock
safe filter. They can be tuned by updating safe.tbl or
creating a copy of safe.tbl and updating your rc file.
Safe     Original
_and_    &
_        space ` ! @ $ * \ | : ; " ' < > ? /
-        ( ) [ ] { }
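As an illustration of the table above, this Python sketch applies the same three mappings character by character. SAFE_TABLE and safe_sketch are hypothetical names; the real filter reads its rules from safe.tbl or falls back on internal code, and may handle characters that are not listed here.

```python
# Mappings copied from the table: '&' -> "_and_", a set of awkward
# characters -> '_', and brackets -> '-'.
SAFE_TABLE = {"&": "_and_"}
SAFE_TABLE.update({ch: "_" for ch in " `!@$*\\|:;\"'<>?/"})
SAFE_TABLE.update({ch: "-" for ch in "()[]{}"})


def safe_sketch(name: str) -> str:
    """Replace each listed character; leave everything else untouched."""
    return "".join(SAFE_TABLE.get(ch, ch) for ch in name)


print(safe_sketch("Tom & Jerry (1940).avi"))  # Tom__and__Jerry_-1940-.avi
```

The doubled underscores and the "_-" and "-." pairs in the result are the kind of leftovers that the wipeup filter is meant to tidy.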
WIPEUP
The following characters are translated by the wipeup filter.
Wipeup   Original
-        -_
-        _-
-        --
_        __
Any leading dashes are stripped to prevent programs from interpreting these
files as command line options.
Wipeup   Original
removed  - _ #
When remove_trailing is set, a dash or underscore next to a period is
removed.
Wipeup   Original
.        .-
.        -.
.        ._
.        _.
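The wipeup tables can be approximated with two regular expressions in Python. This is a sketch under stated assumptions, not detox's algorithm: wipeup_sketch is a hypothetical name, the order of the three steps is a guess, and the handling of repeated periods is left out because it is not covered by the tables.

```python
import re


def wipeup_sketch(name: str, remove_trailing: bool = False) -> str:
    """Approximate the wipeup tables (illustrative only)."""
    # Leading '-', '_' and '#' are removed.
    name = name.lstrip("-_#")
    # A run of underscores becomes '_'; any run containing a dash
    # becomes '-'.
    name = re.sub(
        r"[-_]{2,}",
        lambda m: "-" if "-" in m.group() else "_",
        name,
    )
    if remove_trailing:
        # Dashes and underscores touching a period are dropped.
        name = re.sub(r"[-_]*\.[-_]*", ".", name)
    return name


print(wipeup_sketch("Tom__and__Jerry_-1940-.avi"))
# Tom_and_Jerry-1940-.avi
print(wipeup_sketch("Tom__and__Jerry_-1940-.avi", remove_trailing=True))
# Tom_and_Jerry-1940.avi
```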
detox was written by Doug Harple.