TEXTMAIL(1)          User Contributed Perl Documentation          TEXTMAIL(1)
textmail - mail filter to replace MS Word/HTML attachments with plain text
usage: textmail [options]
options:
-h - Print the help message then exit
-m - Print the manpage then exit
-w - Print the manpage in html format then exit
-r - Print the manpage in nroff format then exit
-M - Output in mailbox format (mboxrd)
-T - Output in raw mail format (for smtp)
-W - Don't replace MS Word attachments with text
-E - Don't replace MS Excel attachments with csv
-H - Don't replace HTML attachments with text
-R - Don't replace RTF attachments with text
-P - Don't replace PDF attachments with text
-U - Don't translate winmail.dat attachments
-L - Don't reduce appledouble attachments
-I - Don't delete image attachments
-A - Don't delete audio attachments
-V - Don't delete video attachments
-X - Don't delete MS Windows executable attachments
-B - Don't recode text that was base64-encoded
-S - Don't replace spaces in filenames with underscores
-Z - Do translate signed content (discards signatures)
-O - Delete all application/octet-stream attachments
-! - Delete all application/* attachments
-D hdrs - Delete headers (list of header prefixes and filenames)
-K types - Keep attachments (list of mimetypes and filenames)
-f - On translation error, keep translation, not original
-? - Print paths of helper applications then exit
textmail filters a mail message or mbox, replacing MS Word, MS Excel,
HTML, RTF and PDF attachments with the plain text contained therein. By
default, the following attachments are also deleted: image, audio, video and
MS Windows executables. MS "winmail.dat"
attachments are replaced by any attachments contained therein, which are then
replaced by text or deleted in the same fashion. Any of these actions can be
suppressed with the command line options. Mail headers can also be selectively
deleted.
This filtering is useful for increasing the accessibility of mail messages
(by reducing their dependence on proprietary file formats), for dramatically
reducing their size (and the time it takes to download and read them), and
for dramatically reducing the risk of mail-borne viruses. Its intended use
is as a preprocessor for mailing lists, which is friendlier than a strict
"No Attachments" policy.
- "-h"
- Print the help message then exit.
- "-m"
- Print the manpage then exit. This is equivalent to executing
"man textmail" but this works even when
the manpage isn't installed.
- "-w"
- Print the manpage in html format then exit. This lets you install the
manpage in html format with a command like:
mkdir -p /usr/local/share/doc/textmail/html &&
textmail -w > /usr/local/share/doc/textmail/html/textmail.1.html
- "-r"
- Print the manpage in nroff format then exit. This lets you install the
manpage with a command like:
textmail -r > /usr/local/share/man/man1/textmail.1
- "-M"
- This option causes the output to be in mboxrd format by adding a mailbox
"From" line at the top if there isn't
one already and by ensuring that there is a blank line at the bottom of the
output. It also performs mailbox quoting on any lines in the body that
look like mailbox "From" headers. Use
this when the output is to be stored directly in a mailbox file. It is not
necessary when textmail is being used as a mail filter by
procmail(1).
- "-T"
- This option causes the output to be in raw mail format by removing any
mailbox "From" line and by not
performing mailbox quoting. Use this when the output is to be sent
directly to an SMTP server. It is not necessary when textmail is
being used as a mail filter by
procmail(1).
- "-W"
- By default, textmail replaces MS Word attachments with inline plain
text attachments that contain just the plain text within the original
document. This option leaves MS Word attachments intact.
- "-E"
- By default, textmail replaces MS Excel attachments with CSV file
attachments that contain just the data within the original document. This
option leaves MS Excel attachments intact.
- "-H"
- By default, textmail replaces HTML attachments with inline plain
text attachments that contain just the text within the original document.
It also reduces text-versus-html alternative attachments to just the text
attachment. This option leaves HTML (and alternative) attachments
intact.
- "-R"
- By default, textmail replaces RTF attachments with inline plain
text attachments that contain just the plain text within the original
document. This option leaves RTF attachments intact.
- "-P"
- By default, textmail replaces PDF attachments with inline plain
text attachments that contain just the plain text within the original
document. This option leaves PDF attachments intact.
- "-U"
- By default, textmail replaces MS TNEF (i.e.
"winmail.dat") attachments with the
attachments contained therein which are then translated to text as normal.
This option leaves "winmail.dat"
attachments intact. This option, together with the
"-!" option, will cause winmail.dat
attachments to be deleted rather than translated.
- "-L"
- By default, textmail replaces
"multipart/appledouble" attachments with
just the data fork attachment contained therein which is then translated
to text as normal. This option leaves appledouble attachments intact.
However, the data fork attachment will still be translated as normal,
leaving a resource fork attachment that is probably inappropriate and
possibly broken. Therefore, this option should probably only be used in
conjunction with other options that suppress the translation of the data
fork attachment.
- "-I"
- By default, textmail deletes image attachments. This option leaves
image attachments intact.
- "-A"
- By default, textmail deletes audio attachments. This option leaves
audio attachments intact.
- "-V"
- By default, textmail deletes video attachments. This option leaves
video attachments intact.
- "-X"
- By default, textmail deletes attachments containing MS Windows
executables. That means
"application/octet-stream" attachments
with the following filename extensions:
"com",
"exe",
"pif",
"dll",
"ocx",
"scr",
"vbs" and
"js". This option leaves MS Windows
executable attachments intact. To delete
"zip" files as well, you could use
either the "-O" option or the
"-!" option.
- "-B"
- By default, when text is encountered that is
"base64"-encoded, textmail will
recode it as either "7bit" or
"quoted-printable", whichever is
appropriate. This option suppresses this recoding. Note that if the text
is large enough and contains a high enough proportion of non-ASCII
characters, it will remain
"base64"-encoded to minimise space.
- "-S"
- When translating attachments, textmail replaces bad filename
characters such as space characters with the underscore character. This
option causes underscore characters to subsequently be converted into
space characters. In other words, you can use this option to preserve
space characters in attachment filenames (other bad filename characters
will then be converted to spaces as well).
- "-Z"
- By default, textmail will not translate
"multipart/signed" attachments. This
option causes "multipart/signed"
attachments to be replaced by the signed attachment contained therein,
discarding the signature control data. The no-longer-signed data is then
translated to text as normal. Note that
"multipart/encrypted" attachments are
never translated.
- "-O"
- Delete all "application/octet-stream"
attachments, not just MS Windows executables. Note that this overrides
"-X" but
"-K" overrides this.
- "-!"
- Delete all "application/*" attachments.
Note that this overrides "-X" but
"-K" overrides this. Also note that
translated documents are no longer
"application/*" attachments so they
aren't deleted unless their translation is suppressed with the appropriate
command line option.
- "-D" hdrs
- Delete particular headers. The hdrs argument is a comma separated
list of header name prefixes and/or the names of files containing header
name prefixes (blank lines, whitespace and shell style comments are
ignored). For example, "textmail -DX-"
deletes all headers whose names begin with
"X-".
- "-K" types
- By default, textmail deletes several types of non-text attachment.
The "-O" and
"-!" options delete even more. This
option specifies, by mimetype and/or filename extension, a list of
attachments not to delete. This overrides all deletions.
The types argument is a comma separated list of
mimetypes and/or filename extensions and/or the names of files
containing mimetypes and/or filename extensions (blank lines, whitespace
and shell style comments are ignored). An element that contains a slash
character is interpreted as a complete mimetype. An element without a
slash is interpreted either as a subtype (the "*" in
"application/*") or as a filename
extension. For example,
"textmail -Wf!Kdoc" deletes all
"application/*" attachments except MS
Word documents.
- "-f"
- By default, whenever textmail is unable to translate an attachment into text,
it will leave the attachment intact. This happens when the requisite
translation software can't be found, when it runs but returns an error
code, and when it produces an empty file. It also happens when
"winmail.dat" attachments are corrupt.
This option causes the empty translation to take the place of the original
attachment. Only the name of the attachment is preserved. This is needed
to ensure plain text even in the face of an MS Word document that contains
no text (e.g. only images).
- "-?"
- Print the paths of all helper applications then exit.
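The mailbox quoting that "-M" performs can be illustrated without textmail
itself. The following is a minimal sketch of mboxrd style quoting using
sed(1). The function name mboxrd_quote is hypothetical, and this is not
necessarily how textmail implements the quoting internally.

```shell
#!/bin/sh
# Hypothetical helper: apply mboxrd "From " quoting to a message body on stdin.
# In mboxrd, any body line consisting of zero or more ">" characters followed
# by "From " gets one extra ">" prepended, which keeps the quoting reversible.
mboxrd_quote() {
    sed 's/^\(>*From \)/>\1/'
}

# Only lines that begin with (optionally quoted) "From " are changed.
printf '%s\n' 'From the start' '>From quoted once' 'Not From here' | mboxrd_quote
```

The first two lines gain a leading ">" and the third is left alone, because
"From " does not appear at the start of that line.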
A procmail(1) recipe that insists on pure text and
no "X-" headers (with output in mailbox
format):
:0 fw
| textmail -Mf!DX-
Do the same but to an existing mailbox file:
textmail -Mf!DX- < mailbox > mailbox-as-text
Delete all "application/*"
attachments except for PostScript and PDF (and don't translate PDF into
text):
textmail -!PKps,pdf
Delete all "application/*"
attachments except for zip files and gzipped tar files:
textmail -!Ktar.gz,zip
A procmail(1) recipe that just unpacks
winmail.dat attachments but doesn't translate the attachments contained
therein into text and doesn't delete windows executables (with output in
mailbox format):
:0 fw
| textmail -MWEHRPLIAVXS
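The "-K" argument may also name a file containing the list of types to
keep. The following is a sketch only: the file names keep.types and
message.eml are hypothetical, and it assumes that textmail treats an
element as a file name when a file of that name exists, which this
manpage does not spell out.

```shell
#!/bin/sh
# Hypothetical keep list: one mimetype or filename extension per line.
# Blank lines and shell style comments are ignored by textmail.
cat > keep.types <<'EOF'
# documents worth keeping
application/postscript
pdf

# archives
zip
tar.gz
EOF

# Delete all application/* attachments except those listed in keep.types,
# and leave PDF attachments untranslated. The command is skipped when
# textmail is not installed or the (hypothetical) input file is absent.
if command -v textmail >/dev/null 2>&1 && [ -r message.eml ]; then
    textmail -!PKkeep.types < message.eml > message-filtered.eml
fi
```

Keeping the list in a file makes it easier to share one policy between
several procmail(1) recipes.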
MS Word and RTF documents are translated into plain text using
antiword(1) or
catdoc(1). If textmail can't find
antiword(1) or
catdoc(1), then MS Word and RTF attachments are
left intact. So make sure that antiword(1) or
catdoc(1) is installed and in the
$PATH.
MS Excel documents are translated into csv files using
xls2csv(1). If textmail can't find
xls2csv(1), then MS Excel attachments are left
intact. So make sure that xls2csv(1) is
installed and in the $PATH.
HTML documents are translated into plain text using
lynx(1). If textmail can't find
lynx(1), then HTML attachments are left intact.
So make sure that lynx(1) is installed and in
the $PATH.
PDF documents are translated into plain text using
pdftotext(1). If textmail can't find
pdftotext(1), then PDF attachments are left
intact. So make sure that pdftotext(1) is
installed and in the $PATH.
textmail also requires perl(1)
and pod2man(1) and
pod2html(1) (which come with
perl(1)) and
mktemp(1).
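Besides "textmail -?", which prints the paths of the helper applications,
the requirements above can be checked with a few lines of shell. This is
a minimal sketch; the function name missing_helpers is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print each named command that is not found in $PATH,
# one per line. Prints nothing when every command is available.
missing_helpers() {
    for helper in "$@"; do
        command -v "$helper" >/dev/null 2>&1 || printf '%s\n' "$helper"
    done
}

# The names are those listed above. antiword and catdoc are alternatives,
# so only one of the two needs to be present.
missing_helpers antiword catdoc xls2csv lynx pdftotext perl pod2man pod2html mktemp
```

Any name printed corresponds to a translation that textmail would skip,
leaving the attachment intact.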
If textmail fails to create a temporary directory, or if it
is instructed to do nothing (i.e.
"-WEHRPULIAVX"), then it degenerates into
cat(1).
The latest version of xls2csv(1) at the time of
writing (i.e. catdoc-0.93.3) loses data.
If textmail is unable to create a temporary directory (in
"/tmp"), then it degenerates into
cat(1). Without a temporary directory, no
attachments will be translated or deleted no matter what options (even
"-f") were given to textmail. So
make sure that "/tmp" is writable. Also
make sure that mktemp(1) is available; otherwise
an insecure temporary directory will be created.
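Both conditions can be tested before textmail is put into service. The
following is a minimal sketch; the function name check_tmp is
hypothetical, and the directory defaults to "/tmp" as described above.

```shell
#!/bin/sh
# Hypothetical helper: report problems that would prevent textmail from
# using a secure temporary directory. Returns non-zero if any are found.
check_tmp() {
    dir=${1:-/tmp}
    status=0
    if [ ! -d "$dir" ] || [ ! -w "$dir" ]; then
        echo "$dir is not a writable directory" >&2
        status=1
    fi
    if ! command -v mktemp >/dev/null 2>&1; then
        echo "mktemp not found in \$PATH" >&2
        status=1
    fi
    return "$status"
}

check_tmp /tmp || echo "textmail may not be able to filter mail safely" >&2
```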
procmail(1),
antiword(1),
catdoc(1), xls2csv(1),
lynx(1), pdftotext(1),
pod2man(1),
pod2html(1),
"http://raf.org/minimail/"
20070803 raf <raf@raf.org>
"http://raf.org/textmail/"