APERTIUM(1) FreeBSD General Commands Manual APERTIUM(1)

apertium - machine translation application platform

apertium [-au] [-d datadir] [-f format] language-pair [infile [outfile]]

apertium is the application that most people will be using as it simplifies the use of apertium/lt-toolbox tools for machine translation purposes.

This tool tries to ease the use of lt-toolbox (which contains all the lexical processing modules and tools) and apertium (which contains the rest of the engine) by providing a single front-end for the end user.

The modules behind the apertium machine translation architecture are, in order:

de-formatter
Separates the text to be translated from the format information.
morphological-analyser
Tokenizes the text into surface forms.
part-of-speech tagger
Chooses one surface form among homographs.
lexical transfer module
Reads each source-language lexical form and delivers a corresponding target-language lexical form.
structural transfer module
Detects fixed-length patterns of lexical forms (chunks or phrases) needing special processing due to grammatical divergences between the two languages and performs the corresponding transformations.
morphological generator
Delivers a target-language surface form for each target-language lexical form, by suitably inflecting it.
post-generator
Performs orthographical operations such as contractions and apostrophations.
re-formatter
Restores the format information encapsulated by the de-formatter into the translated text and removes the encapsulation sequences used to protect certain characters in the source text.
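
As a rough illustration of how these stages chain together, the sketch below prints (without running) a pipeline of the kind a language pair might define. Only lt-proc and apertium-tagger are named on this page; the other program names, the DATA directory, and every data file name are assumptions used for illustration. Real pipelines are defined per language pair (see modes.xml in the language data) and may differ. In this sketch, lexical and structural transfer are shown as one step.

```shell
#!/bin/sh
# Sketch only: print one possible stage-by-stage pipeline for a pair.
# DATA and all data file names are hypothetical placeholders.
PAIR="es-ca"
DATA="/path/to/apertium-es-ca"

print_pipeline() {
    # de-formatter: separate text from format information
    echo "apertium-destxt"
    # morphological analyser
    echo "lt-proc $DATA/$PAIR.automorf.bin"
    # part-of-speech tagger
    echo "apertium-tagger -g $DATA/$PAIR.prob"
    # lexical and structural transfer (one step in this sketch)
    echo "apertium-transfer $DATA/$PAIR.t1x $DATA/$PAIR.t1x.bin $DATA/$PAIR.autobil.bin"
    # morphological generator
    echo "lt-proc -g $DATA/$PAIR.autogen.bin"
    # post-generator
    echo "lt-proc -p $DATA/$PAIR.autopgen.bin"
    # re-formatter: restore the format information
    echo "apertium-retxt"
}

# Join the stages with pipes to see the full command line.
print_pipeline | paste -s -d '|' -
```

The apertium front-end exists so that end users do not have to assemble such a chain by hand.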

-d datadir
The directory holding the linguistic data. By default it will use the expected installation path.
language-pair
The language pair: LANG1-LANG2 (for instance “es-ca” or “ca-es”).
-f format
Specifies the format of the input and output files, which can have these values:
txt
(default value) Input and output files are in text format.
html
Input and output files are in “html” format. This “html” is the one accepted by the vast majority of web browsers.
html-noent
Input and output files are in “html” format, but preserving native encoding characters rather than using HTML text entities.
rtf
Input and output files are in “rtf” format. The accepted “rtf” is the one generated by Microsoft WordPad and Microsoft Office up to and including Office 97.
-u
Disable marking of unknown words with the ‘*’ character.
Enable header-detection (only used in some language pairs; will lead to stray ‘’ characters in pairs that don't support it).
-a
Enable marking of disambiguated words with the ‘=’ character.
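
A few invocations combining the options from the synopsis might look like the sketch below. The file names and the data directory are hypothetical, and the run helper is only a dry-run wrapper added for illustration: it prints each command unless DRY_RUN=0 is set, so the script is safe to try without an installed language pair.

```shell
#!/bin/sh
# Dry-run helper (not part of apertium): prints the command by default,
# executes it only when DRY_RUN=0 is set in the environment.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Translate an HTML file from Spanish to Catalan.
# page.html and page.ca.html are hypothetical file names.
run apertium -f html es-ca page.html page.ca.html

# Plain text (the default format), without '*' marks on unknown words.
run apertium -u es-ca notes.txt notes.ca.txt

# Read linguistic data from a non-default directory (hypothetical path).
run apertium -d /path/to/data es-ca notes.txt notes.ca.txt
```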

These are the two files that can be used with this command:
-m memory.tmx
Use a translation memory to recycle translations.
-o direction
Translation direction using the translation memory; by default “direction” is used instead.
-l
Lists the available translation directions and exits.
direction
Typically LANG1-LANG2, but see modes.xml in the language data.
infile
Input file (stdin by default).
outfile
Output file (stdout by default).
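
The optional infile and outfile arguments nest: an output file can only be given after an input file, and anything omitted falls back to stdin or stdout. A minimal sketch of a wrapper that follows this rule is shown below; the function name apertium_args and the file names in the usage lines are hypothetical, and the function only prints the command line it would use.

```shell
#!/bin/sh
# Build the command line for: apertium language-pair [infile [outfile]]
# An empty infile/outfile means "use stdin/stdout" (the documented defaults).
apertium_args() {
    pair=$1
    infile=${2:-}
    outfile=${3:-}
    set -- "$pair"
    if [ -n "$infile" ]; then
        set -- "$@" "$infile"
        # outfile is only meaningful when infile is present
        if [ -n "$outfile" ]; then
            set -- "$@" "$outfile"
        fi
    fi
    echo "apertium $*"
}

apertium_args es-ca                  # reads stdin, writes stdout
apertium_args es-ca in.txt           # reads in.txt, writes stdout
apertium_args es-ca in.txt out.txt   # reads in.txt, writes out.txt
```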

apertium-tagger(1), lt-comp(1), lt-expand(1), lt-proc(1)

Copyright © 2005, 2006 Universitat d'Alacant / Universidad de Alicante. This is free software. You may redistribute copies of it under the terms of the GNU General Public License.

Many... lurking in the dark and waiting for you!
March 8, 2006 Apertium
