Locale::XGettext(3)    User Contributed Perl Documentation    Locale::XGettext(3)
Locale::XGettext - Extract Strings To PO Files
use base 'Locale::XGettext';
Locale::XGettext is the base class for various string extractors. These
string extractors can be used as standalone programs on the command line or as
modules inside other software.
See <https://github.com/gflohr/Locale-XGettext> for an
overall picture of the software.
This section describes the usage of extractors based on this library. See
"SUBCLASSING" and the sections following it for the API
documentation!
xgettext-LANG [OPTIONS] [INPUTFILE]...
LANG will be replaced by an identifier for the language
that a specific extractor was written for, for example
"xgettext-txt" for plain text files or "xgettext-tt2"
for templates for the Template Toolkit version 2 (see Template).
By default, string extractors based on this module extract strings
from one or more INPUTFILEs and write the output to a file
"messages.po" if any strings were found.
The command-line options are mostly compatible with xgettext from GNU
Gettext
<https://www.gnu.org/software/gettext/manual/html_node/xgettext-Invocation.html>.
- INPUTFILE...
- All non-option arguments are interpreted as input files containing strings
to be extracted. If the input file is "-", standard input is
read.
- -f FILE
- --files-from=FILE
- Read the names of the input files from FILE instead of getting them
from the command line.
Note! Unlike xgettext from GNU Gettext, extractors
based on Locale::XGettext accept this option multiple times, so
that you can read the list of input files from multiple files.
- -D DIRECTORY
- --directory=DIRECTORY
- Add DIRECTORY to the list of directories. Source files are searched
relative to this list of directories. The resulting .po file will be
written relative to the current directory, though.
- -d NAME
- --default-domain=NAME
- Use NAME.po for output (instead of messages.po).
- -o FILE
- --output=FILE
- Write output to specified FILE (instead of
NAME.po or messages.po).
- -p DIR
- --output-dir=DIR
- Output files will be placed in directory DIR.
If the output file is - or /dev/stdout, the
output is written to standard output.
- --from-code=NAME
- Specifies the encoding of the input files. This option is needed only if
some untranslated message strings or their corresponding comments contain
non-ASCII characters.
By default the input files are assumed to be in ASCII.
Note! Some extractors have a fixed input character set, UTF-8
most of the time.
- -j
- --join-existing
- Join messages with existing files. This is a shortcut for adding the
output file to the list of input files. The output file is read, and then
all messages from other input files are added.
For obvious reasons, you cannot use this option if output is
written to standard output.
- -x FILE.po
- --exclude-file=FILE.po
- PO entries that are present in FILE.po are not extracted.
- -c TAG
- --add-comments=TAG
- Place comment blocks starting with TAG in the output if they
precede a keyword line.
- -c
- --add-comments
- Place all comment blocks that precede a keyword line in the output.
- -a
- --extract-all
- Extract all strings, not just the ones marked with keywords.
Not all extractors support this option!
- -k WORD
- --keyword=WORD
- Use WORD as an additional keyword.
Not all extractors support this option!
- -k
- --keyword
- Do not use default keywords! If you define your own keywords, you
usually give the option '--keyword' first without an argument to reset the
keyword list to empty, and then you give a '--keyword' option for every
keyword wanted.
Not all extractors support this option!
- --flag=WORD:ARG:FLAG
- Original explanation from GNU gettext:
Specifies additional flags for strings occurring as part
of the argth argument of the function word. The possible flags
are the possible format string indicators, such as ‘c-format’,
and their negations, such as ‘no-c-format’, possibly prefixed
with ‘pass-’.
The meaning of --flag=function:arg:lang-format is that in
language lang, the specified function expects as argth
argument a format string. (For those of you familiar with GCC function
attributes, --flag=function:arg:c-format is roughly equivalent to the
declaration ‘__attribute__ ((__format__ (__printf__, arg,
...)))’ attached to function in a C source file.) For example,
if you use the ‘error’ function from GNU libc, you can specify
its behaviour through --flag=error:3:c-format. The effect of this
specification is that xgettext will mark as format strings all gettext
invocations that occur as argth argument of function. This is
useful when such strings contain no format string directives: together with
the checks done by ‘msgfmt -c’ it will ensure that translators
cannot accidentally use format string directives that would lead to a crash
at runtime.
The meaning of --flag=function:arg:pass-lang-format
is that in language lang, if the function call occurs in a
position that must yield a format string, then its argth argument
must yield a format string of the same type as well. (If you know GCC
function attributes, the --flag=function:arg:pass-c-format option is
roughly equivalent to the declaration ‘__attribute__ ((__format_arg__
(arg)))’ attached to function in a C source file.) For
example, if you use the ‘_’ shortcut for the gettext function,
you should use --flag=_:1:pass-c-format. The effect of this specification is
that xgettext will propagate a format string requirement for a
_("string") call to its first argument, the literal
"string", and thus mark it as a format string. This is useful when
such strings contain no format string directives: together with the checks
done by ‘msgfmt -c’ it will ensure that translators cannot
accidentally use format string directives that would lead to a crash at
runtime.
Note that Locale::XGettext ignores the prefix pass-
and therefore most extractors based on Locale::XGettext will also
ignore it.
Individual extractors may define more language-specific
options.
- --force-po
- Write PO file even if empty. Normally, empty PO files are not written, and
existing output files are not overwritten if they would be empty.
- --no-location
- Do not write '#: filename:line' lines into the output PO files.
- -n
- --add-location
- Generate '#: filename:line' lines in the output PO files. This is the
default.
- -s
- --sort-output
- Sort output entries alphanumerically.
- -F
- --sort-by-file
- Sort output entries by source file location.
- --omit-header
- Do not write header with meta information. The meta information is
normally included as the "translation" for the empty string.
If you want to have a translation for an empty string, you
should also consider using message contexts.
- --copyright-holder=STRING
- Set the copyright holder to STRING in the output PO file.
- --foreign-user
- Omit FSF copyright in output for foreign user.
- --package-name=PACKAGE
- Set package name in output.
- --package-version=VERSION
- Set package version in output.
- --msgid-bugs-address=EMAIL@ADDRESS
- Set report address for msgid bugs.
- -m[STRING]
- --msgstr-prefix[=STRING]
- Use STRING or "" as prefix for msgstr values.
- -M[STRING]
- --msgstr-suffix[=STRING]
- Use STRING or "" as suffix for msgstr values.
- -h
- --help
- Display short help and exit.
- -V
- --version
- Output version information and exit.
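The way the options -d, -o, and -p determine the output location can be
sketched in a few lines of Perl. This only illustrates the rules described
above and is not the module's actual code: the helper name
output_location() is made up, the hash keys assume the option names with
hyphens replaced by underscores, and the question of how an absolute FILE
combines with -p is deliberately left out.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper, not part of Locale::XGettext: derive the output
# location from a hash of options.
sub output_location {
    my ($options) = @_;

    # -o FILE wins; otherwise fall back to NAME.po or messages.po.
    my $file = $options->{output};
    $file = ($options->{default_domain} // 'messages') . '.po'
        unless defined $file;

    # Both "-" and "/dev/stdout" mean standard output.
    return '-' if $file eq '-' || $file eq '/dev/stdout';

    # -p DIR places the output file into DIR.
    return defined $options->{output_dir}
        ? File::Spec->catfile($options->{output_dir}, $file)
        : $file;
}

print output_location({}), "\n";
print output_location({ default_domain => 'myapp' }), "\n";
print output_location({ output => 'myapp.pot', output_dir => 'po' }), "\n";
```

The sketch returns "-" for standard output so that a caller can decide in
one place whether to open a file or to print to STDOUT.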
Writing a complete extractor script in Perl with Locale::XGettext is as
simple as:
    #! /usr/bin/env perl

    use strict;
    use warnings;
    use Locale::Messages qw(setlocale LC_MESSAGES);
    use Locale::TextDomain qw(YOURTEXTDOMAIN);
    use Locale::XGettext::YOURSUBCLASS;

    # Parse the command line, extract the strings, and write the PO file.
    Locale::Messages::setlocale(LC_MESSAGES, "");
    Locale::XGettext::YOURSUBCLASS->newFromArgv(\@ARGV)->run->output;
Writing the extractor class is also trivial:
    package Locale::XGettext::YOURSUBCLASS;

    use strict;
    use warnings;

    use base 'Locale::XGettext';

    sub readFile {
        my ($self, $filename) = @_;

        # search_for_strings_in() is a placeholder for your own parser.
        foreach my $found (search_for_strings_in($filename)) {
            $self->addEntry({
                msgid => $found->{string},
                # More possible fields following, see
                # addEntry() below!
            }, $found->{possible_comment});
        }

        # The return value is actually ignored.
        return $self;
    }

    1;
All the heavy lifting happens in the method
readFile() that you have to implement yourself. All
other methods are optional.
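The helper search_for_strings_in() in the class above is only a
placeholder. As a rough sketch of what such a parser could look like,
assume a made-up input format in which translatable strings are written as
_("...") and a line starting with "#" is a comment. Both the format and
the field "lineno" are assumptions for illustration; the fields "string"
and "possible_comment" follow the example above.

```perl
use strict;
use warnings;

# Hypothetical parser for an invented input format. Strings are returned
# as written in the file, escape sequences are not resolved.
sub search_for_strings_in {
    my ($filename) = @_;

    open my $fh, '<', $filename
        or die "Cannot open '$filename': $!\n";

    my @found;
    my $comment;
    while (my $line = <$fh>) {
        if ($line =~ /^\s*#\s?(.*)/) {
            # Remember the comment, it may precede a translatable string.
            $comment = $1;
            next;
        }

        # Collect every _("...") on this line.
        while ($line =~ /_\("((?:[^"\\]|\\.)*)"\)/g) {
            push @found, {
                string           => $1,
                lineno           => $.,
                possible_comment => $comment,
            };
        }

        # A comment only applies to the line directly following it.
        undef $comment;
    }
    close $fh;

    return @found;
}
```

Every comment found is passed on unconditionally because addEntry()
decides whether a comment makes it into the output.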
See the section "METHODS" below for information on how
to additionally modify the behavior of your extractor.
- new $OPTIONS, @FILES
- OPTIONS is a hash reference containing the above command-line
options but with every hyphen replaced by an underscore. You should
normally not use this constructor!
- newFromArgv $ARGV
- ARGV is a reference to an array of command-line arguments that is
passed verbatim to Getopt::Long::GetOptionsFromArray. After
processing all options and arguments, the constructor
new() above is then invoked with the cooked
command-line arguments.
This is the constructor that you should normally use in custom
extractors that you write.
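The relationship between the two constructors can be illustrated with
Getopt::Long alone. The following sketch shows how command-line arguments
may be turned into an options hash whose keys have every hyphen replaced
by an underscore. The function cook_arguments() and the short list of
option specifications are assumptions for illustration, not the code of
newFromArgv().

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical stand-in for the argument processing of a constructor
# like newFromArgv(). Only a few options are declared here.
sub cook_arguments {
    my ($argv) = @_;

    # This sketch works on a copy and leaves the caller's array alone.
    my @args = @$argv;

    my %raw;
    GetOptionsFromArray(\@args, \%raw,
        'output|o=s',
        'default-domain|d=s',
        'files-from|f=s@',    # may be given multiple times
        'join-existing|j',
    ) or die "Invalid command-line options\n";

    # Replace every hyphen in the option names with an underscore.
    my %options;
    foreach my $name (keys %raw) {
        (my $key = $name) =~ tr/-/_/;
        $options{$key} = $raw{$name};
    }

    # Whatever is left over are the input files.
    return (\%options, @args);
}

my ($options, @files) = cook_arguments([
    '--default-domain=myapp', '-f', 'list1.txt', '-f', 'list2.txt',
    '-j', 'index.txt',
]);
print "output domain: $options->{default_domain}\n";
print "input files: @files\n";
```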
Locale::XGettext is an abstract base class. All public methods may be
overridden by subclassed extractors.
- readFile FILENAME
- You have to implement this method yourself. In it, read FILENAME,
extract all relevant entries, and call addEntry() for
each entry found.
The method is not invoked for filenames ending in
".po" or ".pot"! For those files,
readPO() is invoked instead.
This method is the only one that you have to implement!
- addEntry ENTRY[, COMMENT]
- You should invoke this method for every entry found.
COMMENT is an optional comment that you may have
extracted along with the message. Note that
addEntry() checks whether this comment should make
it into the output. Therefore, just pass any comment that you have found
preceding the keyword.
ENTRY should be a reference to a hash with these
possible keys:
- msgid
- The entry's message id.
- msgid_plural
- A possible plural form.
- msgctxt
- A possible message context.
- reference
- A source reference in the form "FILENAME: LINENO".
- flags
- Set a flag for this entry, for example "perl-brace-format" or
"no-perl-brace-format". You can comma-separate multiple
flags.
- keyword
- The keyword that triggered the entry. If you set this property and the
keyword definition contained an automatic comment, the comment will be
added. You can try this out like this:
xgettext-my.pl --keyword=greet:1,'"Hello, world!"'
If you set keyword to "greet", the comment
"Hello, world!" will be added. Note that the "double
quotes" are part of the command-line argument!
Likewise, if "--flag" was specified on the
command-line or the extractor ships with default flags, entries matching
the flag definition will automatically have this flag.
You can try this out with:
xgettext-my.pl --keyword="greet:1" --flag=greet:1:hello-format
Now all PO entries for the keyword "greet" will have
the flag "hello-format".
- fuzzy
- True if the entry is fuzzy. There is no reason to use this in string
extractors because they typically produce .pot files without
translations.
- automatic
- Sets an automatic comment, not recommended. Rather set the keyword (see
above) and let Locale::XGettext set the comment as
appropriate.
Instead of a hash you can currently also pass a Locale::PO
object. This may no longer be supported in the future. Do not use!
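Flag definitions such as "greet:1:hello-format" have the form
WORD:ARG:FLAG. The sketch below takes such a string apart and recognizes
the prefixes "pass-" and "no-". The helper parse_flag_definition() and the
keys of the returned hash are made up for illustration; this is not the
interface of Locale::XGettext::Util::Flag.

```perl
use strict;
use warnings;

# Hypothetical helper: split a flag definition WORD:ARG:FLAG into parts.
# Returns nothing if the definition is malformed.
sub parse_flag_definition {
    my ($definition) = @_;

    # Match from the right so that WORD itself may contain colons.
    my ($function, $arg, $flag)
        = $definition =~ /^(.+):([1-9][0-9]*):([^:]+)$/
        or return;

    # Strip the optional prefixes, "pass-" first, then "no-".
    my $pass = $flag =~ s/^pass-// ? 1 : 0;
    my $no   = $flag =~ s/^no-//   ? 1 : 0;

    return {
        function => $function,
        arg      => $arg,
        flag     => $flag,
        no       => $no,
        pass     => $pass,
    };
}

my $parsed = parse_flag_definition('greet:1:no-hello-format');
print "$parsed->{function}, argument $parsed->{arg}: ",
      ($parsed->{no} ? 'not ' : ''), "$parsed->{flag}\n";
```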
- keywords
- Return a hash reference with all keyword definitions as
Locale::XGettext::Util::Keyword objects.
- keywordOptionStrings
- Return a reference to an array with all keyword definitions as option
strings suitable for the command-line option "--keyword".
- flags
- Return an array reference with all flag definitions as
Locale::XGettext::Util::Flag objects.
- flagOptionStrings
- Return a reference to an array with all flag definitions as option strings
suitable for the command-line option "--flag".
- options
- Get all command-line options as a hash reference.
- option OPTION
- Get the value for command line option OPTION.
- setOption OPTION, VALUE
- Set the value for command line option OPTION to VALUE.
- languageSpecificOptions
- The default implementation returns nothing.
Your own implementation can return a reference to an array of
arrays, each of them containing one option specification consisting of
four items:
- The option specification for Getopt::Long(3pm), for example
"f|filename=s" for an option expecting a mandatory string
argument.
- The name of the option. This is what gets passed to option() above.
It should generally be the long option name with hyphens converted to
underscores.
- The option description for the usage information, for example "-f,
--files=STRING" for options taking arguments or something like "
--verbose" for long-only options. This is printed in the left column,
when you invoke your extractor with "--help".
- The description of this option. This is printed in the right column, when
you invoke your extractor with "--help".
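As an illustration of the four items, the following sketch builds such an
array of arrays for two invented options and shows which item serves which
purpose. The options "--tag" and "--strip-comments" and the function name
are assumptions; in a real subclass the data would be returned by the
method languageSpecificOptions().

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Made-up options for an imaginary extractor.
sub language_specific_options {
    return [
        [
            'tag|t=s',               # 1. Getopt::Long specification
            'tag',                   # 2. name passed to option()
            '-t, --tag=STRING',      # 3. left column of --help
            'only extract strings marked with STRING',    # 4. right column
        ],
        [
            'strip-comments',
            'strip_comments',
            '    --strip-comments',
            'ignore all comments in the input',
        ],
    ];
}

my $specs = language_specific_options();

# The first item of each specification is what Getopt::Long consumes.
my @args = ('--strip-comments', '-t', 'i18n', 'input.txt');
my %raw;
GetOptionsFromArray(\@args, \%raw, map { $_->[0] } @$specs)
    or die "Invalid command-line options\n";

# The third and fourth items form the two columns of the help output.
my @help = map { sprintf "  %-24s %s", $_->[2], $_->[3] } @$specs;
print "$_\n" foreach @help;
```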
- printLanguageSpecificOptions
- Prints all language-specific options to standard output and calls
languageSpecificOptions() internally. This is used for the output
for the option "--help".
- fileInformation
- Returns nothing by default. You can return a string describing the
expected input format, when invoked with "--help".
- versionInformation
- Returns nothing by default. You can return a string that is printed, when
invoked with "--version".
- bugTrackingAddress
- Returns nothing by default. You can return a string describing the bug
tracking address, when invoked with "--help".
- canExtractAll
- Returns false by default. Return a truthy value if your extractor supports
the option "--extract-all".
- canKeywords
- Returns true by default. Return a false value if your extractor does not
support the option "--keyword".
- canFlags
- Returns true by default. Return a false value if your extractor does not
support the option "--flag".
- needInputFiles
- Returns true by default. Return a false value if your extractor does not
support input from files. In this case you should implement
extractFromNonFiles().
- programName
- Return the name of the program for usage and help information. Defaults to
just $0 but you can return another value
here.
- run
- Runs the extractor once. The default implementation scans all input
sources for translatable strings and collects them.
- output
- Print the output as a PO file to the specified output location.
- extractFromNonFiles
- This method is invoked after all input files have been processed. The
default implementation does nothing. You may use the method for extracting
strings from additional sources like a database.
- resolveFilename FILENAME
- Given an input filename FILENAME the method returns the absolute
location of the file. The default implementation honors the option
"-D, --directory".
- defaultKeywords
- Returns a reference to an empty array.
Subclasses may return a reference to an array with default
keyword definitions for the specific language. The default keywords
(actually just a subset of them) for the language C would look like this
(expressed in JSON):
[
"gettext:1",
"ngettext:1,2",
"pgettext:1c,2",
"npgettext:1c,2,3"
]
See above the description of the command-line option
"--keyword" for more information about the meaning of these
strings.
- defaultFlags
- Returns a reference to an empty array.
Subclasses may return a reference to an array with default
flag specifications for the specific language. An example may look like
this (expressed in JSON):
[
"gettextx:1:perl-brace-format",
"ngettextx:1:perl-brace-format",
"ngettextx:2:perl-brace-format"
]
We assume that "gettextx()" and
"ngettextx()" are keywords for the language in question. The
above default flag definition would mean that in all invocations of the
function "gettextx()", the 1st argument would get the
flag "perl-brace-format". In all invocations of
"ngettextx()", the 1st and 2nd argument would get the
flag "perl-brace-format".
You can prefix the format with "no-" which tells the
GNU gettext tools that the particular string never uses that format.
You can additionally prefix the format with "pass-"
but this is ignored by Locale::XGettext. If you want to implement the GNU
xgettext behavior for the "pass-" prefix, you have to
implement it yourself in your extractor.
- recodeEntry ENTRY
- Gets invoked for every PO entry but after it has been promoted to a
Locale::PO(3pm) object. The implementation of this
method is likely to be changed in the future.
Do not use!
- readPO FILENAME
- Reads FILENAME as .po or .pot file. There is no reason why you
should override or invoke this method.
- po
- Returns a list of PO entries represented by hash references. Do not use or
override this method!
- printLanguageSpecificUsage
- Prints the help for language-specific options. Override it, if you are not
happy with the formatting.
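The keyword definition strings accepted by "--keyword" and returned by
defaultKeywords(), for example "npgettext:1c,2,3", follow the conventions
of GNU xgettext: a number followed by "c" marks the argument holding the
message context, plain numbers mark the singular and the plural form, a
number followed by "t" gives the total number of arguments, and a
double-quoted string is an automatic comment. A minimal parser sketch,
assuming function names without colons, could look as follows. The helper
parse_keyword_definition() and the keys of its result are hypothetical and
unrelated to the interface of Locale::XGettext::Util::Keyword.

```perl
use strict;
use warnings;

# Hypothetical helper: take a keyword definition apart.
sub parse_keyword_definition {
    my ($definition) = @_;

    my ($function, $args) = split /:/, $definition, 2;
    # A bare function name means: the first argument is the msgid.
    $args = '1' unless defined $args && length $args;

    my %keyword = (function => $function, forms => []);

    # Take out the automatic comment first because it may contain commas.
    $keyword{comment} = $1 if $args =~ s/"(.*)"//;

    foreach my $token (grep { length } split /,/, $args) {
        if ($token =~ /^([1-9][0-9]*)c$/) {
            $keyword{context} = $1;             # "1c": message context
        } elsif ($token =~ /^([1-9][0-9]*)t$/) {
            $keyword{total_args} = $1;          # "3t": total argument count
        } elsif ($token =~ /^[1-9][0-9]*$/) {
            push @{$keyword{forms}}, $token;    # singular, then plural
        } else {
            die "Invalid keyword definition '$definition'\n";
        }
    }

    return \%keyword;
}

my $keyword = parse_keyword_definition('npgettext:1c,2,3');
print "context: $keyword->{context}, forms: @{$keyword->{forms}}\n";
```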
Copyright (C) 2016-2017 Guido Flohr <guido.flohr@cantanea.com>, all rights
reserved.
Getopt::Long, xgettext(1), perl