NAME
Locale::Recode - Object-Oriented Portable Charset Conversion

SYNOPSIS
    use Locale::Recode;

    $cd = Locale::Recode->new (from => 'UTF-8', to => 'ISO-8859-1');
    die $cd->getError if $cd->getError;

    $cd->recode ($text) or die $cd->getError;

    $mime_name = Locale::Recode->resolveAlias ('latin-1');

    $supported = Locale::Recode->getSupported;

    $complete = Locale::Recode->getCharsets;

DESCRIPTION
This module provides routines that convert textual data from one codeset to another in a portable way. The module was started before Encode(3) was written. Its main purpose today is to provide charset conversion even when Encode(3) is not available on the system. It should also work for older Perl versions without Unicode support.

Internally, Locale::Recode(3) uses Encode(3) whenever possible, to allow for faster conversion and a wider range of supported charsets. It only falls back to the pure Perl implementation when Encode(3) is not available or does not support a particular charset that Locale::Recode(3) does.

Locale::Recode(3) is part of libintl-perl, and its main purpose is actually to implement a portable charset conversion framework for the message translation facilities described in Locale::TextDomain(3).

CONSTRUCTOR
The constructor "new()" requires two named arguments, "from" and "to", which name the source and target charsets as shown in the SYNOPSIS.
The constructor will never fail. In case of an error, the object's internal state is set to bad and it will refuse to do any conversions. You can query the reason for the failure with the method getError().

OBJECT METHODS
The following object methods are available.
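The SYNOPSIS uses two object methods, recode() and getError(). The following is a minimal sketch of how they might be combined, not a definitive reference: it assumes that recode() converts its argument in place and returns a true value on success, which is what the "recode (...) or die" idiom in the SYNOPSIS suggests. The sample input string is purely illustrative.

```perl
use strict;
use warnings;
use Locale::Recode;

# new() never fails outright, so the error state must be checked explicitly.
my $cd = Locale::Recode->new(from => 'UTF-8', to => 'ISO-8859-1');
die $cd->getError if $cd->getError;

# "caf\xc3\xa9" is the UTF-8 byte sequence for "café" (illustrative input).
my $text = "caf\xc3\xa9";

# Assumption: recode() converts $text in place and returns a true value
# on success, as the SYNOPSIS idiom "recode (...) or die" suggests.
$cd->recode($text) or die $cd->getError;

# Print the converted characters as hex codes for inspection.
print join(' ', map { sprintf '%02x', ord } split //, $text), "\n";
```

Because the constructor never dies, forgetting the getError() check after new() would only surface the problem later, when recode() refuses to convert.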

CLASS METHODS
The module provides some additional class methods; the SYNOPSIS shows resolveAlias(), getSupported(), and getCharsets().
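The class methods from the SYNOPSIS can be tried out as sketched below. The return types are assumptions made for this illustration: resolveAlias() is taken to return the resolved name or an undefined value, and getSupported() and getCharsets() are taken to return array references. The code checks each assumption at run time instead of relying on it.

```perl
use strict;
use warnings;
use Locale::Recode;

# Resolve the alias used in the SYNOPSIS. Whether a given name resolves
# depends on the module's alias tables, so test the result before use.
my $mime_name = Locale::Recode->resolveAlias('latin-1');
if (defined $mime_name) {
    print "latin-1 resolves to $mime_name\n";
} else {
    print "latin-1 is not a known alias\n";
}

# Assumption: both methods return array references. The ref() check keeps
# the sketch from dying if that assumption does not hold.
my $supported = Locale::Recode->getSupported;
my $complete  = Locale::Recode->getCharsets;

for my $pair (['getSupported', $supported], ['getCharsets', $complete]) {
    my ($method, $list) = @$pair;
    if (ref $list eq 'ARRAY') {
        printf "%s: %d entries\n", $method, scalar @$list;
    } else {
        print "$method returned something other than an array reference\n";
    }
}
```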

SUPPORTED CHARSETS
The range of supported charsets is system-dependent. The following somewhat special charsets are always available:
Locale::Recode(3) has native support for a plethora of other encodings, most of them 8-bit encodings that are fast to decode. These include most encodings used on popular micros, such as the ISO-8859-* series, most Windows-* encodings (also known as CP*), Macintosh, Atari, etc.

NAMES AND ALIASES
Each charset or encoding is available internally under a unique name. Whenever the information was available, the preferred MIME name (see <http://www.iana.org/assignments/character-sets/>) was chosen as the internal name.

Alias handling is quite strict. The module does not make wild guesses at what you mean ("What's the meaning of the acronym JIS" is a valid alias for "7bit-jis" in Encode(3) ....) but aims at providing common aliases only. The same applies to so-called aliases that are really mistakes, like "utf8" for UTF-8.

The module knows all aliases that are listed with the IANA character set registry (<http://www.iana.org/assignments/character-sets/>), plus those known to libiconv version 1.8, and a bunch of additional ones.

CONVERSION TABLES
The conversion tables have been taken from official sources like the IANA or the Unicode Consortium, from Bruno Haible's libiconv, or from the sources of the GNU libc. The regression tests for libintl-perl check for conformance with these sources. For some encodings this data differs from Encode(3)'s data, which would cause these tests to fail. In these cases, the module will not invoke the Encode(3) methods, but will fall back to the internal implementation for the sake of consistency.

The few encodings that are affected are so simple that you will not experience any real performance penalty unless you convert large chunks of data. But the package is not really intended for such use anyway, and since Encode(3) is relatively new, I rather think that the differences are bugs in Encode which will be fixed soon.

BUGS
The module should provide fallback conversions for other Unicode encoding schemes like UCS-2 and UCS-4 (big- and little-endian).

The pure Perl UTF-8 decoder will not always handle corrupt UTF-8 correctly, especially at the end and at the beginning of the string. This is not likely to be fixed, since the module's intention is not to be a consistency checker for UTF-8 data.

AUTHOR
Copyright (C) 2002-2017 Guido Flohr <http://www.guido-flohr.net/> (<mailto:guido.flohr@cantanea.com>), all rights reserved. See the source code for details!

SEE ALSO
Encode(3), iconv(3), iconv(1), recode(1), perl(1)