Encode::ISO2022 - ISO/IEC 2022 character encoding scheme
package FooEncoding;
use base qw(Encode::ISO2022);
__PACKAGE__->Define(
Name => 'foo-encoding',
CCS => [ {...CCS one...}, {...CCS two...}, ....]
);
This module provides a character encoding scheme (CES) that switches among
multiple coded character sets (CCSs).
The class method Define() takes the following arguments.
- Alias => REGEX
- A regular expression matching aliases of this encoding, if any.
- Name => STRING
- The name of this encoding as an Encode::Encoding object. Mandatory.
- CCS => [ FEATURE, FEATURE, ...]
- List of features defining the CCSs used by this encoding. Mandatory. Each
item is a hash reference containing the following items.
- bytes => NUMBER
- Number of bytes to represent each character. Default is 1.
- cl => BOOLEAN
- If set to a true value, this CCS includes mappings to/from code points
between 0/0 and 1/15. There should be one CCS with this flag so that a broken
designation can be reset.
- dec_only => BOOLEAN
- If set to a true value, this CCS will be used only for decoding.
- encoding => STRING | ENCODING
- An Encode::Encoding object used as the CCS, or its name. Mandatory.
Encodings used as CCSs must provide "raw" conversion: that is, they must be
stateless, fixed-length conversions over 94^n or 96^n code tables.
Encode::ISO2022::CCS lists the available CCSs.
- g => STRING
- g_init => STRING
- Working set this CCS may be designated to: 'g0', 'g1', 'g2' or 'g3'.
If "g_init" is set, this CCS will be designated implicitly at the beginning
of conversion, and explicitly at the end of conversion.
If "g" or "g_init" is set and neither "ls" nor "ss" is set, this CCS will be
invoked when it is designated.
If none of "g", "g_init", "ls" and "ss" is set, this CCS is always invoked.
- g_seq => STRING
- Escape sequence to designate this CCS, if it can be designated
explicitly.
- gr => BOOLEAN
- If set to a true value, this CCS will be invoked to GR using a 7-bit
conversion table.
- ls => STRING
- ss => STRING
- Escape sequence or control character to invoke this CCS, if it should be
invoked explicitly.
If "ls" is set, this CCS will be invoked by locking shift. If "ss" is set,
this CCS will be invoked by single shift.
- range => STRING
- Possible range of encoded bytes. Typical values are '\x21-\x7E',
'\x20-\x7F', '\xA1-\xFE' or '\xA0-\xFF'. This is required for multibyte CCSs
to detect broken multibyte sequences.
- LineInit => BOOLEAN
- If true, designation and invocation states will be initialized at the
beginning of each line.
- SubChar => STRING
- Unicode string to be used as the substitution character.
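The arguments above can be combined as in the following minimal sketch. It is
an illustration only, not a definition shipped with this module: the package
name My::SketchEncoding and the encoding name 'x-sketch-2022' are
hypothetical, and it is assumed that the 'ascii' and 'jis0208-raw' encodings
from the core Encode distribution qualify as "raw" conversions here. The
escape sequences are the ISO/IEC 2022 designations of ASCII and JIS X 0208
to G0.

```perl
package My::SketchEncoding;    # hypothetical package name
use strict;
use warnings;
use base qw(Encode::ISO2022);

__PACKAGE__->Define(
    Name => 'x-sketch-2022',   # hypothetical encoding name
    CCS  => [
        {   # ASCII: carries the C0 controls (cl) and is the initial
            # designation of G0 (g_init).
            encoding => 'ascii',          # assumed to be usable as a CCS
            cl       => 1,
            g_init   => 'g0',
            g_seq    => "\e(B",           # ESC ( B
        },
        {   # JIS X 0208: double-byte set, designated to G0 on demand (g).
            encoding => 'jis0208-raw',    # assumed to be usable as a CCS
            bytes    => 2,
            g        => 'g0',
            g_seq    => "\e\$B",          # ESC $ B
            range    => '\x21-\x7E',      # lets broken pairs be detected
        },
    ],
    SubChar => "\x{3013}",     # illustrative choice: GETA MARK
);

1;
```

Neither CCS sets "ls" or "ss", so each one is invoked as soon as it is
designated, as described under "g" and "g_init".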
For a fuller example of how to use this module, see the source of
Encode::ISO2022JP2.
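To see what designation looks like on the wire, the following sketch encodes
a short string and prints the escape sequences visibly. Note the assumption:
it uses the 'iso-2022-jp' encoding bundled with the core Encode distribution,
not an encoding defined through this module, so it only illustrates the
general behaviour described for "g_init" and "g_seq".

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# An ASCII letter followed by U+3042 HIRAGANA LETTER A.
my $text  = "a\x{3042}";
my $bytes = encode('iso-2022-jp', $text);

# Make the ESC control character visible before printing.
(my $visible = $bytes) =~ s/\e/<ESC>/g;
print "$visible\n";

# Decoding the byte stream should give back the original string.
my $back = decode('iso-2022-jp', $bytes);
print $back eq $text ? "round trip ok\n" : "round trip failed\n";
```

With core Encode the first line printed is a<ESC>$B$"<ESC>(B. The leading "a"
needs no escape sequence because ASCII is designated implicitly at the start,
ESC $ B designates JIS X 0208 before the two-byte character, and ESC ( B
designates ASCII again explicitly at the end.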
This module implements a small subset of the features defined by ISO/IEC 2022.
Each encoding recognizes only several predefined designation and invocation
functions. It can handle only a limited number of coded character sets.
Variable-length multibyte coded character sets are not supported. Other
limitations exist as well.
ISO/IEC 2022 Information technology - Character code structure and extension
techniques.
Encode, Encode::ISO2022::CCS.
Hatuka*nezumi - IKEDA Soji, <nezumi@cpan.org>
Copyright (C) 2013 by Hatuka*nezumi - IKEDA Soji
This program is free software; you can redistribute it and/or
modify it under the same terms as Perl itself.