NAME
Config::Wrest - Read and write Configuration data With References, Environment variables, Sections, and Templating

SYNOPSIS
    use Config::Wrest;
    my $c = new Config::Wrest();

    # Read configuration data from a string, or from a reference to a string
    my $vars;
    $vars = $c->deserialize($string);
    $vars = $c->deserialize(\$string);

    # Write configuration data as a string
    my $string = $c->serialize(\%vars);
    # ...write the data into a specific scalar
    $c->serialize(\%vars, \$string);

    # Convenience methods to interface with files
    $vars = $c->parse_file($filename);
    $c->write_file($filename, \%vars);

DESCRIPTION
This module allows you to read configuration data written in a human-readable and easily-editable text format and access it as a perl data structure. It also allows you to write configuration data from perl back to this format.

The data format allows key/value pairs, comments, escaping of unprintable or problematic characters, sensible whitespace handling, support for Unicode data, nested sections, or blocks, of configuration data (analogous to hash- and array-references), and the optional preprocessing of each line through a templating engine. If you choose to use a templating engine then, depending on the engine you're using, you can interpolate other values into the data, interpolate environment variables, and perform other logic or transformations. The data format also allows you to use directives to alter the behaviour of the parser from inside the configuration file, to set variables, to include other files, and for other actions.

Here's a brief example of some configuration data. Note the use of quotes, escape sequences, and nested blocks:

    Language = perl
    <imageinfo>
        width = 100     # This is an end-of-line comment
        height 100
        alt_text " square red image, copyright %A9 2001 "
        <Nestedblock>
            colour red
        </>
        [Suffixes]
            .jpg
            .jpeg
        [/]
    </imageinfo>
    @include path/to/file.cfg
    [Days]
        Sunday
        Can%{2019}t
        'Full Moon'
        <weekend>
            length 48h
        </>
        # and so on... This is a full-line comment
    [/]

This parses to the perl data structure:

    {
        Language => 'perl',
        imageinfo => {
            width => '100',
            height => '100',
            alt_text => " square red image, copyright \xA9 2001 ",
            Nestedblock => {
                colour => 'red'
            },
            Suffixes => [
                '.jpg',
                '.jpeg'
            ],
        },
        Days => [
            'Sunday',
            "Can\x{2019}t", # note the Unicode character in this string
            'Full Moon',
            {
                'length' => '48h'
            }
        ],
        # ...and of course, whatever data was read from the included file "path/to/file.cfg"
    }

Of course, your configuration data may not need to use any of those special features, and might simply be key/value pairs:

    Basedir /usr/local/myprogram
    Debug 0
    Database IFL1

This parses to the perl data structure:

    {
        Basedir => '/usr/local/myprogram',
        Debug => '0',
        Database => 'IFL1',
    }

These data structures can be serialized back to a textual form using this module.

For details of the data format see "DATA FORMAT" and "DIRECTIVES". Also see "CONSTRUCTOR OPTIONS" for options which affect the parsing of the data. All file input and output goes through File::Slurp::WithinPolicy.

MODULE NAME
Although the "Wrest" in the module's name is an abbreviation for its main features, it also means "a key to tune a stringed instrument" or "active or moving power". (Collaborative International Dictionary of English) You can also think of it as wresting your configuration data from human-readable form into perl.

METHODS
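As a minimal sketch of the methods shown in the SYNOPSIS, the following round-trips the simple key/value example through deserialize() and serialize(). It assumes Config::Wrest is installed; the variable names are illustrative only.

```perl
use strict;
use warnings;
use Config::Wrest;

# Configuration text taken from the simple key/value example in DESCRIPTION
my $text = join "\n",
    'Basedir /usr/local/myprogram',
    'Debug 0',
    'Database IFL1',
    '';

my $c = Config::Wrest->new();

# Parse the text into a hash reference
my $vars = $c->deserialize($text);

# Turn the data structure back into configuration text
my $out = $c->serialize($vars);

# Parsing the serialized text again should give back the same values
my $again = $c->deserialize($out);

print "$_ => $again->{$_}\n" for sort keys %$again;
```

parse_file() and write_file() follow the same pattern but take a filename, as shown in the SYNOPSIS.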
CONSTRUCTOR OPTIONS
These are the options that can be supplied to the constructor. Some may meaningfully be modified by the @option directive, namely the UseQuotes, Escapes, Subs and TemplateBackend options. Some of these options are turned on by default.
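A minimal sketch of supplying options to the constructor. It assumes that new() accepts options as a list of name/value pairs, which this section does not spell out, and that Config::Wrest is installed. Escapes and UseQuotes are set explicitly so that the example does not rely on their defaults.

```perl
use strict;
use warnings;
use Config::Wrest;

# ASSUMPTION: options are passed to new() as name => value pairs
my $c = Config::Wrest->new(
    Escapes   => 1,   # decode %XX and %{XXXX} escape sequences
    UseQuotes => 1,   # strip the quotes around a quoted value
);

# Both lines are taken from examples elsewhere in this document
my $vars = $c->deserialize(qq{Colour %23FF9900\nname = "Scott Tiger"\n});

print "Colour: $vars->{Colour}\n";
print "name:   $vars->{name}\n";
```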
DATA FORMAT
The data is read line-by-line. Comments are stripped and blank lines are ignored. You can't have multiple elements (key/value pairs, values in a list block, block opening tags, block closing tags, or directives) on a single line - you may only have one such element per line. Both the newline and carriage return characters (\n and \r) are considered as line breaks, and hence configuration files can be read and written across platforms (see "UNICODE HANDLING").

Data is stored in two ways: as key/value pairs, or as individual values when inside a "list block". Hash or list blocks may be nested inside other blocks to arbitrary depth.

KEY VALUE PAIRS
Lines such as these are used at the top level of the configuration file, or inside "HASH BLOCKS". The line simply has a key and a value, separated by whitespace or an '=' sign:

    colour=red
    name = "Scott Tiger"
    Age 23
    Address foo%40example.com

The 'key' can consist of "\w" characters, "." and "-". VALUE can include anything but a '#' to the end of the line. See Escapes and UseQuotes in "CONSTRUCTOR OPTIONS".

SINGLE VALUES
Lines such as these are used inside "LIST BLOCKS". The value is simply given:

    Thursday
    "Two Step"
    apple%{2019}s

These may not begin with these characters: '[', '<', '(', '{', ':', '@', '%', '/' because they are the first thing in a line and such characters would be confused with actual tags and reserved characters. See Escapes and UseQuotes in "CONSTRUCTOR OPTIONS" if your value begins with any of these, or if you want to include whitespace.

COMMENTS
Comments may be on a line by themselves:

    # Next line is for marketing...
    Whiteness = Whizzy Whiteness!

or at the end of a line:

    Style=Loads of chrome   # that's what marketing want

Note that everything following a '#' character (in Unicode that's called a "NUMBER SIGN") is taken to be a comment, so if you want to have an actual '#' in your data you must have the Escapes option turned on (see "CONSTRUCTOR OPTIONS") and write it as an escape sequence, e.g.:

    Colour %23FF9900

This applies even if the '#' is in the middle of a quoted string, so:

    Foo "bar#baz" # a comment

is equivalent to:

    Foo "bar

HASH BLOCKS
A block which contains "KEY VALUE PAIRS", or other blocks. They look like:

    <Blockname>
        colour red
        # contents go here
    </Blockname>

For convenience you can omit the block's name in the closing tag, like this:

    <Anotherblock>
        Age 23
        # contents go here
    </>

The name of the block can consist of "\w" characters, "." and "-".

LIST BLOCKS
A block which contains a list of "SINGLE VALUES", or other blocks. They look like:

    [Instruments]
        bass
        guitar
    [/Instruments]

and you can omit the name in the closing tag if you wish:

    # ...
        guitar
    [/]

The name of the block can consist of "\w" characters, "." and "-".

WHITESPACE RULES
In "KEY VALUE PAIRS" the '=' between the Name and Value is optional, but it can have whitespace before and/or after it. If there's no '=' you need whitespace to separate the Name and Value.

Block opening and closing tags cannot have whitespace inside them.

Lines may be indented by arbitrary whitespace. Trailing whitespace is stripped from values (but see the UseQuotes and Escapes entries in "CONSTRUCTOR OPTIONS").

ESCAPING
Sometimes you want to specify data with characters that are unprintable, hard to type or have special meaning to Config::Wrest. You can escape such characters using two forms. Firstly, the '%' symbol followed by two hex digits, e.g. %A9, for characters up to 255 decimal. Secondly, you can write '%' followed by any hex number in braces, e.g. "%{201c}", to specify any character by its Unicode code point.

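The two escape forms can be illustrated with a small decoder. This is a sketch of the documented syntax only, not the module's own code, and decode_escapes is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: expand %XX and %{X...} escapes into characters.
# The braced form is tried first; otherwise exactly two hex digits are used.
sub decode_escapes {
    my ($text) = @_;
    $text =~ s/%(?:\{([0-9a-fA-F]+)\}|([0-9a-fA-F]{2}))/chr(hex(defined $1 ? $1 : $2))/ge;
    return $text;
}

# Inputs taken from the examples in this document
my $copyright = decode_escapes('copyright %A9 2001');
my $quoted    = decode_escapes('Can%{2019}t');
my $address   = decode_escapes('foo%40example.com');

# Encode output as UTF-8 so the wide character prints without a warning
binmode STDOUT, ':encoding(UTF-8)';
print "$_\n" for $copyright, $quoted, $address;
```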
See 'Escapes' under "CONSTRUCTOR OPTIONS".

DIRECTIVES
The configuration file itself can contain lines which tell the parser how to behave. All directive lines begin with an '@'. For example you can turn on the URL-style escaping, you can set variables, and so on. These are recognized directives:
UNICODE HANDLING
This section has been written from the point-of-view of perl 5.8, although the concepts translate to perl 5.6's slightly different Unicode handling.

First it's important to differentiate between configuration data that is given to deserialize() as a string which contains wide characters (i.e. code point >255), and data which contains escape sequences for wide characters. Escape sequences can only occur in certain places, whereas actual wide characters can be used in key names, block names, directives and in values. This is because the parser uses regular expressions which use metacharacters such as "\w", and these can match against some wide characters.

Although you can use wide characters in directives, it may make no sense to try to "@include" a filename which contains wide characters.

Configuration data will generally be read from or written to a file at some stage. You should be aware that File::Slurp::WithinPolicy uses File::Slurp which reads files in byte-oriented fashion. If this is not what you want, e.g. if your config files contain multi-byte characters such as UTF-8, then you should either read/write the file yourself using the appropriate layer in the arguments to open(), or use the Encode module to go between perl's Unicode-based strings and the required encoding (e.g. your configuration files may be stored on disk as ISO-8859-1, but you want it to be read into perl as the Unicode characters, not as a stream of bytes). Similarly, you may wish to use Encode or similar to turn a string into the correct encoding for your application to use.

Unicode specifies a number of different characters that should be considered as line endings: not just U+000A and U+000D, but also U+0085 and several others. However, to keep this module compatible with perl versions before 5.8 this module splits data into lines on the sequence "\x0D\x0A" or on the regular expression "/[\n\r]/", and does not split on any of the other characters given in the Unicode standard.
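The splitting rule just described can be sketched in a few lines of plain Perl. This illustrates the stated rule and is not taken from the module's source; split_lines is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: split on CRLF first, then on a lone \n or \r.
sub split_lines {
    my ($data) = @_;
    return split /\x0D\x0A|[\n\r]/, $data;
}

# Mixed line endings: \n, \r\n and \r are all treated as line breaks
my @lines = split_lines("Basedir /usr\nDebug 0\r\nDatabase IFL1\rColour red");

# NEL (\x85) is not a line break under this rule, so this stays one line
my @nel = split_lines("Age 23\x85Name Scott");

print scalar(@lines), " lines from mixed endings, ",
      scalar(@nel), " line from NEL-separated data\n";
```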
If you want your configuration data to use any of the other line endings you must read the file yourself, change the desired line ending to "\n" and pass that string to deserialize(). Reverse the process when using serialize() and writing files. E.g. on an OS/390 machine a configuration file may be stored with "NEL" (i.e. "\x85") line endings which need to be changed when reading it on a Unix machine. This module has not been tested on EBCDIC platforms.

READING DATA
If you try to deserialize configuration data that has the wrong syntax (e.g. mis-nested blocks, or too many closing tags) a fatal error will be raised.

Unrecognized directives cause a warning, as will key/value lines appearing in a list block, or list items appearing in a hash block (see AllowEmptyValues in "CONSTRUCTOR OPTIONS"). You also get a warning if there were too few closing tags and the parser implicitly closed some for you.

WRITING DATA
The data structure you want to serialize must be a hash reference. The values may be strings, arrayrefs or hashrefs, and so on recursively. Any bad reference types cause a fatal croak().

You are only allowed to use a restricted set of characters as hash keys, i.e. the names of block elements and the key in key/value pairs of data. If your data structure has a hash key that could create bad config data a fatal error is thrown with croak(). Values in list blocks are also checked, and a fatal error is raised if the value would create bad config data.

In general you will want to use the 'Escapes' option described above. This makes it hard to produce bad configuration files.

If you want to dump out cyclic / self-referential data structures you'll need to set the 'WriteWithReferences' option, otherwise the deep recursion will be detected and the serialization will throw a fatal error.

SEE ALSO
parse_file(), write_file() and the '@include' directive load File::Slurp::WithinPolicy on demand to perform the file input/output operations.

See perlunicode for more details on perl's Unicode handling, and Encode for character recoding.

See Any::Template, and the relevant templating modules, if the 'Subs' option is true.

Although this module can read and write data structures it is not intended as an all-purpose serialization system. For that see Storable.

Unicode Newline Guidelines from http://www.unicode.org/versions/Unicode4.0.0/ch05.pdf#G10213

VERSION
$Revision: 1.36 $ on $Date: 2006/08/22 14:09:50 $ by $Author: mattheww $

AUTHOR
IF&L Software Engineers <cpan _at_ bbc _dot_ co _dot_ uk>

COPYRIGHT
(c) BBC 2006. This program is free software; you can redistribute it and/or modify it under the GNU GPL.

See the file COPYING in this distribution, or http://www.gnu.org/licenses/gpl.txt