NAME

XML::Pastor - Generate Perl classes with XML bindings starting from a W3C XSD Schema

SYNOPSIS

    use XML::Pastor;

    my $pastor = XML::Pastor->new();

    # Generate MULTIPLE modules, one module for each class, and put them under destination.
    $pastor->generate(
        mode         => 'offline',
        style        => 'multiple',
        schema       => '/some/path/to/schema.xsd',
        class_prefix => 'MyApp::Data::',
        destination  => '/tmp/lib/perl/',
    );

    # Generate a SINGLE module which contains all the classes and put it under destination.
    # Note that the schema may be read from a URL too.
    $pastor->generate(
        mode         => 'offline',
        style        => 'single',
        schema       => 'http://some/url/to/schema.xsd',
        class_prefix => 'MyApp::Data::',
        module       => 'Module',
        destination  => '/tmp/lib/perl/',
    );

    # Generate classes in MEMORY, and EVALUATE the generated code on the fly.
    # (Run Time code generation)
    $pastor->generate(
        mode         => 'eval',
        schema       => '/some/path/to/schema.xsd',
        class_prefix => 'MyApp::Data::',
    );

    # Same thing, with a maximum of DEBUG output on STDERR
    $pastor->generate(
        mode         => 'eval',
        schema       => '/some/path/to/schema.xsd',
        class_prefix => 'MyApp::Data::',
        verbose      => 9,
    );

And somewhere in another place of the code ... (assuming a global XML element 'country' existed in your schema and hence has been generated by Pastor).

    # This is the preferred way of getting at the class names of XML elements and types (since v1.0.3)
    my $class = MyApp::Data::Pastor::Meta->Model->xml_item_class('country');

    # Or, with a namespace URI, in case there are multiple namespaces in the model.
    $class = MyApp::Data::Pastor::Meta->Model->xml_item_class('country', 'http://www.example.com/country');

    my $country = $class->from_xml_file('/some/path/to/country.xml');   # retrieve from a file
    $country = $class->from_xml_url('http://some/url/to/country.xml');  # or from a URL
    $country = $class->from_xml_fh($fh);                                # or from a file handle
    $country = $class->from_xml_dom($dom);                              # or from DOM (an XML::LibXML::Node or XML::LibXML::Document)
    $country = $class->from_xml($resource);                             # or from any of the above.
    # Handy if you don't know the resource.

    # or from an XML string (Note the alternate way of using the class name directly)
    $country = MyApp::Data::country->from_xml_string(<<'EOF');
    <?xml version="1.0" encoding="UTF-8"?>
    <country xmlns="http://www.example.com/country"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.example.com/country"
             code="FR">
      <name>France</name>
      <population date="2000-01-01" figure="60000000"/>
      <currency code="EUR" name="Euro"/>
      <city code="AVA">
        <name>Ambrières-les-Vallées</name>
      </city>
      <city code="BCX">
        <name>Beire-le-Châtel</name>
      </city>
      <city code="LYO">
        <name>Lyon</name>
      </city>
      <city code="NCE">
        <name>Nice</name>
      </city>
      <city code="PAR">
        <name>Paris</name>
      </city>
    </country>
    EOF

    # or if you don't know if you have a file, URL, FH, or string
    $country = MyApp::Data::country->from_xml('http://some/url/to/country.xml');

    # Now you can manipulate your country object.
    print $country->name;               # prints "France"
    print $country->currency->_code;    # prints "EUR"
    print $country->city->[0]->name;    # prints "Ambrières-les-Vallées"
    print $country->city->name;         # prints the same thing, i.e. "Ambrières-les-Vallées"
                                        # Note the ABSENCE of array indexing.
                                        # You don't have to worry about multiplicity!

    # Let's make some changes
    $country->_code('fr');      # Change the 'code' attribute. Notice the underscore prefix on the accessor.
    $country->code('fr');       # Same thing, but risky in case of attribute name collision with a child
                                # element name. It's there for backward compatibility.
    $country->name('FRANCE');

    # Let's access the cities as a hash keyed on city code.
    my $city_h = $country->city->hash('_code');                 # This will hash the node array on the 'code' attribute
    $city_h = $country->city->hash(sub { shift->_code(); });    # This will do the same thing with a CODE reference.

    print $city_h->{'NCE'}->name;   # prints "Nice".
    # Let's add a city
    my $city_class = $country->xml_field_class('city');
    my $city = $city_class->new();
    $city->_code('MRS');
    $city->name('Marseille');

    push @{$country->city}, $city;

    print $country->city->[5]->name;    # prints "Marseille"

    # Time to validate our XML
    $country->xml_validate();           # This one will DIE on failure

    if ($country->is_xml_valid()) {     # This one will not die.
        print "ok\n";
    } else {
        print "Validation error : $@\n";    # Note that $@ contains the error message
    }

    # Time to write the object back to XML
    $country->to_xml_file('some/path/to/country.xml');          # To a file
    $country->to_xml_url('http://some/url/to/country.xml');     # To a URL
    $country->to_xml_fh($fh);                                   # To a FILE HANDLE
    $country->to_xml($resource);                                # To any of the above. Handy if we don't know ahead of time.

    my $dom  = $country->to_xml_dom();              # To a DOM Node (XML::LibXML::Node)
    my $doc  = $country->to_xml_dom_document();     # To a DOM Document (XML::LibXML::Document)
    my $xml  = $country->to_xml_string();           # To a string
    my $frag = $country->to_xml_fragment();         # Same thing without the <?xml version="1.0"?> part

By the way, for those who are interested in the data structure, here is a sample DUMP of what '$country' might look like. However, don't count on anything but attribute and element names. Anything else may change. You have been warned.
    print Dumper($country);     # actually with Sortkeys(1)

    # ---- Prints the following DUMP

    $VAR1 = bless( {
      '._nodeName_' => 'country',
      '_code' => bless( { 'value' => 'FR' }, 'XML::Pastor::Builtin::string' ),
      'city' => bless( [
        bless( {
          '._nodeName_' => 'city',
          '_code' => bless( { 'value' => 'AVA' }, 'XML::Pastor::Test::Type::Code' ),
          'name' => bless( { 'value' => "Ambri\x{e8}res-les-Vall\x{e9}es" }, 'XML::Pastor::Builtin::string' )
        }, 'XML::Pastor::Test::Type::City' ),
        bless( {
          '._nodeName_' => 'city',
          '_code' => bless( { 'value' => 'BCX' }, 'XML::Pastor::Test::Type::Code' ),
          'name' => bless( { 'value' => "Beire-le-Ch\x{e2}tel" }, 'XML::Pastor::Builtin::string' )
        }, 'XML::Pastor::Test::Type::City' ),
        bless( {
          '._nodeName_' => 'city',
          '_code' => bless( { 'value' => 'LYO' }, 'XML::Pastor::Test::Type::Code' ),
          'name' => bless( { 'value' => 'Lyon' }, 'XML::Pastor::Builtin::string' )
        }, 'XML::Pastor::Test::Type::City' ),
        bless( {
          '._nodeName_' => 'city',
          '_code' => bless( { 'value' => 'NCE' }, 'XML::Pastor::Test::Type::Code' ),
          'name' => bless( { 'value' => 'Nice' }, 'XML::Pastor::Builtin::string' )
        }, 'XML::Pastor::Test::Type::City' ),
        bless( {
          '._nodeName_' => 'city',
          '_code' => bless( { 'value' => 'PAR' }, 'XML::Pastor::Test::Type::Code' ),
          'name' => bless( { 'value' => 'Paris' }, 'XML::Pastor::Builtin::string' )
        }, 'XML::Pastor::Test::Type::City' )
      ], 'XML::Pastor::NodeArray' ),
      'currency' => bless( {
        '._nodeName_' => 'currency',
        '_code' => bless( { 'value' => 'EUR' }, 'XML::Pastor::Builtin::string' ),
        '_name' => bless( { 'value' => 'Euro' }, 'XML::Pastor::Builtin::string' )
      }, 'XML::Pastor::Test::Type::Country_currency' ),
      'name' => bless( { 'value' => 'France' }, 'XML::Pastor::Builtin::string' ),
      'population' => bless( {
        '._nodeName_' => 'population',
        '_date' => bless( { 'value' => '2000-01-01' }, 'XML::Pastor::Builtin::date' ),
        '_figure' => bless( { 'value' => '60000000' }, 'XML::Pastor::Builtin::nonNegativeInteger' )
      }, 'XML::Pastor::Test::Type::Population' )
    },
    'XML::Pastor::Test::country' );

DESCRIPTION

Java had CASTOR, and now Perl has XML::Pastor!

If you know what Castor does in the Java world, then XML::Pastor should be familiar to you. If you have a W3C XSD schema, you can generate Perl classes with round-trip XML bindings.

Whereas Castor is limited to offline code generation, XML::Pastor is able to generate Perl classes either offline or at run-time starting from a W3C XSD Schema. The generated classes correspond to the global elements, complex types, and simple type declarations in the schema. The generated classes have full XML binding, meaning objects belonging to them can be read from and written to XML. Accessor methods for attributes and child elements are generated automatically. Furthermore, it is possible to validate the objects of generated classes against the original schema, although the schema is typically no longer accessible.

XML::Pastor defines just one method, 'generate()', but the classes it generates define many methods, which are documented in XML::Pastor::ComplexType and XML::Pastor::SimpleType, from which all generated classes descend.

In 'offline' mode, it is possible to generate a single module with all the generated classes, or multiple modules, one for each class. The typical use of the offline mode is during a 'make' process, where you have a set of XSD schemas and you generate your modules to be later installed by 'make install'. This is very similar to Java Castor's behaviour. This way your XSD schemas don't have to be accessible during run-time and you don't have a performance penalty.

Perl philosophy dictates, however, that There Is More Than One Way To Do It. In 'eval' (run-time) mode, the XSD schema is processed at run-time, giving much more flexibility to the user. This added flexibility has a price, namely a performance penalty and the fact that the XSD schema needs to be accessible at run-time.
Note that the performance penalty applies only to the code generation (pastorize) phase; the generated classes perform the same as if they were generated offline.

There is a command line utility called pastorize that can help generate classes offline. See the documentation of pastorize for more details.

SCOPE AND WARNING

XML::Pastor is quite good for so-called 'data XML', that is, XML without mixed markup. It is NOT suitable for parsing and manipulating a markup language such as XHTML. 'Mixed markup' means that an element can contain both textual data and child elements mixed together. XML::Pastor does not support that.

XML::Pastor is NOT a recommended way of treating HUGE XML documents. The exact definition of HUGE varies; it usually means a document large enough to cause paging into virtual memory. If you find yourself doing that, you might be better off with XML::Twig, which lets you selectively parse chunks of a tree. Or better yet, just do SAX processing.

Note that things are not that bad with XML::Pastor: the memory it uses is not much more than that of XML::Simple or a DOM for the same document.

METHODS

new() (CONSTRUCTOR)

The new() constructor method instantiates a new XML::Pastor object.

    my $pastor = XML::Pastor->new();

This is currently unnecessary, as the only method ('generate') is a class method. However, it is highly recommended to use it and call 'generate' on an object (rather than the class), as in the future 'generate' may no longer be a class method.

version (CLASS METHOD)

Returns the current VERSION of XML::Pastor.

generate(%options)

Currently a CLASS METHOD, but may change to be an OBJECT METHOD in the future. It works when called on an OBJECT too at this time.

This method is the heart of the module. It accepts a schema file name or URL as input (among some other parameters), parses the schema(s) given by the "schema" parameter, and then proceeds to code generation.
The generated code will be written to disk (mode=>"offline") or evaluated at run-time (mode=>"eval") depending on the value of the "mode" parameter.

In "offline" mode, the generated classes will either all be put in one "single" big code block, or in "multiple" module files (one for each class), depending on the "style" parameter. Again in "offline" mode, the generated modules will be written to disk under the directory prefix given by the "destination" parameter.

In any case, the names of the generated classes will be prefixed by the string given by the "class_prefix" parameter. It is possible to indicate common ancestors for generated classes via the "complex_isa" and "simple_isa" parameters.

This method expects the following parameters:
::Pastor::Meta CLASS

Suppose you use XML::Pastor for code generation with a class prefix of MyApp::Data. Then, XML::Pastor will also generate a class that enables you to access meta information about the generated code under 'MyApp::Data::Pastor::Meta'.

Currently, the only information you can access is the 'Model' that was used to generate code. 'Model' is class data that refers to an entire schema model object (of type XML::Pastor::Schema::Model). With the help of the generated 'meta' class, you can access the Model, which in turn enables you to call methods such as 'xml_item_class()', which helps you determine the generated Perl class of a given global element or type in the schema.

Example:

    $pastor->generate(
        mode         => 'eval',
        schema       => '/some/path/to/schema.xsd',
        class_prefix => 'MyApp::Data::',
    );

    # Access the schema model
    my $model = MyApp::Data::Pastor::Meta->Model;   # Note that this is $class_prefix . 'Pastor::Meta'

    # Get the class name for element 'country'
    my $class = $model->xml_item_class('country');

    # OR, with a namespace URI
    $class = $model->xml_item_class('country', 'http://www.example.com/country');

    # Now read the object from a file
    my $country = $class->from_xml_file('/some/path/to/country.xml');

SCHEMA SUPPORT

Version 1.0 of the W3C XSD schema (2001) is supported almost in full, with some exceptions (see "BUGS & CAVEATS").

SUPPORTED

Complex and simple types, global elements, groups, attributes, and attribute groups are supported. Type declarations can be either global or local.

Complex type derivation by extension and simple type derivation by restriction are supported.

All the basic W3C builtin types are supported.

Unions and lists are supported.

Most of the restriction facets for simple types are supported (length, minLength, maxLength, pattern, enumeration, minInclusive, maxInclusive, minExclusive, maxExclusive, totalDigits, fractionDigits).
Schema inclusion (include) and redefinition (redefine) are supported, although not much testing was done for 'redefine'. Schema 'import' is now supported (since version 0.6.3).

ComplexTypes with simpleContent (simple-type elements, possibly with attributes) are supported (since v0.6.0).

PARTIALLY SUPPORTED

Namespaces are quite well supported now (since version 0.6.3). Multiple namespaces are OK.

However, local name collisions across multiple namespaces yield unpredictable results. For example, if you have two child elements with the same local name but with different namespaces, the result is unpredictable.

NOT SUPPORTED

Elements with 'mixed' content are NOT supported.

Substitution groups are NOT supported.

'any' and 'anyAttribute' are NOT supported.

HOW IT WORKS

The source code of the "generate()" method looks like this:

    sub generate {
        my $self = shift;

        my $parser = XML::Pastor::Schema::Parser->new();
        my $model  = $parser->parse(@_);

        $model->resolve(@_);

        my $generator = XML::Pastor::Generator->new();

        my $result = $generator->generate(@_, model => $model);

        return $result;
    }

At code generation time, XML::Pastor first parses the schema(s) into a schema model (XML::Pastor::Schema::Model). The model contains all the schema information in Perl data structures. All the global elements, types, attributes, groups, and attribute groups are put into this model.

Then, the model is 'resolved', i.e. the references ('ref') are resolved, class names are determined, and so on. Then comes the code generation stage, where your classes are generated according to the given options. In offline mode, this phase writes the generated code out to modules on disk. Otherwise, it can 'eval' the generated code for you.

The generated classes will contain class data named 'XmlSchemaType' (thanks to Class::Data::Inheritable), which will contain all the schema model information that corresponds to this type.
For a complex type, it will contain information about child elements and attributes. For a simple type, it will contain the restriction facets that may exist, and so on.

For complex types, the generated classes will also have accessors for the attributes and child elements of that type (thanks to Class::Accessor). However, you can also use direct hash access, as the objects are just blessed hash references. The fields in the hash correspond to attributes and child elements of the complex type. You can also store additional non-XML data in these objects. Such fields are silently ignored during validation and XML serialization. This way, your objects can have state information that is not stored in XML. Just make sure the names of these fields do not coincide with those of XML attributes and child elements.

The inheritance of classes is also managed by XML::Pastor for you. Complex types that are derived by extension will automatically be descendants of the base class. The same applies to the simple types derived by restriction. Global elements will always be descendants of some type, which may sometimes be implicitly defined. Global elements will have an added ancestor XML::Pastor::Element and will also contain an extra class data accessor "XmlSchemaElement", which will contain schema information about the model. This class data is currently used mainly to get at the name of the element when an object of this class is stored in XML (as ComplexTypes don't have an element name).

Then you use the generated modules. If the generation was offline, you actually need a 'use' statement. If it was an 'eval', you can start using your generated classes immediately. At this time, you can call many methods on the generated classes that enable you to create, retrieve, and save an object from/to XML. There are also methods that enable you to validate these objects against schema information.
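The remarks above about blessed hash references, direct hash access, and extra non-XML fields can be illustrated with a small hand-written stand-in. This is only a sketch: the package name My::Sketch::City and the field 'last_checked' are hypothetical, the class is NOT produced by XML::Pastor, and it stores plain strings where a real generated class would store typed value objects (as the DUMP in the SYNOPSIS shows).

```perl
use strict;
use warnings;

# Hypothetical stand-in for a generated complex-type class.
# Written by hand for illustration; XML::Pastor did NOT generate it.
package My::Sketch::City;

sub new {
    my ($class, %fields) = @_;
    my $self = { %fields };
    return bless $self, $class;
}

# Accessor for the child element 'name' (getter and setter in one).
sub name {
    my $self = shift;
    $self->{name} = shift if @_;
    return $self->{name};
}

# Accessor for the attribute 'code': underscore prefix on both
# the method name and the hash key.
sub _code {
    my $self = shift;
    $self->{_code} = shift if @_;
    return $self->{_code};
}

package main;

my $city = My::Sketch::City->new(name => 'Nice', _code => 'NCE');

# The accessor and direct hash access reach the same field.
print $city->name, "\n";
print $city->{name}, "\n";

# An extra field that is neither an XML attribute nor a child element.
# Assumption taken from the text above: a real generated class would
# ignore such a field during validation and XML serialization.
$city->{last_checked} = time;
```

Keeping extra state under a key that cannot be mistaken for an attribute (underscore prefix) or a child element name is one way to follow the advice about avoiding name clashes.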
Furthermore, you can call the accessors that were automatically created for you at class generation to get at the fields of complex objects. Since all the schema information is saved as class data, the schema is no longer needed at run-time.

NAMING CONVENTIONS FOR GENERATED CLASSES

The generated classes will all be prefixed by the string given by the "class_prefix" parameter. The rest of this section assumes that "class_prefix" is "MyApp::Data".

Classes that correspond to global elements keep the name of the element. For example, if there is an element called 'country' in the schema, the corresponding class will have the name 'MyApp::Data::country'. Note that no change in case occurs.

Classes that correspond to global complex and simple types are put under the 'Type' subtree. For example, if there is a complex type called 'City' in the XSD schema, the corresponding class will be called 'MyApp::Data::Type::City'. Note that no change in case occurs.

Implicit types (that is, types that are defined inline in the schema) have auto-generated names within the 'Type' subtree. For example, if the 'population' element within 'City' is defined by an implicit type, its corresponding class will be 'MyApp::Data::Type::City_population'.

Sometimes implicit types need more to disambiguate their names. In that case, an auto-incremented sequence is used to generate the class names.

In any case, do not count on the names of the classes for implicit types. The naming convention for those may change. In other words, do not reference these classes by their names in your program. You have been warned.

SUGGESTED NAMING CONVENTIONS FOR XML TYPES, ELEMENTS AND ATTRIBUTES IN W3C SCHEMAS

Sometimes you will be forced to use a W3C schema defined by someone else. In that case, you will not have a choice for the names of types, elements, and attributes defined in the schema.

But most often, you will be the one who defines the W3C schema itself.
So you will have full power over the names within.

As mentioned earlier, XML::Pastor generates accessor methods for the child elements and attributes of each class. The attribute names are prefixed by an underscore in the hash. Attribute accessors have an underscore prefix, too. However, accessor aliases are generated without the underscore prefix for those attributes whose names don't clash with child element names.

Since there are some utility methods defined under XML::Pastor::ComplexType and XML::Pastor::SimpleType, which are the ancestors of all the classes generated from your schema, there is a risk of name collisions. Below is a list of suggestions that will ensure that there are no name collisions within your schema and with the defined methods.
You are free to name global groups and attribute groups to your liking.

BUGS & CAVEATS

There are no known bugs at this time, but this doesn't mean there aren't any. Note that, although some testing was done prior to releasing the module, this should still be considered alpha code. So use it at your own risk.

There are known limitations however:
Note that there may be other bugs or limitations that the author is not aware of.

AUTHOR

Ayhan Ulusoy <dev(at)ulusoy(dot)name>

COPYRIGHT

Copyright (C) 2006-2008 Ayhan Ulusoy. All Rights Reserved.

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

DISCLAIMER

BECAUSE THIS SOFTWARE IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE SOFTWARE, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE SOFTWARE "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE SOFTWARE IS WITH YOU. SHOULD THE SOFTWARE PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR, OR CORRECTION.

IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE SOFTWARE AS PERMITTED BY THE ABOVE LICENCE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE SOFTWARE (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE SOFTWARE TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

SEE ALSO

See also pastorize, XML::Pastor::ComplexType, XML::Pastor::SimpleType.

If you are curious about the implementation, see also XML::Pastor::Schema::Parser, XML::Pastor::Schema::Model, XML::Pastor::Generator.