Net::OAI::Record::NamespaceFilter(3) User Contributed Perl Documentation Net::OAI::Record::NamespaceFilter(3)

Net::OAI::Record::NamespaceFilter - general filter class based on namespace URIs

 $plug = Net::OAI::Record::NamespaceFilter->new(); # Noop

 $multihandler = Net::OAI::Record::NamespaceFilter->new(
    'http://www.openarchives.org/OAI/2.0/oai_dc/' => 'Net::OAI::Record::OAI_DC',
    'http://www.openarchives.org/OAI/2.0/provenance' => 'MySAX::ProvenanceHandler'
   );

 $saxfilter = new SOME_SAX_Filter;
 ...
 $filter = Net::OAI::Record::NamespaceFilter->new(
    '*' => $saxfilter, # '*' for any namespace
   );

 $filter = Net::OAI::Record::NamespaceFilter->new(
   '*' => sub { my $x = "";
                return XML::SAX::Writer->new(Output => \$x);
              },
  );

The filter forwards any element belonging to a namespace from this list, together with all of the element's children (regardless of their respective namespaces), to the SAX filter associated with that namespace. It can be used as either a "metadataHandler" or a "recordHandler".
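The forwarding rule can be illustrated with a small, self-contained sketch. This is not the module's implementation: the simplified event list, the function name route_events, and the handler label are made up for illustration. The sketch also assumes that children of an unlisted element are examined on their own.

```perl
use strict;
use warnings;

# Hypothetical, simplified model of the forwarding rule: an element whose
# namespace is listed (or matched by '*') is routed to the associated target
# together with all of its descendants, whatever their namespaces are.
sub route_events {
    my ($namespaces, @events) = @_;
    my (@routed, $target);
    my $depth = 0;    # nesting depth inside the subtree being forwarded

    for my $ev (@events) {
        my ($type, $ns, $name) = @$ev;
        if ($type eq 'start') {
            if ($depth == 0) {
                my $key = exists $namespaces->{$ns}  ? $ns
                        : exists $namespaces->{'*'} ? '*'
                        :                             undef;
                # Unlisted element: skip it (assumption: its children are
                # examined individually).
                next unless defined $key;
                $target = $namespaces->{$key};
            }
            $depth++;
            # An undef target suppresses the whole subtree.
            push @routed, "$target <$name>" if defined $target;
        }
        elsif ($type eq 'end' && $depth > 0) {
            push @routed, "$target </$name>" if defined $target;
            $depth--;
        }
    }
    return @routed;
}

# Made-up events: [ type, namespace URI, element name ]
my @events = (
    [ start => 'urn:example:wrapper', 'metadata' ],
    [ start => 'urn:example:a',       'dc' ],
    [ start => 'urn:example:b',       'title' ],
    [ end   => 'urn:example:b',       'title' ],
    [ end   => 'urn:example:a',       'dc' ],
    [ end   => 'urn:example:wrapper', 'metadata' ],
);
my @routed = route_events({ 'urn:example:a' => 'handlerA' }, @events);
print "$_\n" for @routed;
```

Here the title element reaches handlerA although its own namespace is not listed, because it is a child of an element that matched.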

This SAX filter takes a hashref "namespaces" as argument, with namespace URIs for keys ('*' for "any namespace"). Each value is one of the following:

undef
Matching elements and their subelements are suppressed.

If the list of namespaces is empty or "undef" is connected to the filter, it effectively acts as a plug to Net::OAI::Harvester. This might come in handy if you are planning to get at the raw result by other means, e.g. by tapping the user agent or by calling the result's xml() method:

 $plug = Net::OAI::Record::NamespaceFilter->new();
 $harvester = Net::OAI::Harvester->new(
     baseURL => ...,
     );

 $tapped_by_ua = "";
 open ($TAP, ">", \$tapped_by_ua);
 $harvester->userAgent()->add_handler(response_data => sub { 
        my($response, $ua, $h, $data) = @_;
        print $TAP $data;
     });

 $list = $harvester->listRecords( 
    metadataPrefix  => 'a_strange_one',
    recordHandler => $plug,
  );

 print $tapped_by_ua; # complete OAI response
 print $list->xml();  # should be exactly the same
    

Comment: This is quite an efficient way of not processing the XML content of OAI records received.

a class name of a SAX filter
As usual, a new instance is created for each record element of the OAI response.

  # end_document() of instances of MyWriter returns something meaningful...
  $consumer = Net::OAI::Record::NamespaceFilter->new('*'=> 'MyWriter');

  $filter = Net::OAI::Record::NamespaceFilter->new(
      '*' => $consumer
    );
 
  $list = $harvester->listAllRecords( 
     metadataPrefix  => 'oai_dc',
     recordHandler => $filter,
   );

  while( $r = $list->next() ) {
     next if $r->status() eq "deleted";
     $xmlstringref = $r->recorddata()->result('*');
     ...
  };
    

Note: The handlers are instantiated for each single OAI record in the response and will see one start_document() and end_document() event in any case (this behavior is different from that of handler class names directly specified as "metadataHandler" or "recordHandler" for a request: instances from those constructions will never see such events).
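The example above refers to a class MyWriter without showing it. The following is a minimal sketch of what such a class might look like. It is hypothetical and not part of Net::OAI: it merely collects character data and returns it from end_document(). It is written as a plain class with SAX method names so that it runs without CPAN modules. In real use one would more likely subclass XML::SAX::Base, so that events the class does not implement are handled gracefully.

```perl
use strict;
use warnings;

package MyWriter;    # hypothetical handler class, for illustration only

sub new { my ($class) = @_; return bless { text => '' }, $class }

# Seen once per record when the class name is registered with the filter.
sub start_document { my ($self) = @_; $self->{text} = ''; return }

sub start_element { return }
sub end_element   { return }

# SAX passes character data as a hash with a Data key.
sub characters {
    my ($self, $chars) = @_;
    $self->{text} .= $chars->{Data};
    return;
}

# The return value of end_document() is what result() reports for the record.
sub end_document { my ($self) = @_; return $self->{text} }

package main;

# Drive the handler by hand with the events of a single, made-up record.
my $h = MyWriter->new;
$h->start_document({});
$h->start_element({ Name => 'title', LocalName => 'title' });
$h->characters({ Data => 'An example title' });
$h->end_element({ Name => 'title', LocalName => 'title' });
my $result = $h->end_document({});
print "$result\n";
```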

a code reference for a constructor
Must return a SAX filter ready to accept a new document.

The following example returns a string serialization for each single record:

 # end_document() events will return \$x
 $constructor = sub { my $x = ""; 
                      return XML::SAX::Writer->new(Output => \$x);
                    };
 $filter = Net::OAI::Record::NamespaceFilter->new(
      '*' => $constructor
   );
 
 $list = $harvester->listRecords( 
     metadataPrefix  => 'oai_dc',
     recordHandler => $filter,
  );

 while( $r = $list->next() ) {
     $xmlstringref = $r->recorddata()->result('*');
     ...
  };
    

Comment: This example shows an approach to isolate the "true contents" of individual response records without having to provide a SAX handler class of one's own (the only additional prerequisite is XML::SAX::Writer). But what you get is a serialized XML document, which then has to be parsed again for further processing ...

an already instantiated SAX filter
As usual, in this case no "start_document()" and "end_document()" events are forwarded to the filter.

 open $fh, ">", $some_file;
 $builder = XML::SAX::Writer->new(Output => $fh);
 $builder->start_document();
 $rootEL = { Name => 'collection',
           LocalName => 'collection',
        NamespaceURI => "http://www.loc.gov/MARC21/slim",
              Prefix => "",
          Attributes => {}
              };
 $builder->start_element( $rootEL );

 # filter for the MARC21 namespace in records: forward all
 $filter = Net::OAI::Record::NamespaceFilter->new(
      'http://www.loc.gov/MARC21/slim' => $builder);

 $list = $harvester->listRecords( 
     metadataPrefix  => 'a_strange_one',
     metadataHandler => $filter,
  );
 # handle resumption tokens if more than the first
 # chunk shall be stored into $fh ....

 $builder->end_element( $rootEL );
 $builder->end_document();
 close($fh);
 # ... process contents of $some_file
    

In this example calling the "result()" method for individual records in the response will probably not be of much use.
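The comment about resumption tokens in the example above can be fleshed out roughly as follows. This is only a sketch: the helper name harvest_all_chunks is made up, and it assumes that $harvester behaves like a Net::OAI::Harvester object, that is, listRecords() accepts a resumptionToken argument and the returned list offers resumptionToken(), whose token() method yields the token string.

```perl
use strict;
use warnings;

# Hypothetical helper: request chunk after chunk until the repository
# no longer supplies a resumption token. Returns the number of requests.
sub harvest_all_chunks {
    my ($harvester, %args) = @_;
    my $chunks = 0;
    my $list   = $harvester->listRecords(%args);

    while (1) {
        $chunks++;
        my $token = $list->resumptionToken();
        my $value = $token ? $token->token() : undef;

        # A missing or empty token marks the last chunk.
        last unless defined $value && length $value;

        # Follow-up requests carry the token instead of a metadataPrefix,
        # but the same handler is passed along again.
        $list = $harvester->listRecords(
            resumptionToken => $value,
            metadataHandler => $args{metadataHandler},
        );
    }
    return $chunks;
}
```

In the example above, the call to listRecords() would then be replaced by harvest_all_chunks($harvester, metadataPrefix => 'a_strange_one', metadataHandler => $filter), placed before the root element is closed.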

Caution: Depending on the namespaces specified, even handlers that are freshly instantiated for each OAI record might be fed more than one top-level XML element.

The constructor new() creates a handler suitable as a recordHandler or metadataHandler. Its argument %namespaces has namespace URIs for keys and, for values, one of the four types described above.

If result() is called with a namespace, it returns the result of the corresponding handler, i.e. what "end_document()" returned for the record in question. Otherwise it returns a hashref of all the results, with the corresponding namespaces as keys.

Thomas Berger <ThB@gymel.com>
2016-01-24 perl v5.32.1
