|
NAME

HTML::Parser::Simple::Reporter - A sub-class of HTML::Parser::Simple

Synopsis

    #!/usr/bin/env perl

    use strict;
    use warnings;

    use HTML::Parser::Simple::Reporter;

    # -------------------------
    # Methods 1 and 2 are alternatives. Use one or the other.

    # Method 1: Name the input file in the call to new().

    my($p) = HTML::Parser::Simple::Reporter -> new(input_file => 'data/s.1.html');
    my($s) = $p -> traverse_file;

    print "$_\n" for @$s;

    # Method 2: Name the input file in the call to traverse_file().

    my($p) = HTML::Parser::Simple::Reporter -> new;
    my($s) = $p -> traverse_file(input_file => 'data/s.1.html');

    print "$_\n" for @$s;

See scripts/traverse.file.pl.

Description

"HTML::Parser::Simple::Reporter" is a pure Perl module.

It is a sub-class of HTML::Parser::Simple. Specifically, this module overrides the method "traverse($node)" in HTML::Parser::Simple, to demonstrate a different way of formatting the output.

It parses HTML V 4 files, and generates a tree of nodes, with 1 node per HTML tag.

The data associated with each node is documented in the "FAQ" in HTML::Parser::Simple.

See also HTML::Parser::Simple and HTML::Parser::Simple::Attributes.

Distributions

This module is available as a Unix-style distro (*.tgz).

See http://savage.net.au/Perl-modules.html for details.

See http://savage.net.au/Perl-modules/html/installing-a-module.html for help on unpacking and installing.

Constructor and initialization

new(...) returns an object of type "HTML::Parser::Simple::Reporter".

This is the class constructor.

Usage: "HTML::Parser::Simple::Reporter -> new()".

This method takes a hashref of options.

Call "new()" as "new({option_1 => value_1, option_2 => value_2, ...})".

Available options (each one of which is also a method):
But since this class is a sub-class of HTML::Parser::Simple, it shares all the options to "new()" documented in that class: "Constructor and initialization" in HTML::Parser::Simple.

Methods

This module is a sub-class of HTML::Parser::Simple, and inherits all its methods.

Further, it overrides the method "traverse($node)" in HTML::Parser::Simple.

traverse($node, $output, $depth)

Returns $output as an arrayref of strings.

Traverses the tree built by calling "parse($html)" in HTML::Parser::Simple.

Parameters:
Lastly, note that this method ignores the root of the tree, and hence ignores the DOCTYPE, which is stored as an attribute of the root.

traverse_file($input_file_name)

Returns an arrayref of formatted text generated from the nodes in the tree built by calling "parse($html)" in HTML::Parser::Simple.

Traverses the given file, or the file named in "new(input_file => $name)", or the file named in "input_file($name)".

Basically it does this (recalling that this class sub-classes HTML::Parser::Simple):

    # Read file and store contents in $html.

    $self -> parse($html);

    my($output) = [];

    $self -> traverse($self -> root, $output, 0);

    return $output;

However, since this class has overridden the method "traverse($node)" in HTML::Parser::Simple, the output is not written anywhere, but rather is stored in an arrayref, and returned as the result of this method.

Note: The parameter passed in to "traverse_file($input_file_name)" takes precedence over the input_file parameter passed in to "new()", and over the internal value set with "input_file($in_file_name)".

Lastly, the parameter passed in to "traverse_file($input_file_name)" is used to update the internal value set with the input_file parameter passed in to "new()", or set with a call to "input_file($in_file_name)".

See the "Synopsis" for sample code. See also scripts/traverse.file.pl.

FAQ

See "FAQ" in HTML::Parser::Simple.

Author

"HTML::Parser::Simple" was written by Ron Savage <ron@savage.net.au> in 2009.

Home page: <http://savage.net.au/index.html>.

Copyright

Australian copyright (c) 2009 Ron Savage.

All Programs of mine are 'OSI Certified Open Source Software'; you can redistribute them and/or modify them under the terms of The Artistic License, a copy of which is available at: http://www.opensource.org/licenses/index.html
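Example

The override pattern described under "traverse($node, $output, $depth)" can be illustrated with a minimal, self-contained sketch. It does not use HTML::Parser::Simple at all: the packages My::Base and My::Reporter, the hand-built tree, and the node keys "name" and "children" are hypothetical stand-ins, chosen only to show a sub-class whose traverse() skips the root and accumulates lines in an arrayref instead of printing them. The real node structure is the one documented in the "FAQ" in HTML::Parser::Simple.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

# Hypothetical stand-in for the parent class. It only stores a hand-built
# tree; the real parent class builds its tree by parsing HTML.
package My::Base;

sub new  { my($class, %arg) = @_; return bless {root => $arg{root}}, $class }
sub root { return $_[0]{root} }

# Hypothetical stand-in for the sub-class. Its traverse() collects one
# string per node in $output rather than writing anything.
package My::Reporter;

our @ISA = ('My::Base');

sub traverse {
    my($self, $node, $output, $depth) = @_;

    # Ignore the root itself (depth 0) and report only its descendants,
    # indenting each name by 2 spaces per level below the root's children.
    push @$output, ('  ' x ($depth - 1)) . $node->{name} if $depth > 0;

    $self->traverse($_, $output, $depth + 1) for @{$node->{children}};

    return $output;
}

package main;

# A tiny tree built by hand, standing in for the result of a parse.
my $tree = {
    name     => 'root',
    children => [
        {name => 'html', children => [
            {name => 'head', children => []},
            {name => 'body', children => [{name => 'p', children => []}]},
        ]},
    ],
};

my $reporter = My::Reporter->new(root => $tree);
my $lines    = $reporter->traverse($reporter->root, [], 0);

print "$_\n" for @$lines;
```

Passing the arrayref down through the recursion, as in the snippet under "traverse_file($input_file_name)", leaves the caller free to print, filter, or test the collected lines.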
|