W3C::LogValidator - The W3C Log Validator - Quality-focused Web
    Server log processing engine
Checks quality/validity of most popular content on a Web
  server
"W3C::LogValidator" is the main
    module for the W3C Log Validator, which combines a Web server log analysis
    and statistics tool with a Web content quality checker.
"W3C::LogValidator" can
    batch-process a number of documents through a number of quality-focused
    checks, such as HTML or CSS validation, or checking for broken links. It can
    take a number of different inputs, ranging from a simple list of URIs to log
    files from various Web servers. Because it orders the results by
    the number of times a document appears in the file or logs, it is, in
    practice, a useful way to spot the most popular documents that need
  work.
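The ordering idea described above can be sketched in a few lines of Perl. This is only an illustration under stated assumptions, not the module's own code: the function name rank_by_hits and the sample URIs are hypothetical, and the tie-breaking rule (alphabetical) is a choice made here so that the output is deterministic.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of W3C::LogValidator): count how often
# each URI appears and return [uri, hits] pairs, most popular first.
sub rank_by_hits {
    my @uris = @_;
    my %hits;
    $hits{$_}++ for @uris;
    # Descending hit count; ties are broken alphabetically (an assumption).
    return map { [ $_, $hits{$_} ] }
           sort { $hits{$b} <=> $hits{$a} or $a cmp $b } keys %hits;
}

# Made-up sample input standing in for a flat list of requested URIs.
my @requests = (
    'http://www.example.org/',
    'http://www.example.org/about.html',
    'http://www.example.org/',
    'http://www.example.org/news.html',
    'http://www.example.org/',
    'http://www.example.org/about.html',
);

printf "%3d  %s\n", $_->[1], $_->[0] for rank_by_hits(@requests);
```

Running the checks on the top of such a list first is what lets the most visited documents get fixed first.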
The Perl script logprocess.pl, bundled in the W3C::LogValidator
    distribution, is a simple way to use the features of
    "W3C::LogValidator". Developers can also
    use "W3C::LogValidator" as a
    Perl module to build applications.
The homepage for the Log Validator is at:
    http://www.w3.org/QA/Tools/LogValidator/
The simplest way to use it is to edit the sample configuration file
    (samples/logprocess.conf) and run the bundled logprocess.pl script with
    this configuration file, like so:
    logprocess.pl -f /path/to/logprocess.conf
The basic task of the
    "W3C::LogValidator" module is to parse a
    configuration file, passed as an argument to the constructor, and to
    process the relevant logs:
    use W3C::LogValidator;
    my $logprocessor = W3C::LogValidator->new("sample.conf");
    $logprocessor->process;
Alternatively, it will use a default configuration and try to
    process Web server logs in "well known locations":
    my $logprocessor = W3C::LogValidator->new;
    $logprocessor->process;
  - $processor = W3C::LogValidator->new
 
  - Constructs a new "W3C::LogValidator"
      processor. You may pass a configuration file name, as well as a reference
      to a hash of attribute-value pairs, as parameters to the constructor.
    
For example, for mail output:
    
      my %conf = (
          "UseOutputModule" => "W3C::LogValidator::Output::Mail",
          "ServerAdmin" => 'webmaster@example.com',
          "verbose" => "3"
      );
      my $processor = W3C::LogValidator->new("path/to/config.conf", \%conf);
    
    Or, for HTML output:
    
      my %conf = (
          "UseOutputModule" => "W3C::LogValidator::Output::HTML",
          "OutputTo" => 'path/to/file.html',
          "verbose" => "0"
      );
      my $processor = W3C::LogValidator->new("path/to/config.conf", \%conf);
    
    If given the path to a configuration file,
        new() will call the W3C::LogValidator::Config
        module to get its configuration variables. Otherwise, a default set of
        values is used.
   
  - $processor->process
 
  - Do-it-all method: reads the configuration file (if any), parses log
      files, runs them through processing modules, and sends the result to the
      output module.
 
  - $processor->find_remote_addr
 
  - Given a log record and the type of the log (common log format, flat list
      of URIs, etc.), extracts the remote host or IP address
   
  - $processor->config_module
 
  - Creates a configuration hash for a specific module, adding module-specific
      configuration variables, overriding if necessary
 
  - $processor->use_modules
 
  - Run the data parsed off the log files through the various processing
      (validation) modules specified by UseValidationModule in the
      configuration.
 
  - $processor->read_logfiles
 
  - Loops through and parses all log files specified in the configuration
 
  - $processor->read_logfile('path/to.file')
 
  - Extracts URIs and their number of hits from a given log file, and feeds
      them to the processor's URI/Hits table
 
  - $processor->find_uri
 
  - Given a log record and the type of the log (common log format, flat list
      of URIs, etc.), extracts the URI
 
  - $processor->remove_duplicates
 
  - Given a URI, removes "directory index" suffixes such as
      index.html, etc., so that http://foobar/ and http://foobar/index.html are
      counted as one resource
 
  - $processor->add_uri
 
  - Add a URI to the processor's URI/Hits table
 
  - $processor->sorted_uris
 
  - Returns the list of URIs in the processor's table, sorted by popularity
      (hits)
 
  - $processor->no_cgi
 
  - Tests whether a given URI contains a CGI query string
 
  - $processor->hit
 
  - Returns the number of hits for a given URI. Basically a "public"
      method accessing $hits{$uri};
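
Several of the methods listed above work on individual log records or URIs. The following standalone sketch illustrates the kind of processing they describe. It makes no claim about how the module implements them: the function names clf_fields, strip_index and has_query, the regular expressions, and the set of index file names are all assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: split a Common Log Format record into the remote
# host and the requested path. Returns an empty list if the record does
# not look like CLF.
sub clf_fields {
    my ($record) = @_;
    my ($host, $path) =
        $record =~ m{^(\S+) \S+ \S+ \[[^\]]*\] "\S+ (\S+)[^"]*"};
    return ($host, $path);
}

# Hypothetical helper: drop a trailing "directory index" file name so
# that /docs/ and /docs/index.html count as one resource. The list of
# index names here is an assumption.
sub strip_index {
    my ($uri) = @_;
    $uri =~ s{/index\.(?:html?|php)\z}{/};
    return $uri;
}

# Hypothetical helper: true (1) if the URI carries a query string.
sub has_query {
    my ($uri) = @_;
    return $uri =~ /\?/ ? 1 : 0;
}

# Made-up sample record in Common Log Format.
my $line = '192.0.2.1 - - [10/Oct/2000:13:55:36 -0700] '
         . '"GET /docs/index.html HTTP/1.0" 200 2326';

my ($host, $path) = clf_fields($line);
print "host: $host\n";
print "path: ", strip_index($path), "\n";
print "query string: ", (has_query($path) ? "yes" : "no"), "\n";
```

A flat list of URIs would need no field extraction at all, which is presumably why the methods above take the log type as a parameter.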
 
Public bug-tracking interface at
  http://www.w3.org/Bugs/Public/
Olivier Thereaux <ot@w3.org> for The World Wide Web
    Consortium
Up-to-date information on the Log Validator at:
 http://www.w3.org/QA/Tools/LogValidator/
Several articles have been written within the W3C Quality
    Assurance Interest Group on the topic of improving the quality of Web sites,
    notably by using a step-by-step approach and relying upon the Log Validator
    to help find the areas to fix first.
  - My Web site is standard! And
    yours?
 
  - Available at http://www.w3.org/QA/2002/04/Web-Quality
 
  - Web Standards Switch
 
  - or how to improve your Web site easily.
    
Available in several languages at:
        http://www.w3.org/QA/2003/03/web-kit
   
  - Making your website valid:
    a step by step guide.
 
  - Available at http://www.w3.org/QA/2002/09/Step-by-step