EXCLUDE_ROBOTS(1) User Contributed Perl Documentation EXCLUDE_ROBOTS(1)

NAME
    exclude_robot.pl - a simple filter script to filter robots out of logfiles

SYNOPSIS
    exclude_robot.pl
        -url <robot exclusions URL>
        [ -exclusions_file <exclusions file> ]
        <httpd log file>

    OR

    cat <httpd log file> | exclude_robot.pl -url <robot exclusions URL>

DESCRIPTION
    This script filters HTTP log files to exclude entries that correspond
    to known webbots, spiders, and other undesirables. The script requires
    a URL as a command-line option, which should point to a text file
    containing a linebreak-separated list of lowercase strings used to
    match bot entries. This is based on the format used by ABC
    (<http://www.abc.org.uk/exclusionss/exclude.html>).

    The script filters httpd logfile entries either from a filename
    specified on the command line or from STDIN. It writes the filtered
    entries to STDOUT.
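
    A minimal sketch of this kind of filtering is shown below. It assumes
    the LWP::Simple module is available and that the exclusions URL returns
    one lowercase substring per line; the URL in the sketch is illustrative
    only and is not part of the original script.

        #!/usr/bin/perl
        # Sketch only -- not the original exclude_robot.pl implementation.
        use strict;
        use warnings;
        use LWP::Simple qw( get );

        # Hypothetical exclusions URL; substitute the real list location.
        my $url  = 'http://www.example.com/exclude.txt';
        my $list = get( $url ) or die "cannot fetch $url\n";
        my @bots = grep { length } split /\r?\n/, $list;

        # Read log entries from a filename on the command line or from
        # STDIN, and print only those matching none of the exclusion
        # strings; excluded entries are simply dropped in this sketch.
        while ( my $entry = <> ) {
            my $lc = lc $entry;
            print $entry unless grep { index( $lc, $_ ) >= 0 } @bots;
        }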

OPTIONS
    -url <robot exclusions URL>
        Specify the URL of the file to download containing the list of
        agents to exclude. This option is REQUIRED.

    -exclusions_file <exclusions file>
        Specify a file in which to save the entries excluded from the
        logfile. This option is OPTIONAL.
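
    For example, an invocation might look like the following; the URL and
    filenames here are illustrative only and are not taken from the
    original documentation.

        exclude_robot.pl -url http://www.example.com/exclude.txt \
            -exclusions_file robot_hits.log access.log > filtered.log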

AUTHOR
    Ave Wrigley <Ave.Wrigley@itn.co.uk>

COPYRIGHT
    Copyright (c) 2001 Ave Wrigley. All rights reserved. This program is
    free software; you can redistribute it and/or modify it under the same
    terms as Perl itself.

perl v5.32.1                      2001-05-25                 EXCLUDE_ROBOTS(1)
