AI::Categorizer::Learner - Abstract Machine Learner Class
use AI::Categorizer::Learner::NaiveBayes; # Or other subclass
# Here $k is an AI::Categorizer::KnowledgeSet object
my $nb = AI::Categorizer::Learner::NaiveBayes->new(...parameters...);
$nb->train(knowledge_set => $k);
$nb->save_state('filename');
# ... time passes ...
$nb = AI::Categorizer::Learner::NaiveBayes->restore_state('filename');
my $c = AI::Categorizer::Collection::Files->new( path => ... );
while (my $document = $c->next) {
  my $hypothesis = $nb->categorize($document);
  print "Best assigned category: ", $hypothesis->best_category, "\n";
  print "All assigned categories: ", join(', ', $hypothesis->categories), "\n";
}
The "AI::Categorizer::Learner" class is an abstract class, so you will
never use it directly in your code. Instead, you will use a subclass
such as "AI::Categorizer::Learner::NaiveBayes", which implements an
actual machine learning algorithm.
The interface shared by all Learner subclasses is documented here.
- new()
- Creates a new Learner and returns it. Accepts the following
parameters:
- knowledge_set
- The Knowledge Set that "train()" uses by default when it is called
without a "knowledge_set" parameter.
- verbose
- If true, the Learner will display some diagnostic output while training
and categorizing documents.
- train()
- train(knowledge_set => $k)
- Trains the categorizer, preparing it for later use in categorizing
documents. The "knowledge_set" parameter must provide an object of the
class "AI::Categorizer::KnowledgeSet" (or a subclass thereof),
populated with many documents and categories. See
AI::Categorizer::KnowledgeSet for the details of how to create such an
object. If you provided a "knowledge_set" parameter to "new()",
specifying one here will override it; if you omit it here, the one
given to "new()" is used.
- categorize($document)
- Returns an "AI::Categorizer::Hypothesis"
object representing the categorizer's "best guess" about which
categories the given document should be assigned to. See
AI::Categorizer::Hypothesis for more details on how to use this
object.
- categorize_collection(collection => $collection)
- Categorizes every document in a collection and returns an Experiment
object representing the results. Note that the Experiment does not record
which categories were assigned to each document; it holds only a
statistical summary of the results.
- knowledge_set()
- Gets/sets the internal "knowledge_set"
member. Note that since the knowledge set may be enormous, some Learners
may throw away their knowledge set after training or after restoring state
from a file.
- $learner->save_state($path)
- Saves the Learner for later use. This method is inherited from
"AI::Categorizer::Storable".
- $class->restore_state($path)
- Returns a Learner saved in a file with
"save_state()". This method is inherited
from "AI::Categorizer::Storable".
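The methods above can be combined as in the following minimal sketch,
which trains on a default knowledge set, categorizes one document, and
round-trips the Learner through "save_state()" and "restore_state()".
Several details are assumptions that do not come from this page: the
"name" parameter and "make_document()" method of
"AI::Categorizer::KnowledgeSet", and the "name" and "content"
parameters of "AI::Categorizer::Document", are recalled from those
classes' own documentation and should be checked against your installed
version. The training texts and the state path are made up for
illustration.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);
use AI::Categorizer::KnowledgeSet;
use AI::Categorizer::Document;
use AI::Categorizer::Learner::NaiveBayes;

# Build a tiny in-memory knowledge set. The 'name' parameter and
# make_document() are assumptions taken from the KnowledgeSet docs.
my $k = AI::Categorizer::KnowledgeSet->new(name => 'demo');

# Hypothetical training data, far too small for real use.
my %training = (
  doc1 => { categories => ['farming'],
            content    => 'Sheep are very valuable in farming.' },
  doc2 => { categories => ['farming'],
            content    => 'Farming requires many kinds of animals.' },
  doc3 => { categories => ['vampire'],
            content    => 'Vampires drink blood and avoid sunlight.' },
  doc4 => { categories => ['vampire'],
            content    => 'Vampires may kill sheep at night.' },
);
for my $name (sort keys %training) {
  $k->make_document(name => $name, %{ $training{$name} });
}

# A knowledge_set passed to new() becomes the default for train().
my $nb = AI::Categorizer::Learner::NaiveBayes->new(
  knowledge_set => $k,
  verbose       => 0,
);
$nb->train;   # same effect here as $nb->train(knowledge_set => $k)

# Categorize a single document (Document parameters are assumptions).
my $doc = AI::Categorizer::Document->new(
  name    => 'query',
  content => 'I would like to begin farming sheep.',
);
my $hypothesis = $nb->categorize($doc);
my $best = $hypothesis->best_category;
print "Best assigned category: ", (defined $best ? $best : '(none)'), "\n";
print "All assigned categories: ", join(', ', $hypothesis->categories), "\n";

# Save the trained Learner and load it back. The path is a fresh name
# inside a temporary directory, so nothing existing is overwritten.
my $state = File::Spec->catfile(tempdir(CLEANUP => 1), 'learner-state');
$nb->save_state($state);
my $restored = AI::Categorizer::Learner::NaiveBayes->restore_state($state);
my $again = $restored->categorize($doc);
```

Because some Learners discard their knowledge set after training or
after restoring state, the sketch calls only "categorize()" on the
restored object and does not rely on "knowledge_set()" returning
anything.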
Ken Williams, ken@mathforum.org
Copyright 2000-2003 Ken Williams. All rights reserved.
This library is free software; you can redistribute it and/or
modify it under the same terms as Perl itself.