TITLE

Search::VectorSpace - a very basic vector-space search engine

SYNOPSIS

    use Search::VectorSpace;

    my @docs = ...;
    my $engine = Search::VectorSpace->new( docs => \@docs, threshold => .04 );
    $engine->build_index();

    while ( my $query = <> ) {
        my %results = $engine->search( $query );
        print join "\n", keys %results;
    }

DESCRIPTION

This module takes a list of documents (in English) and builds a simple in-memory search engine using a vector space model. Documents are stored as PDL objects, and after the initial indexing phase, the search should be very fast. This implementation applies a rudimentary stop list to filter out very common words, and uses a cosine measure to calculate document similarity. All documents above a user-configurable similarity threshold are returned.

METHODS
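
As a rough illustration of the stop list, cosine measure, and threshold described above, here is a minimal pure-Perl sketch. It is not the module's implementation: it uses plain hashes instead of PDL objects, does no stemming, and the stop list, the function names (term_vector, cosine, search_docs), and the sample documents are all hypothetical. Returning a hash keyed by document with the score as the value is also an assumption; the synopsis only shows that the keys are printed.

```perl
use strict;
use warnings;

# Hypothetical stop list; the module's actual list is not shown here.
my %STOP = map { $_ => 1 } qw(a an and the of in on is to);

# Turn text into a sparse term-count vector (hash of word => count).
sub term_vector {
    my ($text) = @_;
    my %vec;
    for my $word ( lc($text) =~ /(\w+)/g ) {
        next if $STOP{$word};
        $vec{$word}++;
    }
    return \%vec;
}

# Cosine of the angle between two sparse vectors; 0 if either is empty.
sub cosine {
    my ( $u, $v ) = @_;
    my ( $dot, $nu, $nv ) = ( 0, 0, 0 );
    $dot += $u->{$_} * ( $v->{$_} // 0 ) for keys %$u;
    $nu  += $_**2 for values %$u;
    $nv  += $_**2 for values %$v;
    return 0 unless $nu && $nv;
    return $dot / sqrt( $nu * $nv );
}

# Return doc => score for every document scoring above the threshold.
sub search_docs {
    my ( $query, $docs, $threshold ) = @_;
    my $q = term_vector($query);
    my %results;
    for my $doc (@$docs) {
        my $score = cosine( $q, term_vector($doc) );
        $results{$doc} = $score if $score > $threshold;
    }
    return %results;
}

# Hypothetical sample data.
my @docs = (
    'The cat sat on the mat',
    'Dogs chase cats in the park',
    'Stock markets fell sharply today',
);
my %results = search_docs( 'cat on a mat', \@docs, 0.04 );
print "$_\n" for sort keys %results;
```

With these sample documents only the first one shares a non-stop word with the query, so it is the only one returned. The second document contains "cats" rather than "cat" and scores zero because this sketch matches exact words only.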

AUTHOR

Maciej Ceglowski <maciej@ceglowski.com>

This program is free software, released under the GNU public license.