OpenFTS(3)    User Contributed Perl Documentation    OpenFTS(3)
Search::OpenFTS::Search - Provides functions for searching
my $fts=Search::OpenFTS->new( DBI );
my $fts=Search::OpenFTS->new(
DBI,
relfunc=>name_of_rel_function_in_psql,
txttid=>NAME_OF_TXTID,
prefix=>PREFIX
);
Example of relfunc:
relfunc=>q[ rank( '{0.1, 0.2, 0.4, 1.0}',
$TSVECTOR, $QUERY, 0 )]
The first argument is the array of weights (see the tsearch documentation).
The last argument defines how the weight of the document is normalized:
0 - no normalization (default)
1 - normalized by log(length of document)
2 - normalized by length of document
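The relfunc string above can also be assembled programmatically. The following is only a sketch: build_relfunc is a hypothetical helper, not part of OpenFTS, and it assumes that exactly four weights are given and that $TSVECTOR and $QUERY must stay in the string as literal placeholders, as in the example above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of OpenFTS): builds a relfunc string
# from four weights and a normalization mode (0, 1 or 2).
sub build_relfunc {
    my ($weights, $norm) = @_;
    $norm = 0 unless defined $norm;    # 0 = no normalization (default)
    die "normalization must be 0, 1 or 2\n" unless $norm =~ /\A[012]\z/;
    die "expected four weights\n" unless @$weights == 4;

    my $w = join ', ', @$weights;
    # q[] does not interpolate, so $TSVECTOR and $QUERY stay literal.
    return sprintf q[rank( '{%s}', $TSVECTOR, $QUERY, %d )], $w, $norm;
}

# Weights are passed as strings so that "1.0" is preserved as written.
print build_relfunc([qw(0.1 0.2 0.4 1.0)], 1), "\n";
```

The returned string would then be passed as relfunc=>... to the constructor or to get_sql.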
- get_sql( \@ARRAY_WORD );
- get_sql( $STRING );
- get_sql( \$STRING );
- get_sql( *, %opt );
- %opt - the same options as in the constructor (see above), plus the
key dict_opt => {}, which is passed to the dictionaries
Returns the parts of an SQL statement:
($out, $condition, $order)
Here is how they can be combined in an SQL statement:
SELECT
$opt{txttid}$out
FROM
table
WHERE
$condition
$order;
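In Perl, the template above might be filled in as follows. This is a sketch only: build_statement is a hypothetical helper, and the values passed to it are invented placeholders standing in for whatever get_sql actually returns; the table and column names are assumptions as well.

```perl
use strict;
use warnings;

# Hypothetical helper: joins the three parts returned by get_sql
# with the id column and table name into one SQL statement.
sub build_statement {
    my ($txttid, $table, $out, $condition, $order) = @_;
    return join "\n",
        'SELECT',
        "  $txttid$out",
        'FROM',
        "  $table",
        'WHERE',
        "  $condition",
        "$order;";
}

# Placeholder values, NOT real get_sql output.
my $sql = build_statement(
    'txtid', 'txt', ', 1 AS pos', 'TRUE', 'ORDER BY pos DESC',
);
print "$sql\n";
```

Note that $out is appended directly to the id column, so it is expected to carry its own leading comma when it is not empty.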
- search( SEARCH_AS_IN_GET_SQL )
- search( SEARCH_AS_IN_GET_SQL, %opt )
- Returns a reference to the list of identifiers, sorted by relevance
- get_headline(
query=>$query,
src=>[$FH|$txt|$reftxt],
maxlen=>$maxlen,
maxread=>$maxread,
otag=>$opentag,
ctag=>$closetag,
replace_ignore_headline=>$str_to_replace,
dict_opt=>{})
Returns a fragment of the document with the search terms highlighted.
Up to maxread bytes are read from the document to generate a headline of
length maxlen.
otag and ctag are the strings used for highlighting, for example
<b> and </b>.
replace_ignore_headline - the string used to replace HTML markup;
a space by default.
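A call using the parameters described above might look like the sketch below. headline_for is a hypothetical wrapper, $fts is assumed to be an already constructed search object, and the maxlen and maxread values are illustrative choices, not documented defaults. The text is passed by reference, which is one of the three forms accepted for src.

```perl
use strict;
use warnings;

# Hypothetical wrapper around get_headline; $fts is assumed to be
# an object created with Search::OpenFTS->new( ... ).
sub headline_for {
    my ($fts, $query, $text) = @_;
    return $fts->get_headline(
        query   => $query,
        src     => \$text,     # reference to the document text
        maxlen  => 200,        # illustrative headline length
        maxread => 8192,       # illustrative read limit in bytes
        otag    => '<b>',      # opening highlight tag
        ctag    => '</b>',     # closing highlight tag
    );
}
```

Usage would then be along the lines of: my $headline = headline_for($fts, $query, $document_text);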
- get_headline2(
query=>$query,
src=>[$FH|$txt|$reftxt],
min_words=>$min_words,
max_words=>$max_words,
otag=>$opentag,
ctag=>$closetag,)
Another method for getting a headline. This method is expected to be a
little slower but more accurate. The following parameters are
recognized:
min_words - minimum number of words in the headline;
max_words - maximum number of words in the headline;
shortword - maximum length of a word that will be rejected at
the end of the headline;
sentence_length - the expected sentence length, used to cut the
headline at the end of a sentence;
nonword_tokens - a list of token ids (separated by spaces)
that are not counted as words;
leave_tokens - a list of token ids that are not considered
words but that will appear in the output;
complex_tokens - a list of token ids that will be split into
smaller tokens;
noend_tokens - token ids that are not desirable at the end of the
headline;
endpunct_tokens - token ids that represent end-of-sentence
punctuation marks;
met_query - if set to a scalar reference, a true value is stored in it
if the query has been found in the processed text.
The OpenFTS Primer ( see doc/ subdirectory )
The Crash-course to OpenFTS ( in examples/ subdirectory )
perldoc Search::OpenFTS::Index
perldoc Search::OpenFTS::Parser
perldoc Search::OpenFTS::Dict::PorterEng
perldoc Search::OpenFTS::Dict::Snowball
perldoc Search::OpenFTS::Dict::UnknownDict
perldoc Search::OpenFTS::Morph::ISpell