NAME
Search::QueryParser - parses a query string into a data structure suitable for external search engines

SYNOPSIS
    my $qp = Search::QueryParser->new;
    my $s = '+mandatoryWord -excludedWord +field:word "exact phrase"';
    my $query = $qp->parse($s) or die "Error in query : " . $qp->err;
    $someIndexer->search($query);

    # query with comparison operators and implicit plus (second arg is true)
    $query = $qp->parse("txt~'^foo.*' date>='01.01.2001' date<='02.02.2002'", 1);

    # boolean operators (example below is equivalent to "+a +(b c) -d")
    $query = $qp->parse("a AND (b OR c) AND NOT d");

    # subset of rows
    $query = $qp->parse("Id#123,444,555,666 AND (b OR c)");

DESCRIPTION
This module parses a query string into a data structure to be handled by external search engines. For examples of such engines, see File::Tabular and Search::Indexer.

The query string can contain simple terms, "exact phrases", field names and comparison operators, '+/-' prefixes, parentheses, and boolean connectors.

The parser can be parameterized by regular expressions for specific notions of "term", "field name" or "operator"; see the "new" method. The parser has no support for lemmatization or other term transformations: these should be done externally, before passing the query data structure to the search engine.

The data structure resulting from a parsed query is a tree of terms and operators, as described below in the "parse" method. The interpretation of the structure is up to the external search engine that will receive the parsed query; the present module does not make any assumption about what it means to be "equal" or to "contain" a term.

QUERY STRING
The query string is decomposed into "items", where each item has an optional sign prefix, an optional field name and comparison operator, and a mandatory value.

Sign prefix
Prefix '+' means that the item is mandatory.
Prefix '-' means that the item must be excluded.
No prefix means that the item will be searched for, but is not mandatory.

As far as the result set is concerned, "+a +b c" is strictly equivalent to "+a +b": the search engine will return documents containing both terms 'a' and 'b', and possibly also term 'c'. However, if the search engine also returns relevance scores, query "+a +b c" might give a better score to documents that also contain term 'c'.

See also section "Boolean connectors" below, which is another way to combine items into a query.

Field name and comparison operator
Internally, each query item has a field name and comparison operator; if not written explicitly in the query, these take default values '' (empty field name) and ':' (colon operator).

Operators have a left operand (the field name) and a right operand (the value to be compared with); for example, "foo:bar" means "search documents containing term 'bar' in field 'foo'", whereas "foo=bar" means "search documents where field 'foo' has exact value 'bar'".

Here is the list of admitted operators with their intended meaning:
Operators ":", "~", "=~", "!~" and "#" admit an empty left operand (so the field name will be ''). Search engines will usually interpret this as "any field" or "the whole data record".

Value
A value (right operand to a comparison operator) can be
Boolean connectors
Queries can contain boolean connectors 'AND', 'OR', 'NOT' (or their equivalent in some other languages). This is mere syntactic sugar for the '+' and '-' prefixes:

    "a AND b" is translated into "+a +b";
    "a OR b" is translated into "(a b)";
    "NOT a" is translated into "-a".

"+a OR b" does not make sense, but it is translated into "(a b)", under the assumption that the user understands "OR" better than a '+' prefix. "-a OR b" does not make sense either, but has no meaningful approximation, so it is rejected.

Combinations of AND/OR clauses must be surrounded by parentheses, i.e. "(a AND b) OR c" or "a AND (b OR c)" are allowed, but "a AND b OR c" is not.

METHODS
In case of a parsing error, "parse" returns "undef"; method err can be called to get an explanatory message.
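
The behaviour just described can be checked with a short script. The following is a minimal sketch, assuming Search::QueryParser is installed from CPAN; Data::Dumper (a core module) is used only to display the result, and the variable names $good and $bad are illustrative. The exact wording of the error message is not shown here because it depends on the module version.

```perl
use strict;
use warnings;
use Search::QueryParser;   # the module documented on this page (from CPAN)
use Data::Dumper;          # core module, used only to display the result

my $qp = Search::QueryParser->new;

# Parenthesized AND/OR combination: accepted according to the
# "Boolean connectors" section.
my $good = $qp->parse("a AND (b OR c) AND NOT d");
defined $good
    or die "Unexpected error in query: " . $qp->err;
print Dumper($good);

# AND and OR mixed without parentheses: documented as not allowed,
# so parse is expected to return undef and err to explain why.
my $bad = $qp->parse("a AND b OR c");
if (!defined $bad) {
    print "Rejected: ", $qp->err, "\n";
}
```

Testing the return value with "defined" rather than plain truth makes the intent explicit: only "undef" signals a parsing error.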
AUTHOR
Laurent Dami, <laurent.dami AT etat ge ch>

COPYRIGHT AND LICENSE
Copyright (C) 2005, 2007 by Laurent Dami.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.