Pod::Abstract::Path - Search for POD nodes matching a path within a document
tree.
/head1(1)/head2 # All head2 elements under
# the 2nd head1 element
//item # All items anywhere
//item[@label =~ {^\*$}] # All items with '*' labels.
//head2[/hilight] # All head2 elements containing
# "hilight" elements
# Top level head1s containing head2s that have headings matching
# "NAME", and also have at least one list somewhere in their
# contents.
/head1[/head2[@heading =~ {NAME}]][//over]
# Top level headings having the same title as the following heading.
/head1[@heading = >>@heading]
# Top level headings containing at least one subheading with the same
# name.
/head1[@heading = ./head2@heading]
Pod::Abstract::Path is a path selection syntax that makes it quick and easy
to write traversals of Pod::Abstract documents. While the syntax itself is
simple, the queries you can build with it can be quite complex.
Not all of the designed features have yet been implemented, but it
is currently quite useful, and all of the filters in
"paf" make use of Pod Paths.
- /
- Selects children of the left hand side.
- //
- Selects all descendants of the left hand side.
- .
- Selects the current node - this is a NOP that can be used in
expressions.
- ..
- Selects the parent node. If there are multiple nodes selected, all of
their parents will be included.
- ^
- Selects the root node of the tree for the current node. This allows you to
escape from a nested expression. Note that this is the ROOT node, not the
node that you started from.
If you want to evaluate an expression from a node as though it
were the root node, the easiest ways are to detach or dup it - otherwise
the root operator will find the original root node.
- name, #cut, :text, :verbatim, :paragraph
- Any element name, or symbolic type name, will restrict the selection to
only elements matching that type. For example,
"//:paragraph" will select
all descendants, anywhere, but then restrict that set to only
":paragraph" type nodes.
Several names separated by spaces will match any of those
names, e.g. "//head1 over" will match
all lists and all head1s.
- |, & (union and intersection)
- Union will take expressions on either side, and return all nodes that are
members of either set. Intersection returns nodes that are members of BOTH
sets. These can be used to extend expressions, and within [ expressions ]
where a path is supported (left side of a match, left or right side of an
= sign). These are NOT logical and/or, though a similar effect can be
achieved with these operators.
- @attrname
- The named attribute of the nodes on the left hand side. Current attributes
are @heading for head1 through head4, and
@label for list items.
- [ expression ]
- Select only the left hand elements that match the expression in the
brackets. The expression will be evaluated from the point of view of each
node in the current result set.
Expressions can be:
- simple: "[/head2]"
- Any regular path will be true if there are any nodes matched. The above
example will be true if there are any head2 nodes as direct children of
the selected node.
- regex match: "[@heading =~ {FOO}]"
- A regex match will be true if the left hand expression has nodes that
match the regular expression between the braces on the right hand side.
The above example will match anything with a heading containing
"FOO".
Optionally, the right hand closing brace may be followed by the
"i" modifier to cause case-insensitive
matching, e.g. "[@heading =~ {foo}i]"
will match "foo" or
"fOO".
- complement: "[! /head2 ]"
- Negates the remainder of the expression. The above example will match
anything without a child head2 node.
- compare operators: e.g. "[ /node1 eq /node2 ]"
- Matches nodes where the operator is satisfied for at least one pair of
nodes. The right hand expression can be a constant string (single quoted:
'string'), or a second expression. If two
expressions are used, they are matched combinationally - i.e., all result
nodes on the left are matched against all result nodes on the right. Both
sides may contain nested expressions.
The following Perl compatible operators are supported:
String: "eq gt lt le ge ne"
Numeric: "== < > <= >= !="
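The "at least one pair" rule for compare operators can be pictured with a
short core-Perl sketch. This is not the module's own code: any_pair_matches
is a hypothetical helper, and plain strings stand in for the values of the
nodes selected on each side.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Pod::Abstract::Path: returns 1 if the
# comparison holds for at least one (left, right) pair, otherwise 0.
sub any_pair_matches {
    my ($cmp, $left, $right) = @_;
    for my $l (@$left) {
        for my $r (@$right) {
            return 1 if $cmp->($l, $r);
        }
    }
    return 0;
}

# A few of the Perl compatible operators, wrapped as code references.
my %ops = (
    eq   => sub { $_[0] eq $_[1] },
    ne   => sub { $_[0] ne $_[1] },
    '==' => sub { $_[0] == $_[1] },
    '<'  => sub { $_[0] <  $_[1] },
);

# Stand-ins for the values found on the left and right hand sides.
my @left  = ('NAME', 'SYNOPSIS');
my @right = ('DESCRIPTION', 'SYNOPSIS');

print "eq: ", any_pair_matches($ops{eq}, \@left, \@right), "\n";
print "ne: ", any_pair_matches($ops{ne}, \@left, \@right), "\n";
```

One consequence of matching every left value against every right value is
that "eq" and "ne" can both be true for the same two result sets, as they
are in this sketch.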
Pod::Abstract::Path is not designed to be fast. It is designed to be expressive
and useful, but evaluation involves successive expand/de-duplicate/linear
search operations. Running these over large documents containing many nodes is
not suitable for high performance systems.
Simple expressions can be fast enough, but there is nothing to
stop you from writing "//[<condition>]" and linear-searching
all 10,000 nodes of your Pod document. Use with caution in interactive
systems.
It is recommended you use the
"Pod::Abstract::Node->select"
method to evaluate Path expressions.
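As a minimal sketch of that recommendation, the following builds a small
document in memory and runs a few paths through select. The document
content is made up for illustration, and the sketch assumes that
Pod::Abstract's load_string constructor is available in your installed
version.

```perl
use strict;
use warnings;
use Pod::Abstract;

# A small document built in memory; the content is illustrative only.
my $pod = join "\n",
    '=head1 NAME', '',
    'Example - a small document', '',
    '=head1 METHODS', '',
    '=head2 new', '',
    'Constructor.', '',
    '=over', '',
    '=item *', '',
    'First point', '',
    '=back', '',
    '=cut', '';

# Assumption: load_string parses POD text into a Pod::Abstract node tree.
my $pa = Pod::Abstract->load_string($pod);

# Each entry is a path written in the syntax described in this document.
my @paths = (
    '/head1',                       # top level head1 nodes
    '//item',                       # items anywhere
    '/head1[//over]',               # head1s containing a list
    '/head1[@heading =~ {name}i]',  # case-insensitive heading match
    '/head1[! /head2]',             # head1s without a head2 child
);

for my $path (@paths) {
    my @nodes = $pa->select($path);
    printf "%-30s %d node(s)\n", $path, scalar @nodes;
}
```

Calling select on a node other than the document root evaluates the same
path relative to that node.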
If you wish to generate paths for use in other modules, use
"parse_path" to generate a parse tree,
pass that as an argument to "new", then
use "process" to evaluate the expression
against a list of nodes. You can re-use the same parse tree to process
multiple lists of nodes in this fashion.
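A hedged sketch of that re-use pattern follows. It assumes that "new" can
also be given the expression string directly and parses it once, and that
"process" returns the list of matching nodes; check the module source for
the exact arguments of "parse_path" and "new" before relying on it. The two
documents are made up for illustration.

```perl
use strict;
use warnings;
use Pod::Abstract;
use Pod::Abstract::Path;

# Two small illustrative documents.
my @docs = map { Pod::Abstract->load_string($_) } (
    "=head1 ONE\n\n=head2 first\n\nText.\n\n=cut\n",
    "=head1 TWO\n\n=head2 second\n\nText.\n\n=head2 third\n\nText.\n\n=cut\n",
);

# Assumption: new() accepts the expression string and parses it once.
my $path = Pod::Abstract::Path->new('//head2');

# The same path object is evaluated against each document in turn.
for my $doc (@docs) {
    my @found = $path->process($doc);
    print scalar(@found), " head2 node(s)\n";
}
```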
It is possible during processing - especially using the ^ or .. operators - to
generate many duplicate matches of the same nodes. On each pass around the
loop, the result set is filtered to unique nodes, so duplicates are discarded
before the next pass and cannot compound.
This effectively means that
"//^" (however awful that is) will match
one node only - just really inefficiently.
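The effect of the uniqueness filter can be sketched in core Perl. This is
an illustration rather than the module's implementation: unique_nodes is a
hypothetical helper, and plain hash references stand in for
Pod::Abstract nodes.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Hypothetical helper: keep the first occurrence of each node, identifying
# nodes by reference address rather than by content.
sub unique_nodes {
    my %seen;
    return grep { !$seen{ refaddr($_) }++ } @_;
}

# Three stand-in child nodes that share one parent.
my $root     = { name => 'root' };
my @children = map { +{ name => "child$_", parent => $root } } 1 .. 3;

# Stepping from every child to its parent yields the root three times...
my @parents = map { $_->{parent} } @children;

# ...and the uniqueness filter collapses that back to a single node.
my @unique = unique_nodes(@parents);

print scalar(@parents), " -> ", scalar(@unique), "\n";   # prints "3 -> 1"
```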
parse_path parses a list of lexemes and generates a driver tree for the
process method. It is a simple recursive descent parser with one element of
lookahead.
Ben Lilburne <bnej@mac.com>
Copyright (C) 2009 Ben Lilburne
This program is free software; you can redistribute it and/or
modify it under the same terms as Perl itself.