|
NAME

XML::XQL::Tutorial - Describes the XQL query syntax

DESCRIPTION

This document describes the basic features of the XML Query Language (XQL). A proposal for the XQL specification was submitted to the XSL Working Group in September 1998. The spec can be found at <http://www.w3.org/TandS/QL/QL98/pp/xql.html>. Since it is only a proposal at this point, things may change, but it is very likely that the final version will be close to the proposal. Most of this document was copied straight from the spec.

See also the XML::XQL man page.

INTRODUCTION

XQL (XML Query Language) provides a natural extension to the XSL pattern language. It builds upon the capabilities XSL provides for identifying classes of nodes, by adding Boolean logic, filters, indexing into collections of nodes, and more.

XQL is designed specifically for XML documents. It is a general purpose query language, providing a single syntax that can be used for queries, addressing, and patterns. XQL is concise, simple, and powerful.

XQL is designed to be used in many contexts. Although it is a superset of XSL patterns, it is also applicable to providing links to nodes, for searching repositories, and for many other applications.

Note that the term XQL is a working term for the language described in this proposal. The authors of the proposal do not intend this term to be used permanently. Also, beware that another query language exists called XML-QL, which uses a syntax very similar to SQL.

The XML::XQL module has added functionality to the XQL spec, called XQL+. To allow only XQL functionality as described in the spec, use the XML::XQL::Strict module. Note that the XQL spec makes the distinction between core XQL and XQL extensions. This implementation makes no distinction and the Strict module, therefore, implements everything described in the XQL spec. See the XML::XQL man page for more information about the Strict module. This tutorial will clearly indicate when referring to XQL+.
XQL Patterns

This section describes the core XQL notation. These features should be part of every XQL implementation, and serve as the base level of functionality for its use in different technologies.

The basic syntax for XQL mimics the URI directory navigation syntax, but instead of specifying navigation through a physical file structure, the navigation is through elements in the XML tree.

For example, the following URI means find the foo.jpg file within the bar directory:

    bar/foo.jpg

Similarly, in XQL, the following means find the collection of fuz elements within baz elements:

    baz/fuz

Throughout this document you will find numerous samples. They refer to the data shown in the sample file at the end of this man page.

Context

A context is the set of nodes against which a query operates. For the entire query, which is passed to the XML::XQL::Query constructor through the Expr option, the context is the list of input nodes that is passed to the query() method.

XQL allows a query to select between using the current context as the input context and using the 'root context' as the input context. The 'root context' is a context containing only the root-most element of the document. When using XML::DOM, this is the Document object.

By default, a query uses the current context. A query prefixed with '/' (forward slash) uses the root context. A query may optionally explicitly state that it is using the current context by using the './' (dot, forward slash) prefix. Both of these notations are analogous to the notations used to navigate directories in a file system.

The './' prefix is only required in one situation. A query may use the '//' operator to indicate recursive descent. When this operator appears at the beginning of the query, the initial '/' causes the recursive descent to be performed relative to the root of the document or repository. The prefix './/' allows a query to perform a recursive descent relative to the current context.
Query Results

The collection returned by an XQL expression preserves document order, hierarchy, and identity, to the extent that these are defined. That is, a collection of elements will always be returned in document order without repeats. Note that the spec states that the order of attributes within an element is undefined, but that this implementation does keep attributes in document order. See the XML::XQL man page for more details regarding Document Order.

Collections - 'element' and '.'

The collection of all elements with a certain tag name is expressed using the tag name itself. This can be qualified by showing that the elements are selected from the current context './', but the current context is assumed and often need not be noted explicitly.
Selecting children and descendants - '/' and '//'

The collection of elements of a certain type can be determined using the path operators ('/' or '//'). These operators take as their arguments a collection (left side) from which to query elements, and a collection indicating which elements to select (right side). The child operator ('/') selects from immediate children of the left-side collection, while the descendant operator ('//') selects from arbitrary descendants of the left-side collection. In effect, the '//' can be thought of as a substitute for one or more levels of hierarchy. Note that the path operators change the context as the query is performed. By stringing them together users can 'drill down' into the document.
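The difference between the child and descendant operators can be sketched outside of XQL. The following Python fragment is only an analogy: it uses the path syntax of the standard xml.etree.ElementTree module (which is not XQL) on a trimmed-down piece of the sample bookstore document, and the variable names are made up for the illustration.

```python
import xml.etree.ElementTree as ET

# A trimmed-down fragment of the sample bookstore document.
SAMPLE = """
<bookstore>
  <book>
    <title>Seven Years in Trenton</title>
    <author>
      <first-name>Joe</first-name>
      <last-name>Bob</last-name>
    </author>
  </book>
  <magazine>
    <title>Tracking Trenton</title>
  </magazine>
</bookstore>
"""

bookstore = ET.fromstring(SAMPLE)

# Child operator: comparable to the XQL query book/title, evaluated with
# the bookstore element as the current context.
book_titles = [t.text for t in bookstore.findall("book/title")]

# Descendant operator: comparable to .//title, which stands in for any
# number of intermediate levels.
all_titles = [t.text for t in bookstore.findall(".//title")]

# last-name is not an immediate child of bookstore, so only the
# descendant form finds it.
direct_last_names = bookstore.findall("last-name")
nested_last_names = [n.text for n in bookstore.findall(".//last-name")]

print(book_titles)
print(all_titles)
print(direct_last_names, nested_last_names)
```

Each step of the path narrows the context for the next one, which is the 'drill down' behaviour described above.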
Collecting element children - '*'

An element can be referenced without using its name by substituting the '*' collection. The '*' collection returns all elements that are children of the current context, regardless of their tag name.
Finding an attribute - '@'

Attribute names are preceded by the '@' symbol. XQL is designed to treat attributes and sub-elements impartially, and capabilities are equivalent between the two types wherever possible.

Note: attributes cannot contain subelements. Thus, attributes cannot have path operators applied to them in a query. Such expressions will result in a syntax error. The XQL spec states that attributes are inherently unordered and indices cannot be applied to them, but this implementation allows it.
XQL Literals

XQL query expressions may contain literal values (i.e. constants). Numbers (integers and floats) are wrapped in XML::XQL::Number objects and strings in XML::XQL::Text objects. Booleans (as returned by true() and false()) are wrapped in XML::XQL::Boolean objects.

Strings must be enclosed in single or double quotes. Since XQL does not allow escaping of special characters, it is impossible to create a string that contains both a single and a double quote. To remedy this, XQL+ has added the q// and qq// string delimiters, which behave just like they do in Perl.

For Numbers, exponential notation is not allowed. Use the XQL+ function eval() to circumvent this problem. See the XML::XQL man page for details.

The empty list or undef is represented by [] (i.e. reference to empty array) in this implementation.
Grouping - '()'

Parentheses can be used to group collection operators for clarity or where the normal precedence is inadequate to express an operation.

Filters - '[]'

Constraints and branching can be applied to any collection by adding a filter clause '[ ]' to the collection. The filter is analogous to the SQL WHERE clause with ANY semantics. The filter contains a query within it, called the subquery. The subquery evaluates to a Boolean, and is tested for each element in the collection. Any elements in the collection failing the subquery test are omitted from the result collection.

For convenience, if a collection is placed within the filter, a Boolean TRUE is generated if the collection contains any members, and a FALSE is generated if the collection is empty. In essence, an expression such as author[degree] implies a collection-to-Boolean conversion function like the following mythical 'there-exists-a' method:

    author[.there-exists-a(degree)]

Note that any number of filters can appear at a given level of an expression. Empty filters are not allowed.
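The collection-to-Boolean conversion can be sketched in a few lines of Python. This is an illustration only, not how XML::XQL is implemented: it uses the standard xml.etree.ElementTree module, the helper there_exists_a() is named after the mythical method above, and the two authors are a cut-down version of the sample data.

```python
import xml.etree.ElementTree as ET

# Two authors taken (in shortened form) from the sample bookstore document.
SAMPLE = """
<bookstore>
  <author>
    <first-name>Joe</first-name>
    <award>Trenton Literary Review Honorable Mention</award>
  </author>
  <author>
    <first-name>Toni</first-name>
    <degree from="Trenton U">B.A.</degree>
    <degree from="Harvard">Ph.D.</degree>
    <award>Pulizer</award>
  </author>
</bookstore>
"""

root = ET.fromstring(SAMPLE)
authors = root.findall("author")


def there_exists_a(node, tag):
    """Collection-to-Boolean conversion: True if node has any child named tag."""
    return len(node.findall(tag)) > 0


# Comparable to author[degree]: keep authors for which the subquery is true.
with_degree = [a.findtext("first-name") for a in authors
               if there_exists_a(a, "degree")]

# Comparable to author[award]
with_award = [a.findtext("first-name") for a in authors
              if there_exists_a(a, "award")]

# Comparable to author[degree][award]: several filters at the same level.
with_both = [a.findtext("first-name") for a in authors
             if there_exists_a(a, "degree") and there_exists_a(a, "award")]

print(with_degree, with_award, with_both)
```

The filter never changes which kind of element is returned: every result is still an author, only the ones failing the test are dropped.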
Any and all semantics - '$any$' and '$all$'

Users can explicitly indicate whether to use any or all semantics through the $any$ and $all$ keywords.

$any$ flags that a condition will hold true if any item in a set meets that condition. $all$ means that all elements in a set must meet the condition for the condition to hold true.

$any$ and $all$ are keywords that appear before a subquery expression within a filter.
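The two semantics correspond closely to Python's built-in any() and all(). The sketch below is an analogy under that assumption, applied to the degree elements of one author from the sample data. The XQL expressions in the comments are written by analogy with the description above and have not been run through XML::XQL. How XQL treats $all$ over an empty set is not covered here.

```python
import xml.etree.ElementTree as ET

# One author from the sample bookstore document, shortened.
AUTHOR = """
<author>
  <first-name>Toni</first-name>
  <degree from="Trenton U">B.A.</degree>
  <degree from="Harvard">Ph.D.</degree>
</author>
"""

author = ET.fromstring(AUTHOR)
degrees = [d.text for d in author.findall("degree")]

# Comparable to author[$any$ degree = 'Ph.D.']:
# true as soon as one degree matches.
any_phd = any(d == "Ph.D." for d in degrees)

# Comparable to author[$all$ degree = 'Ph.D.']:
# true only if every degree matches.
all_phd = all(d == "Ph.D." for d in degrees)

print(degrees, any_phd, all_phd)
```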
Indexing into a collection - '[]' and '$to$'

XQL makes it easy to find a specific node within a set of nodes. Simply enclose the index ordinal within square brackets. The ordinal is 0 based.

A range of elements can be returned. To do so, specify an expression rather than a single value inside of the subscript operator (square brackets). Such expressions can be a comma separated list of any of the following:

    n          Returns the nth element
    -n         Returns the element that is n-1 units from the last element.
               E.g., -1 means the last element. -2 is the next to last element.
    m $to$ n   Returns elements m through n, inclusive
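The subscript rules listed above can be modelled with a small Python function. This is a sketch of the semantics, not the XML::XQL implementation: the function name xql_subscript and the tuple notation for 'm $to$ n' are invented for the illustration, and it assumes that out-of-range ordinals are silently skipped and that the combined result comes back in document order without repeats.

```python
def xql_subscript(items, *specs):
    """Select from items using 0-based ordinals, negative ordinals,
    or (m, n) tuples standing in for the inclusive range 'm $to$ n'."""
    size = len(items)

    def normalize(ordinal):
        # Negative ordinals count back from the end: -1 is the last item.
        return ordinal if ordinal >= 0 else size + ordinal

    positions = set()
    for spec in specs:
        if isinstance(spec, tuple):
            first, last = (normalize(o) for o in spec)
            candidates = range(first, last + 1)   # inclusive upper bound
        else:
            candidates = [normalize(spec)]
        # Assumption: positions outside the collection are ignored.
        positions.update(p for p in candidates if 0 <= p < size)

    # Assumption: results are returned in document order without repeats.
    return [items[p] for p in sorted(positions)]


authors = ["a0", "a1", "a2", "a3", "a4"]

first = xql_subscript(authors, 0)                 # like author[0]
last = xql_subscript(authors, -1)                 # like author[-1]
middle = xql_subscript(authors, (1, 3))           # like author[1 $to$ 3]
mixed = xql_subscript(authors, 0, (2, 3), -1)     # like author[0, 2 $to$ 3, -1]

print(first, last, middle, mixed)
```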
Boolean Expressions

Boolean expressions can be used within subqueries. For example, one could use Boolean expressions to find all nodes of a particular value, or all nodes with nodes in particular ranges. Boolean expressions are of the form ${op}$, where {op} may be any expression of the form {b|a} - that is, the operator takes lvalue and rvalue arguments and returns a Boolean result.

Note that the XQL Extensions section defines additional Boolean operations.

Boolean AND and OR - '$and$' and '$or$'

$and$ and $or$ are used to perform Boolean ands and ors.

The Boolean operators, in conjunction with grouping parentheses, can be used to build very sophisticated logical expressions.

Note that spaces are not significant and can be omitted, or included for clarity as shown here.
Boolean NOT - '$not$'

$not$ is a Boolean operator that negates the value of an expression within a subquery.
Union and intersection - '$union$', '|' and '$intersect$'

The $union$ operator (shortcut is '|') returns the combined set of values from the query on the left and the query on the right. Duplicates are filtered out. The resulting list is sorted in document order.

Note: because this is a union, the set returned may include 0 or more elements of each element type in the list. To restrict the returned set to nodes that contain at least one of each of the elements in the list, use a filter, as discussed in Filters.

The $intersect$ operator returns the set of elements in common between two sets.
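The behaviour described above (duplicates removed, result sorted in document order) can be modelled as follows. This Python sketch is an illustration only: it uses xml.etree.ElementTree rather than XML::DOM, the helper names union() and intersect() are invented here, and node identity is approximated with Python object identity.

```python
import xml.etree.ElementTree as ET

# A shortened version of the sample bookstore document.
SAMPLE = """
<bookstore>
  <book style="autobiography"><title>Seven Years in Trenton</title></book>
  <magazine style="glossy"><title>Tracking Trenton</title></magazine>
  <book style="novel"><title>Trenton Today, Trenton Tomorrow</title></book>
</bookstore>
"""

root = ET.fromstring(SAMPLE)

# Position of every element in document order, keyed by object identity.
order = {id(e): i for i, e in enumerate(root.iter())}


def union(left, right):
    """Combine two node lists, drop duplicates, sort in document order."""
    unique = {id(e): e for e in list(left) + list(right)}
    return sorted(unique.values(), key=lambda e: order[id(e)])


def intersect(left, right):
    """Keep the nodes of left that also occur in right."""
    right_ids = {id(e) for e in right}
    return [e for e in left if id(e) in right_ids]


books = root.findall("book")
magazines = root.findall("magazine")
glossy_or_novel = [e for e in root if e.get("style") in ("glossy", "novel")]

# Comparable to magazine | book: the magazine is listed first in the query,
# but the result still follows document order.
merged = [e.tag for e in union(magazines, books)]

# Nodes present in both collections.
common = [e.findtext("title") for e in intersect(books, glossy_or_novel)]

print(merged)
print(common)
```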
Equivalence - '$eq$', '=', '$ne$' and '!='

The '=' sign is used for equality; '!=' for inequality. Alternatively, $eq$ and $ne$ can be used for equality and inequality.

Single or double quotes can be used for string delimiters in expressions. This makes it easier to construct and pass XQL from within scripting languages.

For comparing values of elements, the value() method is implied. That is, last-name < 'foo' really means last-name!value() < 'foo'.

Note that filters are always with respect to a context. That is, the expression book[author] means for every book element that is found, see if it has an author subelement. Likewise, book[author = 'Bob'] means for every book element that is found, see if it has a subelement named author whose value is 'Bob'. One can examine the value of the context as well, by using the . (period). For example, book[. = 'Trenton'] means for every book that is found, see if its value is 'Trenton'.
Comparison - '<', '<=', '>', '>=', '$lt$', '$ilt$' etc.

A set of binary comparison operators is available for comparing numbers and strings and returning Boolean results. $lt$, $le$, $gt$, $ge$ are used for less than, less than or equal, greater than, and greater than or equal. These same operators are also available in a case insensitive form: $ieq$, $ine$, $ilt$, $ile$, $igt$, $ige$.

<, <=, > and >= are allowed shortcuts for $lt$, $le$, $gt$ and $ge$.
XQL+ Match operators - '$match$', '$no_match$', '=~' and '!~'

XQL+ defines additional operators for pattern matching. The $match$ operator (shortcut is '=~') returns TRUE if the lvalue matches the pattern described by the rvalue. The $no_match$ operator (shortcut is '!~') returns FALSE if they match. Both lvalue and rvalue are first cast to strings.

The rvalue string should have the syntax of a Perl rvalue, that is, the delimiters should be included and modifiers are allowed. When using delimiters other than slashes '/', the 'm' should be included. The rvalue should be a string, so don't forget the quotes! (Or use the q// or qq// delimiters in XQL+, see the XML::XQL man page.)

Note that you can't use the Perl substitution operator s/// here. Try using the XQL+ subst() function instead.
Other XQL+ comparison operators - '$isa$', '$can$'

See the XML::XQL man page for other operators available in XQL+.

Comparisons and vectors

The lvalue of a comparison can be a vector or a scalar. The rvalue of a comparison must be a scalar or a value that can be cast at runtime to a scalar.

If the lvalue of a comparison is a set, then any (exists) semantics are used for the comparison operators. That is, the result of a comparison is true if any item in the set meets the condition.

Comparisons and literals

The spec states that the lvalue of an expression cannot be a literal. That is, '1' = a is not allowed. This implementation allows it, but it's not clear how useful that is.

Casting of literals during comparison

Elements, attributes and other XML node types are cast to strings (Text) by applying the value() method. The value() method calls the text() method by default, but this behavior can be altered by the user, so the value() method may return other XQL data types.

When two values are compared, they are first cast to the same type. See the XML::XQL man page for details on casting.

Note that the XQL spec is not very clear on how values should be cast for comparison. Discussions with the authors of the XQL spec revealed that there was some disagreement and their implementations differed on this point. This implementation is closest to that of Joe Lapp from webMethods, Inc.

Methods - 'method()' or 'query!method()'

XQL makes a distinction between functions and methods. See the XML::XQL man page for details.

XQL provides methods for advanced manipulation of collections. These methods provide specialized collections of nodes (see Collection methods), as well as information about sets and nodes.

Methods are of the form

    method(arglist)

Consider the query book[author]. It will find all books that have authors. Formally, we call the book corresponding to a particular author the reference node for that author. That is, every author element that is examined is an author for one of the book elements. (See the Annotated XQL BNF Appendix for a much more thorough definition of reference node and other terms. See also the XML::XQL man page.) Methods always apply to the reference node.

For example, the text() method returns the text contained within a node, minus any structure. (That is, it is the concatenation of all text nodes contained within an element and its descendants.) The following expression will return all authors named 'Bob':

    author[text() = 'Bob']

The following will return all authors containing a first-name child whose text is 'Bob':

    author[first-name!text() = 'Bob']

The following will return all authors containing any child element whose text is 'Bob':

    author[*!text() = 'Bob']

Method names are case sensitive. See the XML::XQL man page on how to define your own methods and functions.

Information methods

The following methods provide information about nodes in a collection. These methods return strings or numbers, and may be used in conjunction with comparison operators within subqueries.
Collection index methods
Aggregate methods
Namespace methods

The following methods can be applied to a node to return namespace information.
All attributes of an element can be returned using @*. This is potentially useful for applications that treat attributes as fields in a record.
Functions

This section defines the functions of XQL. The spec states that:

    XQL defines two kinds of functions: collection functions and pure functions. Collection functions use the search context of the Invocation instance, while pure functions ignore the search context, except to evaluate the function's parameters. A collection function evaluates to a subset of the search context, and a pure function evaluates to either a constant value or to a value that depends only on the function's parameters.

Don't worry if you don't get it. Just use them!

Collection functions

The collection functions provide access to the various types of nodes in a document. Any of these collections can be constrained and indexed. The collections return the set of children of the reference node meeting the particular restriction.
Other XQL Functions
Sequence Operators - ';' and ';;'

The white paper 'The Design of XQL' by Jonathan Robie, which can be found at <http://www.texcel.no/whitepapers/xql-design.html>, describes the sequence operators ';;' (precedes) and ';' (immediately precedes). Although these operators are not included in the XQL spec, I thought I'd add them anyway.

Immediately Precedes - ';'
Note that in XML::DOM there is actually a text node with whitespace between the two TD nodes, but such text nodes are ignored by this operator, unless the text node has 'xml:space' set to 'preserve'. See ??? for details.

Precedes - ';;'
Operator Precedence

The following table lists operators in precedence order, highest precedence first, where operators of a given row have the same precedence. The table also lists the associated productions:

    Production      Operator(s)
    ----------      -----------
    Grouping        ( )
    Filter          [ ]
    Subscript       [ ]
    Bang            !
    Path            / //
    Match           $match$ $no_match$ =~ !~ (XQL+ only)
    Comparison      = != < <= > >= $eq$ $ne$ $lt$ $le$ $gt$ $ge$
                    $ieq$ $ine$ $ilt$ $ile$ $igt$ $ige$
    Intersection    $intersect$
    Union           $union$ |
    Negation        $not$
    Conjunction     $and$
    Disjunction     $or$
    Sequence        ; ;;

Sample XML Document - bookstore.xml

This file is also stored in samples/bookstore.xml that comes with the XML::XQL distribution.

    <?xml version='1.0'?>
    <!-- This file represents a fragment of a book store inventory database -->
    <bookstore specialty='novel'>
      <book style='autobiography'>
        <title>Seven Years in Trenton</title>
        <author>
          <first-name>Joe</first-name>
          <last-name>Bob</last-name>
          <award>Trenton Literary Review Honorable Mention</award>
        </author>
        <price>12</price>
      </book>
      <book style='textbook'>
        <title>History of Trenton</title>
        <author>
          <first-name>Mary</first-name>
          <last-name>Bob</last-name>
          <publication>
            Selected Short Stories of
            <first-name>Mary</first-name>
            <last-name>Bob</last-name>
          </publication>
        </author>
        <price>55</price>
      </book>
      <magazine style='glossy' frequency='monthly'>
        <title>Tracking Trenton</title>
        <price>2.50</price>
        <subscription price='24' per='year'/>
      </magazine>
      <book style='novel' id='myfave'>
        <title>Trenton Today, Trenton Tomorrow</title>
        <author>
          <first-name>Toni</first-name>
          <last-name>Bob</last-name>
          <degree from='Trenton U'>B.A.</degree>
          <degree from='Harvard'>Ph.D.</degree>
          <award>Pulizer</award>
          <publication>Still in Trenton</publication>
          <publication>Trenton Forever</publication>
        </author>
        <price intl='canada' exchange='0.7'>6.50</price>
        <excerpt>
          <p>It was a dark and stormy night.</p>
          <p>But then all nights in Trenton seem dark and
          stormy to someone who has gone through what
          <emph>I</emph> have.</p>
          <definition-list>
            <term>Trenton</term>
            <definition>misery</definition>
          </definition-list>
        </excerpt>
      </book>
      <my:book style='leather' price='29.50' xmlns:my='http://www.placeholder-name-here.com/schema/'>
        <my:title>Who's Who in Trenton</my:title>
        <my:author>Robert Bob</my:author>
      </my:book>
    </bookstore>

SEE ALSO

The Japanese version of this document can be found on-line at <http://member.nifty.ne.jp/hippo2000/perltips/xml/xql/tutorial.htm>

XML::XQL, XML::XQL::Date, XML::XQL::Query and XML::XQL::DOM
|