SAC(3)    User Contributed Perl Documentation    SAC(3)
CSS::SAC - SAC CSS parser
use CSS::SAC qw();
use My::SACHandler ();
use My::SACErrors ();
my $doc_handler = My::SACHandler->new;
my $err_handler = My::SACErrors->new;
my $sac = CSS::SAC->new({
    DocumentHandler => $doc_handler,
    ErrorHandler    => $err_handler,
});
# generate a stream of events
$sac->parse({ filename => 'foo.css' });
SAC (Simple API for CSS) is an event-based API much like SAX for XML. If you are
familiar with the latter, you should have little trouble getting used to SAC.
More information on SAC can be found online at http://www.w3.org/TR/SAC.
Because CSS has more constructs than XML, core SAC is more
complex than core SAX. However, if you need to parse a CSS style sheet, SAC
probably remains the easiest way to get it done.
Most of the spec is presently implemented. The following
interfaces are not yet there: Locator, CSSException, CSSParseException,
ParserFactory. They may or may not be implemented at a later date (the most
likely candidates are the exception classes, for which I still have to find
an appropriate model).
Some places differ slightly from what is in the spec. I have tried
to keep those to a justified minimum and to flag them correctly.
The Parser class doesn't exist separately; it is defined in CSS::SAC. It doesn't
expose the locale interface because we don't localize errors (yet). It also
doesn't have "parse_style_sheet" but rather "parse", which is more consistent
with other Perl parsing interfaces.
I have added the
"charset($charset)" callback to the
DocumentHandler interface. There are valid reasons why it wasn't there (it
can be trusted only ever so often, and one should look at the actual
encoding instead) but given that it's a token in the grammar, I believe that
there should still be a way to access it.
- CSS::SAC->new(\%options) or
$sac->new(\%options)
Constructs a new parser object. The options can be:
- ConditionFactory and SelectorFactory
the factory classes used to build selector and condition objects.
See CSS::SAC::{Condition,Selector}Factory for more details on the
interfaces those classes must expose.
- DocumentHandler and ErrorHandler
the handler classes used as sinks for the event stream received
from a SAC Driver. See CSS::SAC::{Document,Error}Factory for more
details on the interfaces those classes must expose.
Methods will be called on whatever it is you pass as values to
those options. Thus, you may pass in objects as well as class names (I
haven't tested this yet, there may be a problem).
NOTE: an error handler should implement all callbacks, while a
document handler may implement only those it is interested in. There is
a default error handler (which warns or dies depending on the type of
error) but no default document handler.
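The two points above (class names work as well as objects, and a document handler may skip callbacks) can be illustrated in plain Perl. This is a minimal sketch and not CSS::SAC's own dispatch code: the My::Handler package and the fire helper are hypothetical names, and the arguments passed to the callbacks are assumptions.

```perl
use strict;
use warnings;

# Hypothetical document handler that only cares about one event.
package My::Handler;

my @seen;    # events recorded by this sketch

sub new     { return bless {}, shift }
sub charset { my ($self, $charset) = @_; push @seen, "charset=$charset"; return }
sub seen    { return @seen }

package main;

# Hypothetical dispatcher: call the callback only if the handler implements
# it. Perl resolves methods on class names and on objects alike, so both
# kinds of handler value go through the same code path.
sub fire {
    my ($handler, $event, @args) = @_;
    my $code = $handler->can($event) or return 0;    # not implemented: skip
    $handler->$code(@args);
    return 1;
}

my $by_object = fire(My::Handler->new, 'charset', 'utf-8');         # object
my $by_class  = fire('My::Handler',    'charset', 'iso-8859-1');    # class name
my $skipped   = fire('My::Handler',    'ignorable_style_sheet', '@foo;');

print join(', ', My::Handler->seen), "\n";
# prints "charset=utf-8, charset=iso-8859-1"
```

An error handler gets no such leniency: since it should implement every callback, a lookup of this kind would not be needed for it.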
- $sac->ParserVersion or
$sac->getParserVersion
Returns the supported CSS version.
Requesting this parser's ParserVersion will return the string
'CSS3'. While that is (modulo potential bugs of course) believed to be
generally true, several caveats apply:
To begin with, CSS3 has been modularised, and various modules
are at different stages of development. Evolving modules may require
evolving this parser. I hesitated between making ParserVersion return
CSS2, CSS3-pre, or simply CSS3. I chose the latter because I intend to
update it as I become aware of the necessity of changes to accommodate
new CSS3 stuff, and because it already supports a number of constructs
alien to CSS2 (of which namespaces is imho important enough to justify a
CSS3 tag). If you are aware of incompatibilities, please contact me.
More importantly, it is now considered wrong for a parser to
return CSSx as its version; instead it is expected to return the URI
of the CSS version that it supports. However, there is no URI for CSS3,
only one URI per module. While this issue hasn't been resolved by the
WG, I will stick to returning CSS3.
However, the behaviour of this attribute is certain to change in
the future, so please avoid relying on it.
- $cf =
$sac->ConditionFactory
- $sac->ConditionFactory($cf) or
$sac->setConditionFactory($cf)
- $sf =
$sac->SelectorFactory
- $sac->SelectorFactory($sf) or
$sac->setSelectorFactory($sf)
- $dh =
$sac->DocumentHandler
- $sac->DocumentHandler($dh) or
$sac->setDocumentHandler($dh)
- $eh =
$sac->ErrorHandler
- $sac->ErrorHandler($eh) or
$sac->setErrorHandler($eh)
get/set the ConditionFactory, SelectorFactory,
DocumentHandler, ErrorHandler that we use
- $sac->parse(\%options)
- $sac->parseStyleSheet(\%options)
parses a style sheet and sends events to the defined handlers.
The options that you can use are:
- string
- ioref
- filename
passes either a string, an open filehandle, or a filename to
read the stylesheet from
- embedded
tells whether the style sheet is embedded or not. This is useless
most of the time, but it influences the interpretation of
@charset rules: since these are forbidden in embedded style sheets,
they generate an ignorable_style_sheet event instead of a charset
event if embedded is set to a true value.
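The effect of the embedded option on @charset rules can be sketched without the parser itself. What follows is an illustration of the rule as described, not the module's actual code: My::Recorder and report_charset are hypothetical names, and the arguments passed to the two callbacks are assumptions.

```perl
use strict;
use warnings;

# Hypothetical handler that records the events it receives.
package My::Recorder;

sub new { return bless { events => [] }, shift }

sub charset {
    my ($self, $charset) = @_;
    push @{ $self->{events} }, "charset:$charset";
    return;
}

sub ignorable_style_sheet {
    my ($self, $text) = @_;
    push @{ $self->{events} }, "ignored:$text";
    return;
}

package main;

# Sketch of the rule: an @charset rule is reported as a charset event,
# unless the style sheet is embedded, in which case it is reported as
# ignorable.
sub report_charset {
    my ($handler, $rule_text, $charset, $embedded) = @_;
    if ($embedded) {
        $handler->ignorable_style_sheet($rule_text);
    }
    else {
        $handler->charset($charset);
    }
    return;
}

my $rule       = '@charset "utf-8";';
my $standalone = My::Recorder->new;
my $embedded   = My::Recorder->new;
report_charset($standalone, $rule, 'utf-8', 0);
report_charset($embedded,   $rule, 'utf-8', 1);

print "standalone: @{ $standalone->{events} }\n";
print "embedded:   @{ $embedded->{events} }\n";
```

With the real parser, the switch is the embedded option given to parse, for instance $sac->parse({ filename => 'foo.css', embedded => 1 }).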
- $sac->parse_rule($string_ref)
- $sac->parseRule($string_ref)
parses a rule (with { and }). You probably don't need this
one. It returns nothing, but generates the events.
- $sac->parse_style_declaration($string_ref)
- $sac->parseStyleDeclaration($string_ref)
same as parse_rule, but without the { and }. This is useful
when you want to parse style declarations embedded using style
attributes in HTML, SVG, etc... It returns nothing, but generates the
events.
- $sac->parse_property_value($string_ref)
- $sac->parsePropertyValue($string_ref)
parses a property value and returns an array ref of lexical
units (see CSS::SAC::LexicalUnit)
- $sac->parse_priority($string_ref)
- $sac->parsePriority($string_ref)
parses a priority and returns true if there is a priority
value there.
- $sac->parse_selector_list($string_ref)
- $sac->parseSelectors($string_ref)
parses a list of selectors and returns an array ref of
selectors
Methods in this section are of relevance mostly to the internal workings of the
parser. I document them here but I don't really consider them part of the
interface, and thus may change them if need be. If you are using them directly,
tell me about it and I will "officialize" them. These have no Java-style
equivalent.
- $sac->parse_charset($string_ref)
parses a charset. It returns nothing, but generates the
events.
- $sac->parse_imports($string_ref)
parses import rules. It returns nothing, but generates the
events.
- $sac->parse_namespace_declarations($string_ref)
parses ns declarations. It returns nothing, but generates the
events.
- $sac->parse_medialist($string_ref)
parses a list of media values and returns that list as an
arrayref
- $sac->parse_comments($string_ref)
parses as many comments as there are at the beginning of the
string. It returns nothing, but generates the events.
- $sac->parse_simple_selector($string_ref)
parses a simple selector and returns the selector object
- $sac->build_condition(\@tokens)
helper to build conditions (you probably don't want to use
this at all...)
This is pretty much a non-package; it is just there to provide the default error
handler if you are too lazy to provide one yourself.
All it does is pretty simple. There are three error levels:
"warning",
"error", and
"fatal_error". It warns on the first two and dies on the last.
Yes, it ain't fancy but then you can plug
anything more intelligent into it at any moment.
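A replacement error handler with the same behaviour could look like the sketch below. It is not the code of the default handler: the package name is taken from the synopsis as a placeholder, the single message argument is an assumption, and the log attribute is an addition made for illustration.

```perl
use strict;
use warnings;

# Hypothetical error handler implementing all three levels.
package My::SACErrors;

sub new { return bless { log => [] }, shift }

sub warning {
    my ($self, $msg) = @_;
    push @{ $self->{log} }, "warning: $msg";
    warn "[warning] $msg\n";
    return;
}

sub error {
    my ($self, $msg) = @_;
    push @{ $self->{log} }, "error: $msg";
    warn "[error] $msg\n";
    return;
}

sub fatal_error {
    my ($self, $msg) = @_;
    push @{ $self->{log} }, "fatal_error: $msg";
    die "[fatal] $msg\n";
}

package main;

my $eh = My::SACErrors->new;

# Collect warnings instead of printing them, to show what was emitted.
my @warnings;
{
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };
    $eh->warning('unknown property');
    $eh->error('unterminated block');
}

# fatal_error dies, so the caller has to trap it to carry on.
my $survived = eval { $eh->fatal_error('cannot read style sheet'); 1 };
my $fatal    = $@;

print "warnings: ", scalar(@warnings), ", fatal: $fatal";
```

Such an object would be given to the parser through the ErrorHandler option, as in the synopsis.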
One problem is that I have modelled this parser after existing SAC
implementations that do not take into account as much of CSS3 as they
could. Some parts of that are trivial, and I have provided support on my
own in this module. Other parts though are more important and I believe that
coordination between the SAC authors would be beneficial on these points (once
the relevant CSS3 modules have moved to REC).
- new attribute conditions
CSS3-selectors introduces a bunch of new things, including new
attribute conditions ^= (starts with), $= (ends with) and *= (contains).
There are no corresponding constants for conditions, so I suggested
SAC_STARTS_WITH_ATTRIBUTE_CONDITION, SAC_ENDS_WITH_ATTRIBUTE_CONDITION,
SAC_CONTAINS_ATTRIBUTE_CONDITION.
Note that these constants have been added, together with the
corresponding factory methods. However, they will remain undocumented
and considered experimental until some consensus is reached on the
matter.
- :root condition
The :root token confuses some people because they think it is
equivalent to XPath's / root step. That is not so. XPath's root selects
"above" the document element. CSS's :root tests whether an
element is the document element; there is nothing above a document
element. Thus :root on its own is equivalent to *:root. It's a
condition, not a selector. E:root matches the E element that is also the
document element (if there is one).
Thus, SAC_ROOT_NODE_SELECTOR does not apply and we need a new
SAC_IS_ROOT_CONDITION constant.
Note that this constant has been added, together with the
corresponding factory method. However, it will remain undocumented and
considered experimental until some consensus is reached on the
matter.
- other new pseudo-classes
:empty definitely needs a constant too I'd say.
Note that this constant has been added, together with the
corresponding factory method. However, it will remain undocumented and
considered experimental until some consensus is reached on the
matter.
- an+b syntax in positional conditions
There is new syntax that allows for very customisable
positional selecting. PositionalCondition needs to be updated to deal
with that.
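To make the an+b notation concrete, here is a minimal sketch of how an expression could be normalised and tested against a position. It is not part of CSS::SAC or of its PositionalCondition class: both function names are hypothetical, and the sketch assumes 1-based positions and integer n >= 0.

```perl
use strict;
use warnings;

# Hypothetical helper: normalise "odd", "even", "5", "2n+1", "-n+3", ...
# into a (step, offset) pair.
sub normalize_an_plus_b {
    my ($expr) = @_;
    (my $e = lc $expr) =~ s/\s+//g;
    return (2, 1) if $e eq 'odd';
    return (2, 0) if $e eq 'even';
    return (0, $e + 0) if $e =~ /^[+-]?\d+$/;
    if ($e =~ /^([+-]?\d*)n([+-]\d+)?$/) {
        my ($step, $offset) = ($1, $2);
        $step = $step eq '' || $step eq '+' ? 1
              : $step eq '-'                ? -1
              :                               $step + 0;
        $offset = defined $offset ? $offset + 0 : 0;
        return ($step, $offset);
    }
    die "cannot parse positional expression '$expr'\n";
}

# True if the 1-based position equals step*n + offset for some integer n >= 0.
sub position_matches {
    my ($step, $offset, $pos) = @_;
    return $pos == $offset ? 1 : 0 if $step == 0;
    my $diff = $pos - $offset;
    return 0 if $diff % $step;            # not a multiple of the step
    return $diff / $step >= 0 ? 1 : 0;    # n must not be negative
}

my @odd   = grep { position_matches(normalize_an_plus_b('2n+1'), $_) } 1 .. 6;
my @first = grep { position_matches(normalize_an_plus_b('-n+3'), $_) } 1 .. 6;
print "2n+1: @odd\n-n+3: @first\n";
# prints "2n+1: 1 3 5" and "-n+3: 1 2 3"
```

Keeping the normalised pair separate from the matching test means the parsing step only has to run once per condition.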
- the problem with attaching pseudo-elements to elements as
coselectors. I'm not sure which is the right representation. Don't
forget to update CSS::SAC::Writer too so that it writes it out
properly.
- see Bjoern's list
- Bjoern Hoehrmann for his immediate reaction and much valuable
feedback and suggestions. It's certainly much harder to type with all
those fingers that all those Mafia padres have cut off, but at least
I get work done much faster than before. And also those nasty bugs he
kindly uncovered.
- Steffen Goeldner for spotting bugs and providing patches.
- Ian Hickson for very very very kind testing support, and all sorts
of niceties.
- Manos Batsis for starting a very long discussion on this that
eventually deviated into other very interesting topics, and for
giving me some really weird style sheets to feed into this module.
- Simon St.Laurent for posting this on xmlhack.com and thus pointing a
lot of people to this module (as seen in my referer logs).
And of course all the other people that have sent encouragement
notes and feature requests.
- add a pointer to the SAC W3 page
- create the Exception classes
- update PositionalCondition to include logic that can normalize the
an+b notation and add a method that given a position will return a
boolean indicating whether it matches the condition.
- add stringify overloading to all classes so that they may be
printed directly
- have parser version return an overloaded object that circumvents the
current problems
- add docs on how to write a {Document,Error}Handler, right now there
is example code in Writer, but it isn't all clearly explained.
- find a way to make the '-' prefix to properties optional
- add a filter that switches events to spec names, and that can be used
directly through an option
- add DOM-like hasFeature support (in view of SAC 3)
- prefix all constants with SAC_. Keep the old ones around for a few
versions, importable with :old-constants.
- update docs
Robin Berjon <robin@knowscape.com>
This module is licensed under the same terms as Perl itself.