NAME

HTML::Stream - HTML output stream class, and some markup utilities

SYNOPSIS

Here's a small sample of some of the non-OO ways you can use this module:

    use HTML::Stream qw(:funcs);

    print html_tag('A', HREF=>$link);
    print html_escape("<<Hello & welcome!>>");

And some of the OO ways as well:

    use HTML::Stream;
    $HTML = new HTML::Stream \*STDOUT;

    # The vanilla interface...
    $HTML->tag('A', HREF=>"$href");
    $HTML->tag('IMG', SRC=>"logo.gif", ALT=>"LOGO");
    $HTML->text($copyright);
    $HTML->tag('_A');

    # The chocolate interface...
    $HTML -> A(HREF=>"$href");
    $HTML -> IMG(SRC=>"logo.gif", ALT=>"LOGO");
    $HTML -> t($caption);
    $HTML -> _A;

    # The chocolate interface, with whipped cream...
    $HTML -> A(HREF=>"$href")
          -> IMG(SRC=>"logo.gif", ALT=>"LOGO")
          -> t($caption)
          -> _A;

    # The strawberry interface...
    output $HTML [A, HREF=>"$href"],
                 [IMG, SRC=>"logo.gif", ALT=>"LOGO"],
                 $caption,
                 [_A];

DESCRIPTION

The HTML::Stream module provides you with an object-oriented (and subclassable) way of outputting HTML. Basically, you open up an "HTML stream" on an existing filehandle, and then do all of your output to the HTML stream. You can intermix HTML-stream-output and ordinary-print-output, if you like.

There's even a small built-in subclass, HTML::Stream::Latin1, which can handle Latin-1 input right out of the box. But all in good time...

INTRODUCTION (the Neapolitan dessert special)

Function interface

Let's start out with the simple stuff.
This module provides a collection of non-OO utility functions for escaping HTML text and producing HTML tags, like this:

    use HTML::Stream qw(:funcs);        # imports functions from @EXPORT_OK

    print html_tag(A, HREF=>$url);
    print '&copy; 1996 by', html_escape($myname), '!';
    print html_tag('/A');

By the way: that last line could be rewritten as:

    print html_tag(_A);

And if you need to get a parameter in your tag that doesn't have an associated value, supply the undefined value (not the empty string!):

    print html_tag(TD, NOWRAP=>undef, ALIGN=>'LEFT');

         <TD NOWRAP ALIGN=LEFT>

    print html_tag(IMG, SRC=>'logo.gif', ALT=>'');

         <IMG SRC="logo.gif" ALT="">

There are also some routines for reversing the process, like:

    $text = "This <i>isn't</i> &quot;fun&quot;...";
    print html_unmarkup($text);

         This isn't &quot;fun&quot;...

    print html_unescape($text);

         This isn't "fun"...

Yeah, yeah, yeah, I hear you cry. We've seen this stuff before. But wait! There's more...

OO interface, vanilla

Using the function interface can be tedious... so we also provide an "HTML output stream" class. Messages to an instance of that class generally tell that stream to output some HTML. Here's the above example, rewritten using HTML streams:

    use HTML::Stream;
    $HTML = new HTML::Stream \*STDOUT;

    $HTML->tag(A, HREF=>$url);
    $HTML->ent('copy');
    $HTML->text(" 1996 by $myname!");
    $HTML->tag(_A);

As you've probably guessed:

    text()   Outputs some text, which will be HTML-escaped.

    tag()    Outputs an ordinary tag, like <A>, possibly with parameters.
             The parameters will all be HTML-escaped automatically.

    ent()    Outputs an HTML entity, like the &copy; or &lt;.
             You mostly don't need to use it; you can often just put the
             Latin-1 representation of the character in the text().
You might prefer to use "t()" and "e()" instead of "text()" and "ent()": they're absolutely identical, and easier to type:

    $HTML -> tag(A, HREF=>$url);
    $HTML -> e('copy');
    $HTML -> t(" 1996 by $myname!");
    $HTML -> tag(_A);

Now, it wouldn't be nice to give you those "text()" and "ent()" shortcuts without giving you one for "tag()", would it? Of course not...

OO interface, chocolate

The known HTML tags are even given their own tag-methods, compiled on demand. The above code could be written even more compactly as:

    $HTML -> A(HREF=>$url);
    $HTML -> e('copy');
    $HTML -> t(" 1996 by $myname!");
    $HTML -> _A;

As you've probably guessed:

    A(HREF=>$url)   ==   tag(A, HREF=>$url)   ==   <A HREF="/the/url">
    _A              ==   tag(_A)              ==   </A>

All of the autoloaded "tag-methods" use the tagname in all-uppercase. A "_" prefix on any tag-method means that an end-tag is desired. The "_" was chosen for several reasons: (1) it's short and easy to type, (2) it doesn't produce much visual clutter to look at, (3) "_TAG" looks a little like "/TAG" because of the straight line.
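For the curious, here is a minimal sketch of how tag-methods "compiled on demand" can be built with Perl's AUTOLOAD. It is an illustration only, not the module's actual source: the package name Sketch::TagStream, the tiny %Known table, and the string buffer in $self->{out} are all assumptions made up for this example.

```perl
use strict;
use warnings;

package Sketch::TagStream;    # hypothetical name, NOT part of HTML::Stream

our $AUTOLOAD;
my %Known = map { $_ => 1 } qw(A B I IMG P);   # illustrative subset of "known" tags

sub new { my ($class) = @_; return bless { out => '' }, $class; }

# Escape the four characters that are unsafe in HTML text and attributes.
sub escape {
    my ($text) = @_;
    my %ent = ('<' => '&lt;', '>' => '&gt;', '&' => '&amp;', '"' => '&quot;');
    $text =~ s/([<>&"])/$ent{$1}/g;
    return $text;
}

# tag('A', HREF => $url) appends <A HREF="...">; tag('_A') appends </A>.
# An undefined value yields a bare parameter, as described for html_tag().
sub tag {
    my ($self, $name, @attrs) = @_;
    my $end  = ($name =~ s/^_//) ? '/' : '';
    my $html = "<$end$name";
    while (@attrs) {
        my ($k, $v) = splice(@attrs, 0, 2);
        $html .= defined $v ? qq{ $k="} . escape($v) . qq{"} : " $k";
    }
    $self->{out} .= "$html>";
    return $self;             # returning the object is what allows chaining
}

sub DESTROY { }               # keep object destruction away from AUTOLOAD

# First call to an unknown method lands here. If the name is a known tag
# (optionally with a "_" prefix), install a real method and jump to it.
sub AUTOLOAD {
    my $method = $AUTOLOAD;
    $method =~ s/.*:://;
    (my $tag = $method) =~ s/^_//;
    die "No such tag method: $method\n" unless $Known{$tag};
    no strict 'refs';
    *{$AUTOLOAD} = sub { my $self = shift; return $self->tag($method, @_) };
    goto &$AUTOLOAD;
}

package main;

my $s = Sketch::TagStream->new;
$s->A(HREF => '/x?a=1&b=2')->_A;
print $s->{out}, "\n";        # <A HREF="/x?a=1&amp;b=2"></A>
```

Because each method is installed in the symbol table the first time it is called, only the tags a program actually uses pay the cost of being compiled, and a misspelled tag name dies instead of silently producing junk.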
I should stress that this module will only auto-create tag methods for known HTML tags. So you're protected from typos like this (which will cause a fatal exception at run-time):

    $HTML -> IMGG(SRC=>$src);

(You're not yet protected from illegal tag parameters, but it's a start, ain't it?)

If you need to make a tag known (sorry, but this is currently a global operation, and not stream-specific), do this:

    accept_tag HTML::Stream 'MARQUEE';       # for you MSIE fans...

Note: there is no corresponding "reject_tag". I thought and thought about it, and could not convince myself that such a method would do anything more useful than cause other people's modules to suddenly stop working because some bozo function decided to reject the "FONT" tag.

OO interface, with whipped cream

In the grand tradition of C++, output method chaining is supported in both the Vanilla Interface and the Chocolate Interface. So you can (and probably should) write the above code as:

    $HTML -> A(HREF=>$url)
          -> e('copy') -> t(" 1996 by $myname!")
          -> _A;

But wait! Neapolitan ice cream has one more flavor...

OO interface, strawberry

I was jealous of the compact syntax of HTML::AsSubs, but I didn't want to worry about clogging the namespace with a lot of functions like p(), a(), etc. (especially when markup-functions like tr() conflict with existing Perl functions). So I came up with this:

    output $HTML [A, HREF=>$url], "Here's my $caption", [_A];

Conceptually, arrayrefs are sent to "html_tag()", and strings to "html_escape()".

ADVANCED TOPICS

Auto-formatting and inserting newlines

Auto-formatting is the name I give to the Chocolate Interface feature whereby newlines (and maybe, in the future, other things) are inserted before or after the tags you output in order to make your HTML more readable.
So, by default, this:

    $HTML -> HTML
          -> HEAD
          -> TITLE -> t("Hello!") -> _TITLE
          -> _HEAD
          -> BODY(BGCOLOR=>'#808080');

Actually produces this:

    <HTML>
    <HEAD>
    <TITLE>Hello!</TITLE>
    </HEAD>
    <BODY BGCOLOR="#808080">

To turn off autoformatting altogether on a given HTML::Stream object, use the "auto_format()" method:

    $HTML->auto_format(0);        # stop autoformatting!

To change whether a newline is automatically output before/after the begin/end form of a tag at a global level, use "set_tag()":

    HTML::Stream->set_tag('B', Newlines=>15);   # 15 means "\n<B>\n \n</B>\n"
    HTML::Stream->set_tag('I', Newlines=>7);    # 7 means  "\n<I>\n \n</I>  "

To change whether a newline is automatically output before/after the begin/end form of a tag for a single stream only, give the stream its own private "tag info" table, and then use "set_tag()":

    $HTML->private_tags;
    $HTML->set_tag('B', Newlines=>0);     # won't affect anyone else!

To output newlines explicitly, just use the special "nl" method in the Chocolate Interface:

    $HTML->nl;     # one newline
    $HTML->nl(6);  # six newlines

I am sometimes asked, "why don't you put more newlines in automatically?" Well, mostly because...
So I've stuck to outputting newlines in places where it's most likely to be harmless.

Entities

As shown above, you can use the "ent()" (or "e()") method to output an entity:

    $HTML->t('Copyright ')->e('copy')->t(' 1996 by Me!');

But this can be a pain, particularly for generating output with non-ASCII characters:

    $HTML -> t('Copyright ')
          -> e('copy')
          -> t(' 1996 by Fran') -> e('ccedil') -> t('ois, Inc.!');

Granted, Europeans can always type the 8-bit characters directly in their Perl code, and just have this:

    $HTML -> t("Copyright \251 1996 by Fran\347ois, Inc.!");

But folks without 8-bit text editors can find this kind of output cumbersome to generate. Sooooooooo...

Auto-escaping: changing the way text is escaped

Auto-escaping is the name I give to the act of taking an "unsafe" string (one with ">", "&", etc.), and magically outputting "safe" HTML.

The default "auto-escape" behavior of an HTML stream can be a drag if you've got a lot of character entities that you want to output, or if you're using the Latin-1 character set, or some other input encoding. Fortunately, you can use the "auto_escape()" method to change the way a particular HTML::Stream works at any time.

First, here's a couple of special invocations:

    $HTML->auto_escape('ALL');      # Default; escapes [<>"&] and 8-bit chars.
    $HTML->auto_escape('LATIN_1');  # Like ALL, but uses Latin-1 entities
                                    #   instead of decimal equivalents.
    $HTML->auto_escape('NON_ENT');  # Like ALL, but leaves "&" alone.
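To make the difference between these modes concrete, here is a rough sketch in plain Perl of what "ALL"-style and "NON_ENT"-style escaping amount to, going only by the one-line descriptions above. The function names sketch_escape_all() and sketch_escape_non_ent() are invented for this example and are not part of HTML::Stream; the module's real routines may differ in detail.

```perl
use strict;
use warnings;

my %Named = ('<' => 'lt', '>' => 'gt', '"' => 'quot', '&' => 'amp');

# "ALL"-style sketch: the four markup characters become named entities,
# and each 8-bit character (0x80-0xFF) becomes a decimal reference.
# The markup pass runs first so the "&" of "&#231;" is not re-escaped.
sub sketch_escape_all {
    my ($text) = @_;
    $text =~ s/([<>"&])/'&' . $Named{$1} . ';'/ge;
    $text =~ s/([\x80-\xFF])/'&#' . ord($1) . ';'/ge;
    return $text;
}

# "NON_ENT"-style sketch: same idea, but "&" is left alone, so entities
# you typed by hand (like &copy;) pass through untouched.
sub sketch_escape_non_ent {
    my ($text) = @_;
    $text =~ s/([<>"])/'&' . $Named{$1} . ';'/ge;
    $text =~ s/([\x80-\xFF])/'&#' . ord($1) . ';'/ge;
    return $text;
}

print sketch_escape_all("Fran\347ois & <Co>"), "\n";   # Fran&#231;ois &amp; &lt;Co&gt;
print sketch_escape_non_ent("&copy; 1996 <b>"), "\n";  # &copy; 1996 &lt;b&gt;
```

The trade-off is the usual one: leaving "&" alone lets hand-written entities through, but it also lets a stray "&" in ordinary text through unescaped.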
You can also install your own auto-escape function (note that you might very well want to install it for just a little bit only, and then de-install it):

    use HTML::Entities;                    # provides HTML::Entities::encode

    sub my_auto_escape {
        my $text = shift;
        HTML::Entities::encode($text);     # start with default
        $text =~ s/\(c\)/\&copy;/ig;       # (C) becomes copyright
        $text =~ s/\\,(c)/\&$1cedil;/ig;   # \,c becomes a cedilla
        $text;
    }

    # Start using my auto-escape:
    my $old_esc = $HTML->auto_escape(\&my_auto_escape);

    # Output some stuff:
    $HTML-> IMG(SRC=>'logo.gif', ALT=>'Fran\,cois, Inc');
    output $HTML 'Copyright (C) 1996 by Fran\,cois, Inc.!';

    # Stop using my auto-escape:
    $HTML->auto_escape($old_esc);

If you find yourself in a situation where you're doing this a lot, a better way is to create a subclass of HTML::Stream which installs your custom function when constructed. For an example, see the HTML::Stream::Latin1 subclass in this module.

Outputting HTML to things besides filehandles

As of Revision 1.21, you no longer need to supply "new()" with a filehandle: any object that responds to a print() method will do. Of course, this includes blessed FileHandles, and IO::Handles.

If you supply a GLOB reference (like "\*STDOUT") or a string (like "Module::FH"), HTML::Stream will automatically create an invisible object for talking to that filehandle (I don't dare bless it into a FileHandle, since the underlying descriptor would get closed when the HTML::Stream is destroyed, and you might not want that).

You say you want to print to a string? For kicks and giggles, try this:

    package StringHandle;
    sub new {
        my $self = '';
        bless \$self, shift;
    }
    sub print {
        my $self = shift;
        $$self .= join('', @_);
    }

    package main;
    use HTML::Stream;

    my $SH = new StringHandle;
    my $HTML = new HTML::Stream $SH;
    $HTML -> H1 -> t("Hello & <<welcome>>!") -> _H1;
    print "PRINTED STRING: ", $$SH, "\n";

Subclassing

This is where you can make your application-specific HTML-generating code much easier to look at.
Consider this:

    package MY::HTML;
    @ISA = qw(HTML::Stream);

    sub Aside {
        $_[0] -> FONT(SIZE=>-1) -> I;
    }
    sub _Aside {
        $_[0] -> _I -> _FONT;
    }

Now, you can do this:

    my $HTML = new MY::HTML \*STDOUT;

    $HTML -> Aside
          -> t("Don't drink the milk, it's spoiled... pass it on...")
          -> _Aside;

If you're defining these markup-like, chocolate-interface-style functions, I recommend using mixed case with a leading capital. You probably shouldn't use all-uppercase, since that's what this module uses for real HTML tags.

PUBLIC INTERFACE

Functions
Vanilla
Returns the previously-installed function, in the manner of "select()". No arguments just returns the currently-installed function.
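The "returns the previous value, like select()" convention is what makes a temporary change easy to undo. Here is a minimal sketch of that calling pattern using a stand-in class; Sketch::Stream and its text() method (which returns the escaped string rather than printing it) are assumptions for illustration, and unlike the real method this stand-in accepts only code references, not names such as 'ALL'.

```perl
use strict;
use warnings;

package Sketch::Stream;    # hypothetical stand-in, NOT HTML::Stream itself

# Start with an identity "escape" function so there is always one installed.
sub new { my ($class) = @_; return bless { esc => sub { $_[0] } }, $class; }

# With an argument: install it and return the previously-installed function.
# With no argument: just return the currently-installed function.
sub auto_escape {
    my $self = shift;
    my $old  = $self->{esc};
    $self->{esc} = shift if @_;
    return $old;
}

# Run text through whatever function is currently installed.
sub text { my ($self, $t) = @_; return $self->{esc}->($t); }

package main;

my $s   = Sketch::Stream->new;
my $old = $s->auto_escape(sub { uc $_[0] });   # install, remembering the old one
print $s->text("shout"), "\n";                 # SHOUT
$s->auto_escape($old);                         # put the old one back
print $s->text("quiet"), "\n";                 # quiet
```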
Strawberry
Chocolate
The TAGINFO, if given, is a set of key=>value pairs with the following possible keys:
Returns the self object on success.
SUBCLASSES

HTML::Stream::Latin1

A small, public package for outputting Latin-1 markup. Its default auto-escape function is "LATIN_1", which tries to output the mnemonic entity markup (e.g., "&ccedil;") for ISO-8859-1 characters.

So using HTML::Stream::Latin1 like this:

    use HTML::Stream;

    $HTML = new HTML::Stream::Latin1 \*STDOUT;
    output $HTML "\253A right angle is 90\260, \277No?\273\n";

Prints this:

    &laquo;A right angle is 90&deg;, &iquest;No?&raquo;

Instead of what HTML::Stream would print, which is this:

    &#171;A right angle is 90&#176;, &#191;No?&#187;

Warning: a lot of Latin-1 HTML markup is not recognized by older browsers (e.g., Netscape 2.0). Consider using HTML::Stream; it will output the decimal entities which currently seem to be more "portable".

Note: using this class "requires" that you have HTML::Entities.

PERFORMANCE

Slower than I'd like. Both the output() method and the various "tag" methods seem to run about 5 times slower than the old just-hardcode-the-darn stuff approach. That is, in general, this:

    ### Approach #1...
    tag  $HTML 'A', HREF=>"$href";
    tag  $HTML 'IMG', SRC=>"logo.gif", ALT=>"LOGO";
    text $HTML $caption;
    tag  $HTML '_A';
    text $HTML $a_lot_of_text;

And this:

    ### Approach #2...
    output $HTML [A, HREF=>"$href"],
                 [IMG, SRC=>"logo.gif", ALT=>"LOGO"],
                 $caption,
                 [_A];
    output $HTML $a_lot_of_text;

And this:

    ### Approach #3...
    $HTML -> A(HREF=>"$href")
          -> IMG(SRC=>"logo.gif", ALT=>"LOGO")
          -> t($caption)
          -> _A
          -> t($a_lot_of_text);

Each run about 5x slower than this:

    ### Approach #4...
    print '<A HREF="', html_escape($href), '>',
          '<IMG SRC="logo.gif" ALT="LOGO">',
          html_escape($caption),
          '</A>';
    print html_escape($a_lot_of_text);

Of course, I'd much rather use any of the first three (especially #3) if I had to get something done right in a hurry. Or did you not notice the typo in approach #4? ";-)"

(BTW, thanks to Benchmark:: for allowing me to... er... benchmark stuff.)

VERSION

$Id: Stream.pm,v 1.60 2008/08/06 dstaal Exp $

CHANGE LOG
COPYRIGHT

This program is free software. You may copy or redistribute it under the same terms as Perl itself.

ACKNOWLEDGEMENTS

Warmest thanks to...

    Eryq
        For writing the original version of this module.

    John Buckman
        For suggesting that I write an "html2perlstream", and inspiring me to look at supporting Latin-1.

    Tony Cebzanov
        For suggesting that I write an "html2perlstream"

    John D Groenveld
        Bug reports, patches, and suggestions

    B. K. Oxley (binkley)
        For suggesting the support of "writing to strings" which became the "printable" interface.

AUTHOR

Daniel T. Staal (DStaal@usa.net).

Enjoy. Yell if it breaks.