HTML::FormatText::Html2text(3) User Contributed Perl Documentation HTML::FormatText::Html2text(3)

HTML::FormatText::Html2text - format HTML as plain text using html2text

 use HTML::FormatText::Html2text;
 $text = HTML::FormatText::Html2text->format_file ($filename);
 $text = HTML::FormatText::Html2text->format_string ($html_string);

 use HTML::TreeBuilder;
 $formatter = HTML::FormatText::Html2text->new;
 $tree = HTML::TreeBuilder->new_from_file ($filename);
 $text = $formatter->format ($tree);

"HTML::FormatText::Html2text" turns HTML into plain text using the "html2text" program.

<http://www.mbayer.de/html2text/>

The module interface is compatible with formatters such as "HTML::FormatText", but all parsing and formatting is done by the external "html2text" program.

See "HTML::FormatExternal" for the formatting functions and options, with the following caveats:

"input_charset"
Currently this option has no effect. Input generally has to be Latin-1, although the Debian-extended "html2text" interprets a "<meta>" charset directive in the HTML header.

Various "&"-style named and numbered entities are recognised and produce suitable output. For maximum portability across "html2text" versions, the suggestion is to write non-ASCII characters in the input as entities.
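
As a rough illustration of the entity suggestion above, the following sketch replaces every non-ASCII character with a numeric character reference before handing the HTML to the formatter. The helper name entitize_non_ascii is hypothetical and not part of HTML::FormatExternal, and it assumes the input is already a decoded Perl character string rather than raw bytes. The formatter call is wrapped in eval because the module and the "html2text" program may not be installed.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTML::FormatExternal): replace each
# non-ASCII character with a numeric character reference such as &#233;.
# Assumes $html is a decoded character string, not raw bytes.
sub entitize_non_ascii {
    my ($html) = @_;
    $html =~ s/([^\x00-\x7F])/sprintf('&#%d;', ord($1))/ge;
    return $html;
}

my $html = "<p>caf\x{E9} \x{20AC} na\x{EF}ve</p>";
my $safe = entitize_non_ascii($html);
print "$safe\n";    # prints <p>caf&#233; &#8364; na&#239;ve</p>

# Pass the ASCII-only HTML to the formatter, if it is available.
my $text = eval {
    require HTML::FormatText::Html2text;
    HTML::FormatText::Html2text->format_string($safe);
};
print defined $text ? $text : "html2text formatter not available\n";
```

Because the result is pure ASCII, it does not depend on how a particular "html2text" build guesses the input charset. How each entity is rendered still depends on the "html2text" version in use.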

"output_charset"
If set to "ascii" or "ANSI_X3.4-1968" (both case-insensitive) the "html2text -ascii" option is used, when available ("html2text" 1.3.2 from Jan 2004).

If set to "UTF-8" then the Debian extension "-utf8" option is used (circa 2009).

Apart from this there's no control over the output charset.
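
A minimal sketch of requesting UTF-8 output follows. It assumes options are passed as key/value pairs after the input, as described in "HTML::FormatExternal", and that the installed "html2text" has the Debian "-utf8" extension. The sample HTML and the %opts variable are illustrative only.

```perl
use strict;
use warnings;

# Illustrative input using entities only, so it is safe for any html2text.
my $html = '<p>Price: &euro;10, caf&eacute;</p>';

# Options as key/value pairs; "ascii" could be given instead of "UTF-8".
my %opts = (output_charset => 'UTF-8');

# Wrapped in eval since the module or the html2text program may be missing.
my $text = eval {
    require HTML::FormatText::Html2text;
    HTML::FormatText::Html2text->format_string($html, %opts);
};
if (defined $text) {
    print $text;
} else {
    print "html2text formatter not available\n";
}
```

With an "html2text" that lacks the corresponding option, the output charset is whatever the program produces by default.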

HTML::FormatExternal, html2text(1)

<http://user42.tuxfamily.org/html-formatexternal/index.html>

Copyright 2008, 2009, 2010, 2013, 2015 Kevin Ryde

HTML-FormatExternal is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3, or (at your option) any later version.

HTML-FormatExternal is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with HTML-FormatExternal. If not, see <http://www.gnu.org/licenses/>.

2015-08-06 perl v5.32.1
