Text::Sentence - module for splitting text into sentences
use Text::Sentence qw( split_sentences );
use locale;
use POSIX qw( locale_h );
setlocale( LC_CTYPE, 'iso_8859_1' );
@sentences = split_sentences( $text );
The "Text::Sentence" module contains the
function split_sentences, which splits text into its constituent sentences
using a fairly approximate regex. If you set the locale before calling it,
it will take locale-dependent capitalization into account when identifying
sentence boundaries. Certain well-known exceptions, such as abbreviations, may
cause incorrect segmentation.
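
To illustrate why abbreviations are troublesome for a regex-based splitter, here is a minimal sketch. It is not the regex used by Text::Sentence; the helper naive_split and its pattern are hypothetical, written only for this example. It assumes a sentence ends at ".", "!" or "?" followed by whitespace and an uppercase letter.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT the module's actual implementation.
# Splits on whitespace that follows '.', '!' or '?' and precedes
# an uppercase letter. Under "use locale", [[:upper:]] would be
# expected to follow the current LC_CTYPE setting.
sub naive_split {
    my ($text) = @_;
    return split /(?<=[.!?])\s+(?=[[:upper:]])/, $text;
}

# Plain sentences are separated at each terminator.
my @ok = naive_split('It rained. We stayed in. Was that wise?');
print scalar(@ok), " sentences\n";

# "Dr." looks like a sentence end because a capital letter follows it,
# so this text is cut into three pieces instead of two.
my @bad = naive_split('I met Dr. Smith today. He was late.');
print "[$_]\n" for @bad;
```

A capitalization-based rule of this kind cannot tell "Dr. Smith" apart from a real sentence boundary, which is the sort of incorrect segmentation described above.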
The split_sentences function takes a scalar containing ASCII text as an argument
and returns a list of the sentences that the text has been split into.
@sentences = split_sentences( $text );
<https://github.com/neilb/HTML-Summary>
Ave Wrigley <wrigley@cre.canon.co.uk>
Copyright (c) 1997 Canon Research Centre Europe (CRE). All rights reserved.
This is free software; you can redistribute it and/or modify it
under the same terms as the Perl 5 programming language system itself.