Mail::MboxParser::Mail::Body - rudimentary mail-body object
use Mail::MboxParser;
[...]
# $msg is a Mail::MboxParser::Mail
my $body = $msg->body(0);
# or preferably
my $body = $msg->body($msg->find_body);
for my $line ($body->signature) { print $line, "\n" }
for my $url ($body->extract_urls(unique => 1)) {
print $url->{url}, "\n";
print $url->{context}, "\n";
}
This class represents the body of an email message. Since an email can have
multiple MIME parts, each with a body of its own, it is not always easy
to say which part actually holds the text of the message (if there is any at
all). Mail::MboxParser::Mail::find_body helps by suggesting a part.
- as_string ([strip_sig => 1])
- Returns the textual representation of the body as one string. The body is
decoded only if the mailbox was opened with the decode => 'BODY' or
decode => 'ALL' option.
If 'strip_sig' is set to a true value, the signature is
stripped from the string.
- as_lines ([strip_sig => 1])
- Same as as_string(), except that you get an array of lines with the
newline still attached to each line.
NOTE: You can still use this method when the body is actually encoded
binary data (most commonly such a body is base64-encoded). In that case you
won't get proper lines. Instead you get chunks of binary data that you
should concatenate, as in
my $binary = join "", $body->as_lines;
If 'strip_sig' is set to a true value, the signature is
removed from the returned lines.
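The concatenation advice can be checked without a mailbox. The sketch below uses a made-up byte string as a stand-in for a decoded attachment and splits it after every newline byte, which is an assumption about how such chunks come about, not a statement about the module's internals. It only shows that joining with an empty separator restores the data byte for byte.

```perl
use strict;
use warnings;

# Hypothetical binary payload: every byte value once (stand-in for an attachment).
my $original = join '', map { chr } 0 .. 255;

# Stand-in for $body->as_lines: cutting after each newline byte yields
# arbitrary chunks rather than meaningful lines.
my @chunks = split /(?<=\n)/, $original;

# Joining with an empty separator gives back the original bytes.
my $binary = join '', @chunks;

printf "%d chunks, %d bytes, %s\n",
    scalar @chunks, length $binary,
    $binary eq $original ? 'identical' : 'different';
```

When writing such data to disk, open the file handle in binmode so that no newline translation is applied.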
- signature
- Returns the signature of a message as an array of lines. Trailing newlines
are already removed.
$body->error returns a string if no
signature has been found.
- extract_urls
- extract_urls (unique => 1)
- Returns an array of hash-refs. Each hash-ref has two fields: 'url' and
'context' where context is the line in which the 'url' appeared.
When called as
$body->extract_urls(unique => 1),
duplicate URLs are filtered out regardless of their 'context'. That is
useful if you just want a list of all URLs that can be found in your
mails.
$body->error() will return a
string if no URLs could be found within the body.
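To illustrate what filtering "regardless of the context" means, here is a minimal sketch that works on hand-written sample data in the shape described above. It does not call the module, and keeping the first occurrence of each URL is an assumption made for the example, not documented behaviour.

```perl
use strict;
use warnings;

# Hypothetical sample data: one hash-ref per match, as extract_urls returns.
my @urls = (
    { url => 'http://example.com/a', context => 'See http://example.com/a first.' },
    { url => 'http://example.com/b', context => 'Then http://example.com/b.' },
    { url => 'http://example.com/a', context => 'Again: http://example.com/a' },
);

# Keep the first occurrence of each URL; 'context' plays no part in the comparison.
my %seen;
my @unique = grep { !$seen{ $_->{url} }++ } @urls;

print "$_->{url}\n" for @unique;
```

The third entry is dropped even though its context line differs from the first one.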
- quotes
- Returns a hash-ref of array-refs whose keys are the levels of
quotation. Each array element holds one paragraph of that quotation level
as a single string. Example:
my $quotes = $msg->body($msg->find_body)->quotes;
print $quotes->{1}->[0], "\n";
print $quotes->{0}->[0], "\n";
This should print the first paragraph of the mail body that
has been quoted once and, below it, the paragraph that is presumably the
reply to it. Perhaps thus:
> I had been trying to work with the CGI module
> but I didn't yet fully understand it.
Ah, it is tricky. Have you read the CGI-FAQ that
comes with the module?
Note that empty lines are not ignored; they are among the
elements of the array in
$quotes->{0}.
So below is a little code-snippet that should, in most cases,
restore the first 5 paragraphs (containing quote-level 0 and 1) of an
email:
for (0 .. 4) {
print $quotes->{0}->[$_];
print $quotes->{1}->[$_];
}
Since quotes() treats an empty line between two
quoted paragraphs as a paragraph in
$quotes->{0}, the paragraphs with one quote
and those with zero are balanced. That means:
scalar @{$quotes->{0}} - DIFF == scalar @{$quotes->{1}}
where DIFF is an element of {-1, 0, 1}.
Unfortunately, quotes() currently recognizes only
'>' as a quotation mark.
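As an illustration of what a quotation level is, the following sketch counts the leading '>' markers of a line. The helper quote_level is hypothetical and is not part of Mail::MboxParser; how quotes() determines levels internally is not documented here, so treat this as one plausible reading.

```perl
use strict;
use warnings;

# Hypothetical helper: count the leading '>' markers of a line,
# allowing optional whitespace before each one ("> > text" is level 2).
sub quote_level {
    my ($line) = @_;
    my ($prefix) = $line =~ /^((?:\s*>)*)/;   # always matches, possibly empty
    return $prefix =~ tr/>//;                 # number of '>' in the prefix
}

my @lines = (
    "> I had been trying to work with the CGI module",
    "> but I didn't yet fully understand it.",
    "Ah, it is tricky. Have you read the CGI-FAQ that",
    "comes with the module?",
);

printf "%d: %s\n", quote_level($_), $_ for @lines;
```

Lines prefixed with other markers, such as '|' or a name followed by '>', come out as level 0 in this sketch, which matches the limitation stated above.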
Tassilo von Parseval <tassilo.von.parseval@rwth-aachen.de>
Copyright (c) 2001-2005 Tassilo von Parseval. This program is free
software; you can redistribute it and/or modify it under the same terms as
Perl itself.