Email::Folder::Mbox - reads raw RFC822 mails from an mbox file
This is an Email::Folder::Reader subclass; see that module for the API.
Does exactly what it says on the tin - fetches raw RFC822 mails from an mbox.
The mbox format is described at
http://www.qmail.org/man/man5/mbox.html
We attempt to read an mbox as though it is the mboxcl2 variant,
falling back to regular mbox mode if there is no
"Content-Length" header to be found.
The "new" constructor accepts the following extra options.
- "fh"
- When the filename is set to "FH",
Email::Folder::Mbox reads the mbox archive from the filehandle
"fh" instead of from the disk file
"filename".
- "eol"
- This indicates the line-ending style to expect. The default is
"\n", but to handle files with classic Mac OS
line endings (a bare carriage return) you would specify "eol =>
"\x0d"".
- "jwz_From_"
- The value is taken as a boolean that governs which pattern is used to match a
message separator.
If false, we use the mutt-style patterns
/^From \S+\s+(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun)/
/^From (?:Mon|Tue|Wed|Thu|Fri|Sat|Sun)/
If true, we use
/^From /
in deference to this extract from
<http://www.jwz.org/doc/content-length.html>:
Essentially the only safe way to parse that file format is to
consider all lines which begin with the characters ``From ''
(From-space), which are preceded by a blank line or
beginning-of-file, to be the division between messages. That is, the
delimiter is "\n\nFrom .*\n" except for the very first message in the
file, where it is "^From .*\n".
Some people will tell you that you should do stricter parsing on
those lines: check for user names and dates and so on. They are
wrong. The random crap that has traditionally been dumped into that
line is without bound; comparing the first five characters is the
only safe and portable thing to do. Usually, but not always, the next
token on the line after ``From '' will be a user-id, or email
address, or UUCP path, and usually the next thing on the line will be
a date specification, in some format, and usually there's nothing
after that. But you can't rely on any of this.
Defaults to false.
- "unescape"
- This boolean value indicates whether lines that start with
/^>+From /
should be unescaped (that is, have the leading '>' character removed). This is
needed for the mboxrd and mboxcl variants. Because there is no way to detect
which mbox variant is in use, the default value is false.
- "seek_to"
- Seek to an offset when opening the mbox. When used in combination with
->tell, you may be able to resume reading from where you left off, although
this is not guaranteed to work.
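To make the "jwz_From_" and "unescape" options more concrete, here is a small standalone sketch of the matching and unescaping they describe. The helper names "is_separator" and "unescape_from" are hypothetical and are not part of Email::Folder::Mbox. The sketch also assumes that, in mutt style, a line matching either of the two patterns counts as a separator.

```perl
use strict;
use warnings;

# Separator patterns as quoted in the "jwz_From_" description.
my @MUTT_STYLE = (
    qr/^From \S+\s+(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun)/,
    qr/^From (?:Mon|Tue|Wed|Thu|Fri|Sat|Sun)/,
);
my $JWZ_STYLE = qr/^From /;

# Hypothetical helper: true if $line would count as a message separator.
# Assumption: in mutt style, matching either pattern is enough.
sub is_separator {
    my ($line, $jwz_from) = @_;
    return $line =~ $JWZ_STYLE ? 1 : 0 if $jwz_from;
    my $hits = grep { $line =~ $_ } @MUTT_STYLE;
    return $hits ? 1 : 0;
}

# Hypothetical helper: undo From_ escaping by dropping one leading '>'.
sub unescape_from {
    my ($line) = @_;
    $line =~ s/^>(>*From )/$1/;
    return $line;
}

# A body line that only the looser jwz-style pattern treats as a separator.
my $odd = "From the desk of Bob\n";
printf "mutt style: %d, jwz style: %d\n",
    is_separator($odd, 0), is_separator($odd, 1);
print unescape_from(">From here on, things change\n");
```

The stricter mutt-style patterns avoid splitting a message at an unescaped body line such as the one above, at the cost of missing separators whose From_ line does not contain a day name in the expected position.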
The reader object provides the following methods.
- "next_message"
- Returns the next message as a string.
- "next_messageref"
- Returns the next message as a reference to a string.
- "tell"
- This returns the current filehandle position in the mbox.
- "next_from"
- This returns the From_ line for the next message. Call it before
->next_message.
- "messageid"
- This returns the Message-ID of the last message read. Call it after
->next_message.
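A minimal usage sketch of these methods follows. It assumes Email::Folder::Mbox is installed and reads from an in-memory filehandle through the "fh" option. The sample mbox data and addresses are made up for illustration, and the exact form of the values returned by "next_from" and "messageid" is not shown because it is not specified here.

```perl
use strict;
use warnings;
use Email::Folder::Mbox;

# A tiny two-message mbox held in memory (example data only).
my $mbox = <<'END_MBOX';
From alice@example.com Mon Jan  1 00:00:00 2001
Message-ID: <1@example.com>
Subject: first

Hello

From bob@example.com Tue Jan  2 00:00:00 2001
Message-ID: <2@example.com>
Subject: second

World
END_MBOX

open my $fh, '<', \$mbox or die "cannot open in-memory mbox: $!";

# Filename "FH" plus the "fh" option selects reading from the filehandle.
my $reader = Email::Folder::Mbox->new('FH', fh => $fh);

my @seen;
while (1) {
    my $from    = $reader->next_from;     # From_ line of the upcoming message
    my $message = $reader->next_message;  # raw message as a string
    last unless defined $message;

    my $id     = $reader->messageid;      # ID of the message just read
    my $offset = $reader->tell;           # filehandle position after reading

    push @seen, {
        from      => $from,
        message   => $message,
        messageid => $id,
        offset    => $offset,
    };
}

printf "read %d messages\n", scalar @seen;
```

The offset saved from "tell" is the kind of value one would later pass as "seek_to" when constructing a new reader, subject to the caveat given for that option.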
- Simon Wistow <simon@thegestalt.org>
- Richard Clamp <richardc@unixbeard.net>
- Pali <pali@cpan.org>
This software is copyright (c) 2006 by Simon Wistow.
This is free software; you can redistribute it and/or modify it
under the same terms as the Perl 5 programming language system itself.