FileHandle::Unget - FileHandle which supports multi-byte unget
use FileHandle::Unget;
# open file handle
my $fh = FileHandle::Unget->new("file")
  or die "cannot open filehandle: $!";
# read the first 100 bytes, then print the rest of the file
my $buffer;
read($fh, $buffer, 100);
print $buffer;
print <$fh>;
$fh->close;
FileHandle::Unget operates exactly the same as FileHandle, except that it
provides a version of ungetc that allows you to unget more than one character.
It also provides ungets to unget a string.
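As a minimal sketch of the typical "peek, then push back" pattern, the following reads the first line of a file, inspects it, and returns it to the stream with ungets so that the next read starts from the beginning again. The temporary file, its contents, and the variable names are assumptions made for illustration only.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use FileHandle::Unget;

# Create a small sample file so the sketch is self-contained.
# (The contents are arbitrary illustration data.)
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "From alice\nHello\n";
close $tmp;

my $fh = FileHandle::Unget->new($path)
  or die "cannot open filehandle: $!";

# Consume the first line to decide how the stream should be handled.
my $first_line = <$fh>;
my $starts_with_from = ($first_line =~ /^From /) ? 1 : 0;

# Push the whole line back; the buffer is stored verbatim, newline included.
$fh->ungets($first_line);

# The next read should see the same line again.
my $reread = <$fh>;
$fh->close;

print "starts with From: $starts_with_from\n";
print "reread matches: ", ($reread eq $first_line ? "yes" : "no"), "\n";
```

The same approach applies to handles that cannot seek, such as pipes, which is where pushing data back is most useful.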
This module is useful if the filehandle refers to a stream for
which you can't just "seek()" backwards.
Some operating systems support multi-byte
"ungetc()", but this is not guaranteed.
Use this module if you want a portable solution. In addition, on some
operating systems, eof() will not be reset if you ungetc after having
read to the end of the file.
NOTE: Mixing "sysread()" with
"ungetc()" and other buffered I/O functions
is still a bad idea.
The methods for this package are the same as those of the FileHandle package,
with the following exceptions.
- new ( ARGS )
- The constructor is exactly the same as that of FileHandle, except that you
can also call it with an existing IO::Handle object to "attach"
unget semantics to a pre-existing handle.
- $fh->ungetc ( ORD )
- Pushes a character with the given ordinal value back onto the given
handle's input stream. This method can be called more than once in a row
to put multiple values back on the stream. Memory usage is equal to the
total number of bytes pushed back.
- $fh->ungets ( BUF )
- Pushes a buffer back onto the given handle's input stream. This method can
be called more than once in a row to put multiple buffers of characters
back on the stream. Memory usage is equal to the total number of bytes
pushed back.
The buffer is not processed in any way--managing end-of-line
characters and whatnot is your responsibility.
- $fh->buffer ( [BUF] )
- Get or set the pushback buffer directly.
- $fh->input_record_separator ( [STRING] )
- Get or set the per-filehandle input record separator. If an argument is
specified, the input record separator for the filehandle is made
independent of the global $/. Until this method is called (and after
clear_input_record_separator is called) the global $/ is used.
Note that a return value of "undef" is ambiguous. It
can either mean that this method has never been called with an argument,
or it can mean that it was called with an argument of
"undef".
- $fh->clear_input_record_separator ()
- Clear the per-filehandle input record separator. This removes the
per-filehandle input record separator semantics, reverting the filehandle
to the normal global $/ semantics.
- tell ( $fh )
- "tell" returns the actual file position
minus the length of the unget buffer. If you read three bytes, then unget
three bytes, "tell" will report a file
position of 0.
Everything works as expected if you are careful to unget exactly
the bytes that you read. However, things get tricky if you unget
different bytes. First, the next bytes you read won't be the actual
bytes on the filehandle at the position indicated by
"tell". Second,
"tell" will return a negative number
if you unget more bytes than you read. (This can be problematic because
"tell" also returns -1 on error, so a negative result is ambiguous.)
- seek ( $fh, [POSITION], [WHENCE] )
- "seek" defaults to the standard seek if
possible, clearing the unget buffer if it succeeds. If the standard seek
fails, then "seek" will attempt to seek
within the unget buffer. Note that in this case, you will not be able to
seek backward--FileHandle::Unget will only save a buffer for the next
bytes to be read.
For example, let's say you read 10 bytes from a pipe, then
unget the 10 bytes. If you seek 5 bytes forward, you won't be able to
read the first five bytes. (Otherwise this module would have to keep
around a lot of probably useless data!)
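The position bookkeeping and the per-filehandle record separator described in the list above can be sketched as follows. This is an illustration under assumptions, not a definitive reference: the temporary file, its contents, and the variable names are made up for the example, and only methods documented above are used.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use FileHandle::Unget;

# Sample data; the contents are arbitrary illustration data.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "abc;def\nghi\n";
close $tmp;

my $fh = FileHandle::Unget->new($path)
  or die "cannot open filehandle: $!";

# tell() reports the real file position minus the length of the unget buffer.
my $chunk;
read($fh, $chunk, 3);
my $pos_after_read = tell($fh);
$fh->ungets($chunk);
my $pos_after_unget = tell($fh);  # read three, unget three: documented as 0

# Give this handle its own record separator, independent of the global $/.
$fh->input_record_separator(";");
my $record = <$fh>;               # reads up to and including the ";"

# Revert to the global $/ for subsequent reads.
$fh->clear_input_record_separator();
my $line = <$fh>;

$fh->close;

print "tell after read: $pos_after_read, after unget: $pos_after_unget\n";
print "record: $record\n";
print "line: $line";
```

Because the bytes pushed back here are exactly the bytes that were read, the reported position stays consistent with the underlying file.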
To test that this module is indeed a drop-in replacement for FileHandle, the
following modules were modified to use FileHandle::Unget, and tested using
"make test". They have all passed.
There is a bug in Perl on Windows that is exposed if you open a stream, then
check for eof, then call binmode. For example:
# First line
# Second line
open FH, "$^X -e \"open F, '$0';binmode STDOUT;print <F>\" |";
eof(FH);
binmode(FH);
print "First line:", scalar <FH>, "\n";
print "Second line:", scalar <FH>, "\n";
close FH;
One solution is to make sure that you only call binmode
immediately after opening the filehandle. I'm not aware of any workaround
for this bug that FileHandle::Unget could implement. However, the module
does detect this situation and prints a warning.
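A minimal sketch of the suggested ordering, using only core Perl: binmode is called immediately after open, before eof or any read touches the handle. The helper name open_binary_pipe and the child command are hypothetical and exist only for this illustration; the list form of pipe open is assumed to be available on the Perl in use.

```perl
use strict;
use warnings;

# Hypothetical helper: open a read pipe and set binmode before anything
# else (such as eof) is called on the handle.
sub open_binary_pipe {
  my @command = @_;
  open(my $fh, '-|', @command)
    or die "cannot open pipe: $!";
  binmode($fh);   # immediately after open, before any eof() or read
  return $fh;
}

# The child process prints two lines; $^X is the running perl binary.
my $fh = open_binary_pipe($^X, '-e', 'print "alpha\nbeta\n"');

my @lines;
push @lines, scalar <$fh> until eof($fh);
close $fh;

print "First line: $lines[0]";
print "Second line: $lines[1]";
```

Wrapping open and binmode in one helper makes it harder for other code to slip an eof check between the two calls.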
Contact david@coppit.org for bug reports and suggestions.
David Coppit <david@coppit.org>.
This code is distributed under the GNU General Public License (GPL) Version 2.
See the file LICENSE in the distribution for details.
See Mail::Mbox::MessageParser for an example of how to use this package.