Iterator::File(3) User Contributed Perl Documentation Iterator::File(3)

Iterator::File -- A file iterator, optionally stateful and verbose.

 use Iterator::File;
 
 ## Simplest form...
 $i = iterator_file( 'mydata.txt' );
 while( $i++ ) {
   &something_interesting( $i );
 }
 
 
 ## Disable auto-chomp, emit status, and allow us to resume if ^C...
 $i = iterator_file( 'mydata.txt',
                     'chomp'  => 0,
                     'status' => 1,
                     'resume' => 1,
                    );
 while( $i++ ) {
   &something_interesting( $i );
 }
 
 
 ## OO style...
 $i = iterator_file( 'mydata.txt' );
 while( $i->next() ) {
   &something_interesting( $i->value() );
 }

"Iterator::File" is an attempt to take some of the repetition and tedium out of processing a flat file. Whenever I wrote such a script, I found myself adapting prior scripts so that processes could be resumed, emit status, etc. Hence an itch (and this module) was born.

iterator_file($file, %config)
Returns an "Iterator::File" object. See %config section below for additional information on options.

new(%config)
The constructor returns a new "Iterator::File" object, handling argument defaults & validation, and automatically invoking "initialize".
initialize()
Executes all startup work required before iteration. E.g., opening resources, detecting if a prior process terminated early & resuming, etc.
next(), '++'
Increment the iterator & return the new value.
value(), string context
Return the current value, without advancing.
advance_to( $location )
Advance the iterator to $location. If $location is behind the current location, behavior is undefined. (I.e., don't do that.)
finish()
Automatically invoked when the complete list has been processed. If the process dies before the last item of the list, this method is intentionally not invoked.

chomp
Automatically chomp each line. Default: enabled.
verbose
Enable verbose messaging for things such as temporary files. Default: disabled.

Note: for status messages, see "Status" below

debug
Enable debugging messages. Debugging can also be enabled by setting the environment variable ITERATOR_FILE_DEBUG to something true (to avoid modifying code to enable it). Default: disabled.

resume
If enabled, "Iterator::File" will keep track of which lines you've seen, even between invocations. That way, if your program unexpectedly dies (e.g., via a bug or ^C), you can pick up where you left off just by running your program again. Default: disabled.
repeat_on_resume
If enabled, "Iterator::File" will err on the side of giving you the same line twice between invocations. E.g., if your program were to be restarted after dying on the 100th line, "repeat_on_resume" would give you the 100th line on the 2nd invocation (versus the 101st). Default: disabled.
update_frequency
How often to update state. For very large data sets with light individual processing requirements, it may be worth setting to something other than 1. Default: 1.
state_class
Options: "Iterator::File::State::TempFile" and "Iterator::File::State::IPCShareable". TempFile is the default and in a lot of cases should be good enough. If you have philosophical objections to a frequently changing value living on disk (or a really, really slow disk), you can use shared memory via IPC::Shareable.
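
The interplay of "resume", "repeat_on_resume" and "update_frequency" can be sketched in plain Perl. This is only an illustration of the idea under stated assumptions, not the module's implementation: the helpers save_position, load_position and run are hypothetical names, the state is a single line number kept in a temporary file, and the crash is simulated with die.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-ins for the state bookkeeping; not Iterator::File's code.
sub save_position {
    my ($state_file, $line_no) = @_;
    open my $fh, '>', $state_file or die "Cannot write $state_file: $!";
    print {$fh} "$line_no\n";
    close $fh or die "Cannot close $state_file: $!";
    return;
}

sub load_position {
    my ($state_file) = @_;
    return 0 unless -s $state_file;          # no state: start from the top
    open my $fh, '<', $state_file or die "Cannot read $state_file: $!";
    chomp( my $line_no = <$fh> );
    close $fh;
    return $line_no;
}

# Push each line onto $arg{seen}, recording the position every
# $arg{update_frequency} lines. Dies at line $arg{die_at} to simulate ^C.
sub run {
    my (%arg) = @_;
    my $last  = load_position( $arg{state_file} );
    my $start = ( $arg{repeat_on_resume} && $last ) ? $last : $last + 1;
    for my $line_no ( $start .. scalar @{ $arg{lines} } ) {
        save_position( $arg{state_file}, $line_no )
            if $line_no % $arg{update_frequency} == 0;
        die "simulated crash\n"
            if defined $arg{die_at} && $line_no == $arg{die_at};
        push @{ $arg{seen} }, $arg{lines}[ $line_no - 1 ];
    }
    unlink $arg{state_file};                 # clean finish: nothing to resume
    return;
}

my @lines = map { "row $_" } 1 .. 10;
my ( $tmp_fh, $state_file ) = tempfile( UNLINK => 1 );
close $tmp_fh;

my ( @first, @second );
my %common = ( lines => \@lines, state_file => $state_file, update_frequency => 1 );

# First invocation dies while handling row 4; the saved position survives.
eval { run( %common, die_at => 4, seen => \@first ); 1 }
    or print STDERR "first run stopped: $@";

# Second invocation resumes; repeat_on_resume hands out row 4 again.
run( %common, repeat_on_resume => 1, seen => \@second );
```

In this sketch the second run starts at row 4 because repeat is requested; without it the run would start at row 5. Raising update_frequency means fewer writes to the state file, at the cost of a saved position that can lag behind the real one, so a resumed run may see a few lines again.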

status_method
What algorithm to use to display status. Options are "emit_status_logarithmic", "emit_status_fixed_line_interval", and "emit_status_fixed_time_interval".

"emit_status_logarithmic" displays status logarithmically. I.e., 1, 2, 3 ... 9, 10, 20, 30 ... 90, 100, 200, 300 ... 900, 1000, 2000, etc.

"emit_status_fixed_line_interval" displays status every X lines, where X is defined by "status_line_interval".

"emit_status_fixed_time_interval" displays status every X seconds, where X is defined by "status_time_interval".

Default: emit_status_logarithmic.

status_line_interval
If "status_method" is "emit_status_fixed_line_interval", controls how frequently to display status. Default: 10 (lines).
status_time_interval
If "status_method" is "emit_status_fixed_time_interval", controls how frequently to display status. Default: 2 (seconds).
status_filehandle
Filehandle to use for printing status. Default: STDERR.
status_line
Format of status line. Default: "Processing row '%d'...\n".
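
To make the logarithmic schedule (1, 2, 3 ... 9, 10, 20 ...) and the fixed line interval concrete, here is a minimal sketch of how such checkpoints could be computed. The helper names is_logarithmic_checkpoint and is_fixed_line_checkpoint are hypothetical and are not part of "Iterator::File"; only the status line format and the interval of 10 are taken from the defaults documented above.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration; not Iterator::File's own code.

# True when $n is on the schedule 1..9, 10, 20..90, 100, 200, ...
sub is_logarithmic_checkpoint {
    my ($n) = @_;
    return 0 if $n < 1;
    my $magnitude = 1;
    $magnitude *= 10 while $magnitude * 10 <= $n;   # largest power of 10 <= $n
    return $n % $magnitude == 0 ? 1 : 0;
}

# True every $interval lines (compare "status_line_interval", default 10).
sub is_fixed_line_checkpoint {
    my ($n, $interval) = @_;
    return $n % $interval == 0 ? 1 : 0;
}

my $status_line = "Processing row '%d'...\n";       # documented default format
my @logarithmic = grep { is_logarithmic_checkpoint($_) } 1 .. 250;
my @fixed       = grep { is_fixed_line_checkpoint( $_, 10 ) } 1 .. 250;

# Emit the logarithmic checkpoints the way a status line might appear.
printf STDERR $status_line, $_ for @logarithmic;
```

Over 250 rows the logarithmic schedule emits far fewer messages than a fixed interval of 10 lines, which is presumably why it makes a sensible default for files of unknown size.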

Do not call chop or chomp on the iterator!! Unfortunately, doing so destroys your object & leaves you with a plain ol' string. :(

Iterator::File

William Reardon, <wdr1@pobox.com>

Copyright (C) 2008 by William Reardon

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.8 or, at your option, any later version of Perl 5 you may have available.

2008-06-18 perl v5.32.1
