|
NAMEMCE::Child - A threads-like parallelization module compatible with Perl 5.8 VERSIONThis document describes MCE::Child version 1.901 SYNOPSIS use MCE::Child;
MCE::Child->init(
max_workers => 'auto', # default undef, unlimited
# Specify a percentage. MCE::Child 1.876+.
max_workers => '25%', # 4 on HW with 16 lcores
max_workers => '50%', # 8 on HW with 16 lcores
child_timeout => 20, # default undef, no timeout
posix_exit => 1, # default undef, CORE::exit
void_context => 1, # default undef
on_start => sub {
my ( $pid, $ident ) = @_;
...
},
on_finish => sub {
my ( $pid, $exit, $ident, $signal, $error, @ret ) = @_;
...
}
);
MCE::Child->create( sub { print "Hello from child\n" } )->join();
sub parallel {
my ($arg1) = @_;
print "Hello again, $arg1\n" if defined($arg1);
print "Hello again, $_\n"; # same thing
}
MCE::Child->create( \¶llel, $_ ) for 1 .. 3;
my @procs = MCE::Child->list();
my @pids = MCE::Child->list_pids();
my @running = MCE::Child->list_running();
my @joinable = MCE::Child->list_joinable();
my @count = MCE::Child->pending();
# Joining is orderly, e.g. child1 is joined first, child2, child3.
$_->join() for @procs; # (or)
$_->join() for @joinable;
# Joining occurs immediately as child processes complete execution.
1 while MCE::Child->wait_one();
my $child = mce_child { foreach (@files) { ... } };
$child->join();
if ( my $err = $child->error() ) {
warn "Child error: $err\n";
}
# Get a child's object
$child = MCE::Child->self();
# Get a child's ID
$pid = MCE::Child->pid(); # $$
$pid = $child->pid();
$pid = MCE::Child->tid(); # tid is an alias for pid
$pid = $child->tid();
# Test child objects
if ( $child1 == $child2 ) {
...
}
# Give other workers a chance to run
MCE::Child->yield();
MCE::Child->yield(0.05);
# Return context, wantarray aware
my ($value1, $value2) = $child->join();
my $value = $child->join();
# Check child's state
if ( $child->is_running() ) {
sleep 1;
}
if ( $child->is_joinable() ) {
$child->join();
}
# Send a signal to a child
$child->kill('SIGUSR1');
# Exit a child
MCE::Child->exit(0);
MCE::Child->exit(0, @ret);
DESCRIPTIONMCE::Child is a fork of MCE::Hobo for compatibility with Perl 5.8. A child is a migratory worker inside the machine that carries the asynchronous gene. Child processes are equipped with "threads"-like capability for running code asynchronously. Unlike threads, each child is a unique process to the underlying OS. The IPC is handled via "MCE::Channel", which runs on all the major platforms including Cygwin and Strawberry Perl. "MCE::Child" may be used as a standalone or together with "MCE" including running alongside "threads". use MCE::Child;
use MCE::Shared;
# synopsis: head -20 file.txt | perl script.pl
my $ifh = MCE::Shared->handle( "<", \*STDIN ); # shared
my $ofh = MCE::Shared->handle( ">", \*STDOUT );
my $ary = MCE::Shared->array();
sub parallel_task {
my ( $id ) = @_;
while ( <$ifh> ) {
printf {$ofh} "[ %4d ] %s", $., $_;
# $ary->[ $. - 1 ] = "[ ID $id ] read line $.\n" ); # dereferencing
$ary->set( $. - 1, "[ ID $id ] read line $.\n" ); # faster via OO
}
}
my $child1 = MCE::Child->new( "parallel_task", 1 );
my $child2 = MCE::Child->new( \¶llel_task, 2 );
my $child3 = MCE::Child->new( sub { parallel_task(3) } );
$_->join for MCE::Child->list(); # ditto: MCE::Child->wait_all();
# search array (total one round-trip via IPC)
my @vals = $ary->vals( "val =~ / ID 2 /" );
print {*STDERR} join("", @vals);
API DOCUMENTATION
THREADS-like DETACH CAPABILITYThreads-like detach capability was added starting with the 1.867 release. A threads example is shown first followed by the MCE::Child example. All one needs to do is set the CHLD signal handler to IGNORE. Unfortunately, this works on UNIX platforms only. The child process restores the CHLD handler to default, so is able to deeply spin workers and reap if desired. use threads;
for ( 1 .. 8 ) {
async {
# do something
}->detach;
}
use MCE::Child;
# Have the OS reap workers automatically when exiting.
# The on_finish option is ignored if specified (no-op).
# Ensure not inside a thread on UNIX platforms.
$SIG{CHLD} = 'IGNORE';
for ( 1 .. 8 ) {
mce_child {
# do something
};
}
# Optionally, wait for any remaining workers before leaving.
# This is necessary if workers are consuming shared objects,
# constructed via MCE::Shared.
MCE::Child->wait_all;
The following is another way and works on Windows. Here, the on_finish handler works as usual. use MCE::Child;
MCE::Child->init(
on_finish = sub {
...
},
);
for ( 1 .. 8 ) {
$_->join for MCE::Child->list_joinable;
mce_child {
# do something
};
}
MCE::Child->wait_all;
PARALLEL::FORKMANAGER-like DEMONSTRATIONMCE::Child behaves similarly to threads for the most part. It also provides Parallel::ForkManager-like capabilities. The "Parallel::ForkManager" example is shown first followed by a version using "MCE::Child".
PARALLEL HTTP GET DEMONSTRATION USING ANYEVENTThis demonstration constructs two queues, two handles, starts the shared-manager process if needed, and spawns four workers. For this demonstration, am chunking 64 URLs per job. In reality, one may run with 200 workers and chunk 300 URLs on a 24-way box. # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# perl demo.pl -- all output
# perl demo.pl >/dev/null -- mngr/child output
# perl demo.pl 2>/dev/null -- show results only
#
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
use strict;
use warnings;
use AnyEvent;
use AnyEvent::HTTP;
use Time::HiRes qw( time );
use MCE::Child;
use MCE::Shared;
# Construct two queues, input and return.
my $que = MCE::Shared->queue();
my $ret = MCE::Shared->queue();
# Construct shared handles for serializing output from many workers
# writing simultaneously. This prevents garbled output.
mce_open my $OUT, ">>", \*STDOUT or die "open error: $!";
mce_open my $ERR, ">>", \*STDERR or die "open error: $!";
# Spawn workers early for minimum memory consumption.
MCE::Child->create({ posix_exit => 1 }, 'task', $_) for 1 .. 4;
# Obtain or generate input data for workers to process.
my ( $count, @urls ) = ( 0 );
push @urls, map { "http://127.0.0.$_/" } 1..254;
push @urls, map { "http://192.168.0.$_/" } 1..254; # 508 URLs total
while ( @urls ) {
my @chunk = splice(@urls, 0, 64);
$que->enqueue( { ID => ++$count, INPUT => \@chunk } );
}
# So that workers leave the loop after consuming the queue.
$que->end();
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Loop for the manager process. The manager may do other work if
# need be and periodically check $ret->pending() not shown here.
#
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
my $start = time;
printf {$ERR} "Mngr - entering loop\n";
while ( $count ) {
my ( $result, $failed ) = $ret->dequeue( 2 );
# Remove ID from result, so not treated as a URL item.
printf {$ERR} "Mngr - received job %s\n", delete $result->{ID};
# Display the URL and the size captured.
foreach my $url ( keys %{ $result } ) {
printf {$OUT} "%s: %d\n", $url, length($result->{$url})
if $result->{$url}; # url has content
}
# Display URLs could not reach.
if ( @{ $failed } ) {
foreach my $url ( @{ $failed } ) {
print {$OUT} "Failed: $url\n";
}
}
# Decrement the count.
$count--;
}
MCE::Child->wait_all();
printf {$ERR} "Mngr - exiting loop\n\n";
printf {$ERR} "Duration: %0.3f seconds\n\n", time - $start;
exit;
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Child processes enqueue two items ( $result and $failed ) per each
# job for the manager process. Likewise, the manager process dequeues
# two items above. Optionally, child processes may include the ID in
# the result.
#
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sub task {
my ( $id ) = @_;
printf {$ERR} "Child $id entering loop\n";
while ( my $job = $que->dequeue() ) {
my ( $result, $failed ) = ( { ID => $job->{ID} }, [ ] );
# Walk URLs, provide a hash and array refs for data.
printf {$ERR} "Child $id running job $job->{ID}\n";
walk( $job, $result, $failed );
# Send results to the manager process.
$ret->enqueue( $result, $failed );
}
printf {$ERR} "Child $id exiting loop\n";
}
sub walk {
my ( $job, $result, $failed ) = @_;
# Yielding is critical when running an event loop in parallel.
# Not doing so means that the app may reach contention points
# with the firewall and likely impose unnecessary hardship at
# the OS level. The idea here is not to have multiple workers
# initiate HTTP requests to a batch of URLs at the same time.
# Yielding behaves similarly like scatter to have the child
# process run solo for a fraction of time.
MCE::Child->yield( 0.03 );
my $cv = AnyEvent->condvar();
# Populate the hash ref for the URLs it could reach.
# Do not mix AnyEvent timeout with child timeout.
# Therefore, choose event timeout when available.
foreach my $url ( @{ $job->{INPUT} } ) {
$cv->begin();
http_get $url, timeout => 2, sub {
my ( $data, $headers ) = @_;
$result->{$url} = $data;
$cv->end();
};
}
$cv->recv();
# Populate the array ref for URLs it could not reach.
foreach my $url ( @{ $job->{INPUT} } ) {
push @{ $failed }, $url unless (exists $result->{ $url });
}
return;
}
__END__
$ perl demo.pl
Child 1 entering loop
Child 2 entering loop
Child 3 entering loop
Mngr - entering loop
Child 2 running job 2
Child 3 running job 3
Child 1 running job 1
Child 4 entering loop
Child 4 running job 4
Child 2 running job 5
Mngr - received job 2
Child 3 running job 6
Mngr - received job 3
Child 1 running job 7
Mngr - received job 1
Child 4 running job 8
Mngr - received job 4
http://192.168.0.1/: 3729
Child 2 exiting loop
Mngr - received job 5
Child 3 exiting loop
Mngr - received job 6
Child 1 exiting loop
Mngr - received job 7
Child 4 exiting loop
Mngr - received job 8
Mngr - exiting loop
Duration: 4.131 seconds
CROSS-PLATFORM TEMPLATE FOR BINARY EXECUTABLEMaking an executable is possible with the PAR::Packer module. On the Windows platform, threads, threads::shared, and exiting via threads are necessary for the binary to exit successfully. # https://metacpan.org/pod/PAR::Packer
# https://metacpan.org/pod/pp
#
# pp -o demo.exe demo.pl
# ./demo.exe
use strict;
use warnings;
use if $^O eq "MSWin32", "threads";
use if $^O eq "MSWin32", "threads::shared";
# Include minimum dependencies for MCE::Child.
# Add other modules required by your application here.
use Storable ();
use Time::HiRes ();
# use IO::FDPass (); # optional: for condvar, handle, queue
# use Sereal (); # optional: for faster serialization
use MCE::Child;
use MCE::Shared;
# For PAR to work on the Windows platform, one must include manually
# any shared modules used by the application.
# use MCE::Shared::Array; # if using MCE::Shared->array
# use MCE::Shared::Cache; # if using MCE::Shared->cache
# use MCE::Shared::Condvar; # if using MCE::Shared->condvar
# use MCE::Shared::Handle; # if using MCE::Shared->handle, mce_open
# use MCE::Shared::Hash; # if using MCE::Shared->hash
# use MCE::Shared::Minidb; # if using MCE::Shared->minidb
# use MCE::Shared::Ordhash; # if using MCE::Shared->ordhash
# use MCE::Shared::Queue; # if using MCE::Shared->queue
# use MCE::Shared::Scalar; # if using MCE::Shared->scalar
# Et cetera. Only load modules needed for your application.
use MCE::Shared::Sequence; # if using MCE::Shared->sequence
my $seq = MCE::Shared->sequence( 1, 9 );
sub task {
my ( $id ) = @_;
while ( defined ( my $num = $seq->next() ) ) {
print "$id: $num\n";
sleep 1;
}
}
sub main {
MCE::Child->new( \&task, $_ ) for 1 .. 3;
MCE::Child->wait_all();
}
# Main must run inside a thread on the Windows platform or workers
# will fail duing exiting, causing the exe to crash. The reason is
# that PAR or a dependency isn't multi-process safe.
( $^O eq "MSWin32" ) ? threads->create(\&main)->join() : main();
threads->exit(0) if $INC{"threads.pm"};
LIMITATIONMCE::Child emits an error when "is_joinable", "is_running", and "join" isn't called by the managed process, where the child was spawned. This is a limitation in MCE::Child only due to not involving a shared-manager process for IPC. This use-case is not typical. CREDITSThe inspiration for "MCE::Child" comes from wanting "threads"-like behavior for processes compatible with Perl 5.8. Both can run side-by-side including safe-use by MCE workers. Likewise, the documentation resembles "threads". The inspiration for "wait_all" and "wait_one" comes from the "Parallel::WorkUnit" module. SEE ALSO
INDEXMCE, MCE::Channel, MCE::Shared AUTHORMario E. Roy, <marioeroy AT gmail DOT com>
|