List::Gen(3)
User Contributed Perl Documentation
List::Gen - provides functions for generating lists
this module provides higher order functions, list comprehensions, generators,
iterators, and other utility functions for working with lists. walk lists with
any step size you want, create lazy ranges and arrays with a map like syntax
that generate values on demand. there are several other hopefully useful
functions, and all functions from List::Util are available.
use List::Gen;
print "@$_\n" for every 5 => 1 .. 15;
# 1 2 3 4 5
# 6 7 8 9 10
# 11 12 13 14 15
print mapn {"$_[0]: $_[1]\n"} 2 => %myhash;
my $ints = <0..>;
my $squares = gen {$_**2} $ints;
say "@$squares[2 .. 6]"; # 4 9 16 25 36
$ints->zip('.', -$squares)->say(6); # 0-0 1-1 2-4 3-9 4-16 5-25
list(1, 2, 3)->gen('**2')->say; # 1 4 9
my $fib = ([0, 1] + iterate {fib($_, $_ + 1)->sum})->rec('fib');
my $fac = iterate {$_ < 2 or $_ * self($_ - 1)}->rec;
say "@$fib[0 .. 15]"; # 0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610
say "@$fac[0 .. 10]"; # 1 1 2 6 24 120 720 5040 40320 362880 3628800
say <0, 1, * + * ...>->take(10)->str; # 0 1 1 2 3 5 8 13 21 34
say <[..*] 1, 1..>->str(8); # 1 1 2 6 24 120 720 5040
<**2 for 1..10 if even>->say; # 4 16 36 64 100
<1..>->map('**2')->grep(qr/1/)->say(5); # 1 16 81 100 121
use List::Gen; # is the same as
use List::Gen qw/mapn by every range gen cap \ filter cache apply zip
min max reduce glob iterate list/;
the following export tags are available:
:utility     mapn by every apply min max reduce mapab
             mapkey d deref slide curse remove
:source      range glob makegen list array vecgen repeat file
:modify      gen cache expand contract collect slice flip overlay
             test recursive sequence scan scan_stream == scanS
             cartesian transpose stream strict
:zip         zip zipgen tuples zipwith zipwithab unzip unzipn
             zipmax zipgenmax zipwithmax
:iterate     iterate
             iterate_multi == iterateM
             iterate_stream == iterateS
             iterate_multi_stream == iterateMS
:gather      gather
             gather_stream == gatherS
             gather_multi == gatherM
             gather_multi_stream == gatherMS
:mutable     mutable done done_if done_unless
:filter      filter
             filter_stream == filterS
             filter_ # non-lookahead version
:while       take_while == While
             take_until == Until
             while_ until_ # non-lookahead versions
             drop_while drop_until
:numeric     primes
:deprecated  genzip
:List::Util  first max maxstr min minstr reduce shuffle sum
use List::Gen '*'; # everything
use List::Gen 0; # everything
use List::Gen ':all'; # everything
use List::Gen ':base'; # same as 'use List::Gen;'
use List::Gen (); # no exports
- mapn "{CODE} NUM LIST"
- this function works like the builtin "map" but takes "NUM" sized
steps over the list, rather than one element at a time. inside the
"CODE" block, the current slice is in @_ and $_ is set to $_[0]. slice
elements are aliases to the original list. if "mapn" is called in
void context, the "CODE" block will be executed in void context for
efficiency.
print mapn {$_ % 2 ? "@_" : " [@_] "} 3 => 1..20;
# 1 2 3 [4 5 6] 7 8 9 [10 11 12] 13 14 15 [16 17 18] 19 20
print "student grades: \n";
mapn {
print shift, ": ", &sum / @_, "\n";
} 5 => qw {
bob 90 80 65 85
alice 75 95 70 100
eve 80 90 80 75
};
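as a rough illustration of the stepping behavior described above, here
is a minimal sketch in plain perl. "chunk_map" is a hypothetical helper
written for this example, not part of List::Gen, and unlike "mapn" it
works on a copy of the list rather than on aliases.

```perl
use strict;
use warnings;

# hypothetical helper, not part of List::Gen: calls $code once for each
# $n sized slice of @list, with the slice in @_ and $_ set to $_[0].
# the slices here are copies, not aliases.
sub chunk_map {
    my ($code, $n, @list) = @_;
    my @out;
    while (@list) {
        my @slice = splice @list, 0, $n;   # the last slice may be short
        local $_ = $slice[0];
        push @out, $code->(@slice);
    }
    return @out;
}

my @pairs = chunk_map(sub { "$_[0]=$_[1]" }, 2, qw(a 1 b 2 c 3));
print "@pairs\n";   # a=1 b=2 c=3
```

when the list length is not a multiple of the step, the final call in
this sketch simply receives a shorter slice.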
- by "NUM LIST"
- every "NUM LIST"
- "by" and "every" are exactly the same, and allow you to add variable
step size to any other list control structure with whichever reads
better to you.
for (every 2 => @_) {do something with pairs in @$_}
grep {do something with triples in @$_} by 3 => @list;
the functions generate an array of array references to "NUM" sized
slices of "LIST". the elements in each slice are aliases to the
original list.
in list context, returns a real array. in scalar context,
returns a generator.
my @slices = every 2 => 1 .. 10; # real array
my $slices = every 2 => 1 .. 10; # generator
for (every 2 => 1 .. 10) { ... } # real array
for (@{every 2 => 1 .. 10}) { ... } # generator
if you plan to use all the slices, the real array is probably
better. if you only need a few, the generator won't need to compute all
of the other slices.
print "@$_\n" for every 3 => 1..9;
# 1 2 3
# 4 5 6
# 7 8 9
my @a = 1 .. 10;
for (every 2 => @a) {
@$_[0, 1] = @$_[1, 0] # flip each pair
}
print "@a";
# 2 1 4 3 6 5 8 7 10 9
print "@$_\n" for grep {$$_[0] % 2} by 3 => 1 .. 9;
# 1 2 3
# 7 8 9
- apply "{CODE} LIST"
- apply a function that modifies $_ to a shallow copy of "LIST" and
returns the copy
print join ", " => apply {s/$/ one/} "this", "and that";
> this one, and that one
- zip "LIST"
- "zip" takes a list of array references and generators. it interleaves
the elements of the passed in sequences to create a new list. "zip"
continues until the end of the shortest sequence. "LIST" can be any
combination of array references and generators.
%hash = zip [qw/a b c/], [1..3]; # same as
%hash = (a => 1, b => 2, c => 3);
in scalar context, "zip" returns a generator, produced by "zipgen".
if the first argument to "zip" is not an array or generator, it is
assumed to be code or a code like string. that code will be used to
join the elements from the remaining arguments.
my $gen = zip sub {$_[0] . $_[1]}, [1..5], <a..>;
# or = zip '.' => [1..5], <a..>;
# or = zipwith {$_[0] . $_[1]} [1..5], <a..>;
$gen->str; # '1a 2b 3c 4d 5e'
- zipmax "LIST"
- interleaves the passed in lists to create a new list. "zipmax"
continues until the end of the longest list. "undef" is returned for
missing elements of shorter lists. "LIST" can be any combination of
array references and generators.
%hash = zipmax [qw/a b c d/], [1..3]; # same as
%hash = (a => 1, b => 2, c => 3, d => undef);
in scalar context, "zipmax" returns a generator, produced by
"zipgenmax".
"zipmax" provides the same functionality as "zip" did in versions
before 0.90.
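the difference between stopping at the shortest and at the longest
list can be sketched in plain perl. the helpers "interleave",
"zip_shortest" and "zip_longest" are hypothetical names made up for
this example. they only accept array references and are eager, so
they show the interleaving rule rather than how List::Gen implements
it.

```perl
use strict;
use warnings;
use List::Util qw(min max);

# hypothetical helpers, not List::Gen's implementation. both take array
# references and return a flat, interleaved list.
sub interleave {
    my ($count, @lists) = @_;
    return map { my $i = $_; map { $_->[$i] } @lists } 0 .. $count - 1;
}
sub zip_shortest { interleave(min(map { scalar @$_ } @_), @_) }
sub zip_longest  { interleave(max(map { scalar @$_ } @_), @_) }

my @short = zip_shortest([qw(a b c d)], [1 .. 3]);   # stops after c => 3
my @long  = zip_longest([qw(a b c d)], [1 .. 3]);    # ends with d => undef

print scalar(@short), " ", scalar(@long), "\n";      # 6 8
```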
- tuples "LIST"
- interleaves the passed in lists to create a new list of arrays.
"tuples" continues until the end of the shortest list. "LIST" can be
any combination of array references and generators.
@list = tuples [qw/a b c/], [1..3]; # same as
@list = ([a => 1], [b => 2], [c => 3]);
in scalar context, "tuples" returns a generator:
tuples(...) ~~ zipwith {\@_} ...
- cap "LIST"
- "cap" captures a list. it is exactly the same as "sub{\@_}->(LIST)".
note that this method of constructing an array ref from a list is
roughly 40% faster than "[ LIST ]", but with the caveat and feature
that elements are aliases to the original list.
- "&\(LIST)"
- a synonym for "cap", the symbols "&\(...)" will perform the same
action. it could be read as taking the subroutine style reference of a
list. like all symbol variables, once imported, "&\" is global across
all packages.
my $capture = & \(my $x, my $y); # a space between & and \ is fine
# and it looks a bit more syntactic
($x, $y) = (1, 2);
say "@$capture"; # 1 2
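the aliasing that "cap" relies on can be seen with the equivalent
"sub{\@_}->(LIST)" form quoted above, which needs only core perl. the
variable names below are made up for the example.

```perl
use strict;
use warnings;

my @scores = (1, 2, 3);

my $copy    = [@scores];               # new scalars, detached from @scores
my $capture = sub { \@_ }->(@scores);  # elements alias those of @scores

$_ *= 10 for @$copy;       # @scores is unchanged
$$capture[0] = 'first';    # writes through to $scores[0]

print "@scores\n";         # first 2 3
```

this is why the aliasing is described as both a caveat and a feature:
a captured list can be used to modify its source, whether or not that
was intended.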
in this document, a generator is an object similar to an
array that generates its elements on demand. generators can be used as
iterators in perl's list control structures such as "for/foreach" and
"while".
generators, like programmers, are lazy. unless they have to, they will not
calculate or store anything. this laziness allows infinite generators to be
created. you can choose to explicitly cache a generator, and several
generators have implicit caches for efficiency.
there are source generators, which can be numeric ranges, arrays,
or iterative subroutines. these can then be modified by wrapping each
element with a subroutine, filtering elements, or combining generators with
other generators. all of this behavior is lazy, only resolving generator
elements at the latest possible time.
all generator functions return a blessed and overloaded reference
to a tied array. this may sound a bit magical, but it just means that you
can access the generator in a variety of ways, all of which remain lazy.
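to make the "tied array" idea concrete, here is a minimal sketch using
the core Tie::Array module. the "LazySquares" class is hypothetical
and far simpler than a real generator (no blessing of the reference,
no overloading, no methods). it only shows that elements of a tied
array can be computed at the moment they are read.

```perl
use strict;
use warnings;

package LazySquares;          # hypothetical class, for illustration only
require Tie::Array;
our @ISA = ('Tie::Array');
our $calls = 0;               # counts how many elements were computed

sub TIEARRAY  { my ($class, $size) = @_; return bless { size => $size }, $class }
sub FETCHSIZE { $_[0]{size} }
sub FETCH     { my ($self, $i) = @_; $calls++; return $i ** 2 }

package main;

tie my @squares, 'LazySquares', 101;   # indices 0 .. 100, nothing computed yet

print "@squares[5 .. 10]\n";           # 25 36 49 64 81 100
print "elements computed: $LazySquares::calls\n";
```

only the indices that are actually read reach "FETCH", so the counter
stays far below the 101 elements the array claims to hold.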
given the generator:
my $gen = gen {$_**2} range 0, 100;
# or: gen {$_**2} 0, 100;
# or: range(0, 100)->map(sub {$_**2});
# or: <0..100>->map('**2');
# or: <**2 for 0..100>;
which describes the sequence of "n**2 for n from 0 to 100 by 1":
0 1 4 9 16 25 ... 9604 9801 10000
the following lines are equivalent (each prints '25'):
say $gen->get(5);
say $gen->(5);
say $gen->[5];
say $gen->drop(5)->head;
say $gen->('5..')->head;
as are these (each printing '25 36 49 64 81 100'):
say "@$gen[5 .. 10]";
say join ' ' => $gen->slice(5 .. 10);
say join ' ' => $gen->(5 .. 10);
say join ' ' => @$gen[5 .. 10];
say $gen->slice(range 5 => 10)->str;
say $gen->drop(5)->take(6)->str;
say $gen->(<5..10>)->str;
say $gen->('5..10')->str;
generators as arrays
you can access generators as if they were array
references. only the requested indices will be generated.
my $range = range 0, 1_000_000, 0.2;
# will produce 0, 0.2, 0.4, ... 1000000
say "@$range[10 .. 15]"; # calculates 6 values: 2 2.2 2.4 2.6 2.8 3
my $gen = gen {$_**2} $range; # attaches a generator function to a range
say "@$gen[10 .. 15]"; # '4 4.84 5.76 6.76 7.84 9'
for (@$gen) {
last if $_ > some_condition;
# the iteration of this loop is lazy, so when exited
# with `last`, no extra values are generated
...
}
generators in loops
evaluation in each of these looping examples remains
lazy. using "last" to escape from the loop
early will result in some values never being generated.
... for @$gen;
for my $x (@$gen) {...}
... while <$gen>;
while (my ($next) = $gen->()) {...}
there are also looping methods, which take a subroutine. calling
"last" from the subroutine works the same as in the examples above.
$gen->do(sub {...}); # or ->each
For {$gen} sub {
... # indirect object syntax
};
there is also a user space subroutine named &last that is installed
into the calling namespace during the execution of the loop. calling it
without arguments has the same function as the builtin "last". calling
it with an argument will still end the looping construct, but will also
cause the loop to return the argument. the "done ..." exception also
works the same way as "&last(...)".
my $first = $gen->do(sub {&last($_) if /something/});
# same as: $gen->first(qr/something/);
you can use generators as file handle iterators:
local $_;
while (<$gen>) { # calls $gen->next internally
# do something with $_
}
generators as objects
all generators have the following methods by default
- iteration:
$gen->next # iterates over generator ~~ $gen->get($gen->index++)
$gen->() # same. iterators return () when past the end
$gen->more # test if $gen->index not past end
$gen->reset # reset iterator to start
$gen->reset(4) # $gen->next returns $$gen[4]
$gen->index # fetches the current position
$gen->index = 4 # same as $gen->reset(4)
$gen->nxt # next until defined
$gen->iterator # returns the $gen->next coderef iterator
- indexing:
$gen->get(index) # returns $$gen[index]
$gen->(index) # same
$gen->slice(4 .. 12) # returns @$gen[4 .. 12]
$gen->(4 .. 12) # same
$gen->size # returns 'scalar @$gen'
$gen->all # same as list context '@$gen' but faster
$gen->list # same as $gen->all
- printing:
$gen->join(' ') # join ' ', $gen->all
$gen->str # join $", $gen->all (recursive with nested generators)
$gen->str(10) # limits generators to 10 elements
$gen->perl # serializes the generator in array syntax (recursive)
$gen->perl(9) # limits generators to 9 elements
$gen->perl(9, '...') # prints ... at the end of each truncated generator
$gen->print(...); # print $gen->str(...)
$gen->say(...); # print $gen->str(...), $/
$gen->say(*FH, ...) # print FH $gen->str(...), $/
$gen->dump(...) # print $gen->perl(...), $/
$gen->debug # carps debugging information
$gen->watch(...) # prints ..., value, $/ each time a value is requested
- eager looping:
$gen->do(sub {...}) # for (@$gen) {...} # but faster
$gen->each(sub{...}) # same
- slicing:
$gen->head # $gen->get(0)
$gen->tail # $gen->slice(<1..>) # lazy slices
$gen->drop(2) # $gen->slice(<2..>)
$gen->take(4) # $gen->slice(<0..3>)
$gen->x_xs # ($gen->head, $gen->tail)
- accessors:
$gen->range # range(0, $gen->size - 1)
$gen->keys # same as $gen->range, but a list in list context
$gen->values # same as $gen, but a list in list context
$gen->kv # zip($gen->range, $gen)
$gen->pairs # same as ->kv, but each pair is a tuple (array ref)
- randomization:
$gen->pick # return a random element from $gen
$gen->pick(n) # return n random elements from $gen
$gen->roll # same as pick
$gen->roll(n) # pick and replace
$gen->shuffle # a lazy shuffled generator
$gen->random # an infinite generator that returns random elements
- searching:
$gen->first(sub {$_ > 5}) # first {$_ > 5} $gen->all # but faster
$gen->first('>5') # same
$gen->last(...) # $gen->reverse->first(...)
$gen->first_idx(...) # same as first, but returns the index
$gen->last_idx(...)
- sorting:
$gen->sort # sort $gen->all
$gen->sort(sub {$a <=> $b}) # sort {$a <=> $b} $gen->all
$gen->sort('<=>') # same
$gen->sort('uc', 'cmp') # does: map {$$_[0]}
# sort {$$a[1] cmp $$b[1]}
# map {[$_ => uc]} $gen->all
- reductions:
$gen->reduce(sub {$a + $b}) # reduce {$a + $b} $gen->all
$gen->reduce('+') # same
$gen->sum # $gen->reduce('+')
$gen->product # $gen->reduce('*')
$gen->scan('+') # [$$gen[0], sum(@$gen[0..1]), sum(@$gen[0..2]), ...]
$gen->min # min $gen->all
$gen->max # max $gen->all
- transforms:
$gen->cycle # infinite repetition of a generator
$gen->rotate(1) # [$gen[1], $gen[2] ... $gen[-1], $gen[0]]
$gen->rotate(-1) # [$gen[-1], $gen[0], $gen[1] ... $gen[-2]]
$gen->uniq # $gen->filter(do {my %seen; sub {not $seen{$_}++}})
$gen->deref # tuples($a, $b)->deref ~~ zip($a, $b)
- combinations:
$gen->zip($gen2, ...) # takes any number of generators or array refs
$gen->cross($gen2) # cross product
$gen->cross2d($gen2) # returns a 2D generator containing the same
# elements as the flat ->cross generator
$gen->tuples($gen2) # tuples($gen, $gen2)
the "zip" and the "cross" methods all use the comma operator (',') by
default to join their arguments. if the first argument to any of these
methods is code or a code like string, that will be used to join the
arguments. more detail in the overloaded operators section below.
$gen->zip(',' => $gen2) # same as $gen->zip($gen2)
$gen->zip('.' => $gen2) # $gen[0].$gen2[0], $gen[1].$gen2[1], ...
- introspection:
$gen->type # returns the package name of the generator
$gen->is_mutable # can the generator change size?
- utility:
$gen->apply # causes a mutable generator to determine its true size
$gen->clone # copy a generator, resets the index
$gen->copy # copy a generator, preserves the index
$gen->purge # purge any caches in the source chain
- traversal:
$gen->leaves # returns a coderef iterator that will perform a depth first
# traversal of the edge nodes in a tree of nested generators.
# a full run of the iterator will ->reset all of the internal
# generators
- while:
$gen->while(...) # While {...} $gen
$gen->take_while(...) # same
$gen->drop_while(...) # $gen->drop( $gen->first_idx(sub {...}) )
$gen->span # collects $gen->next calls until one
# returns undef, then returns the collection.
# ->span starts from and moves the ->index
$gen->span(sub{...}) # span with an argument splits the list when the code
# returns false, it is equivalent to but more efficient
# than ($gen->take_while(...), $gen->drop_while(...))
$gen->break(...) # $gen->span(sub {not ...})
- tied vs methods:
the methods duplicate and extend the tied functionality and
are necessary when working with indices outside of perl's array limit
"(0 .. 2**31 - 1)" or when fetching a list return value (perl clamps
the return to a scalar with the array syntax). in all cases, they are
also faster than the tied interface.
- functions as methods:
most of the functions in this package are also methods of
generators, including by, every, mapn, gen, map (alias of gen), filter,
grep (alias of filter), test, cache, flip, reverse (alias of flip),
expand, collect, overlay, mutable, while, until, recursive, rec (alias
of recursive).
my $gen = (range 0, 1_000_000)->gen(sub{$_**2})->filter(sub{$_ % 2});
#same as: filter {$_ % 2} gen {$_**2} 0, 1_000_000;
- dwim code:
when a method takes a code ref, that code ref can be specified
as a string containing an operator and an optional curried argument (on
either side)
my $gen = <0 .. 1_000_000>->map('**2')->grep('%2'); # same as above
you can prefix "!" or "not" to negate the operator:
my $even = <1..>->grep('!%2'); # sub {not $_ % 2}
you can even use a typeglob to specify an operator when the
method expects a binary subroutine:
say <1 .. 10>->reduce(*+); # 55 # and saves a character over '+'
or a regex ref:
<1..30>->grep(qr/3/)->say; # 3 13 23 30
you can flip the arguments to a binary operator by prefixing
it with "R" or by applying the "~" operator to it:
say <a..d>->reduce('R.'); # 'dcba' # lowercase r works too
say <a..d>->reduce(~'.'); # 'dcba'
say <a..d>->reduce(~*.); # 'dcba'
- methods without return values:
the methods that do not have a useful return value, such as
"->say", return the same generator they were called with. this lets
you easily insert these methods at any point in a method chain for
debugging.
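the operator-string shorthand described under "dwim code" above can be
approximated in plain perl. "to_code" is a hypothetical helper, and
building the subroutine with a string eval is an assumption made for
this sketch. how List::Gen actually parses these strings is not
covered in this document.

```perl
use strict;
use warnings;

# hypothetical helper: accepts a code ref, a regex ref, or an operator
# string with a curried right-hand argument such as '**2' or '%2'.
# string eval is an assumption made for this sketch; never pass it
# untrusted input.
sub to_code {
    my ($spec) = @_;
    return $spec if ref $spec eq 'CODE';
    if (ref $spec eq 'Regexp') {
        return sub { $_ =~ $spec };
    }
    my $code = eval "sub { \$_ $spec }"
        or die "cannot build code from '$spec': $@";
    return $code;
}

my $square = to_code('**2');
my $odd    = to_code('%2');
my $has_3  = to_code(qr/3/);

my @odd_squares = grep { $odd->() } map { $square->() } 1 .. 6;
print "@odd_squares\n";                               # 1 9 25
print join(' ', grep { $has_3->() } 1 .. 30), "\n";   # 3 13 23 30
```

the sketch only handles an argument on the right of the operator.
negation, the "R" prefix and typeglob operators are left out.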
predicates
several predicates are available to use with the
filtering methods:
<1..>->grep('even' )->say(5); # 2 4 6 8 10
<1..>->grep('odd' )->say(5); # 1 3 5 7 9
<1..>->grep('prime')->say(5); # 2 3 5 7 11
<1.. if prime>->say(5); # 2 3 5 7 11
others are: defined, true, false
lazy slices
if you call the "slice" method with a "range" or other numeric
generator as its argument, the method will return a generator that
will perform the slice
my $gen = gen {$_ ** 2};
my $slice = $gen->slice(range 100 => 1000); # nothing calculated
say "@$slice[5 .. 10]"; # 6 values calculated
or using the glob syntax:
my $slice = $gen->slice(<100 .. 1000>);
infinite slices are fine:
my $tail = $gen->slice(<1..>);
lazy slices also work with the dwim code-deref syntax:
my $tail = $gen->(<1..>);
stacked continuous lazy slices collapse into a single composite
slice for efficiency
my $slice = $gen->(<1..>)->(<1..>)->(<1..>);
$slice == $gen->(<3..>);
if you choose not to import the "glob" function, you can still write
ranges succinctly as strings, when used as arguments to slice:
my $tail = $gen->('1..');
my $tail = $gen->slice('1..');
dwim code dereference
when dereferenced as code, a generator decides what to do
based on the arguments it is passed.
$gen->() ~~ $gen->next
$gen->(1) ~~ $gen->get(1) or $$gen[1]
$gen->(1, 2, ...) ~~ $gen->slice(1, 2, ...) or @$gen[1, 2, ...]
$gen->(<1..>) ~~ $gen->slice(<1..>) or $gen->tail
if passed a code ref or regex ref, "->map" will be called with the
argument. if passed a reference to a code ref or regex ref, "->grep"
will be called.
my $pow2 = <0..>->(sub {$_**2}); # calls ->map(sub{...})
my $uc = $gen->(\qr/[A-Z]/); # calls ->grep(qr/.../)
you can lexically enable code coercion from strings
(experimental):
local $List::Gen::DWIM_CODE_STRINGS = 1;
my $gen = <0 .. 1_000_000>->('**2')(\'%2');
# ('**2') calls ->map, (\'%2') calls ->grep
due to some scoping issues, if you want to install this dwim
coderef into a subroutine, the reliable way is to call the "->code"
method:
*fib = <0, 1, *+*...>->code; # rather than *fib = \&{<0, 1, *+*...>}
overloaded operators
to make the usage of generators a bit more syntactic the
following operators are overridden:
$gen1 x $gen2 ~~ $gen1->cross($gen2)
$gen1 x'.'x $gen2 ~~ $gen1->cross('.', $gen2)
or $gen1->cross(sub {$_[0].$_[1]}, $gen2)
$gen1 x sub{$_[0].$_[1]} x $gen2 # same as above
$gen1 + $gen2 ~~ sequence $gen1, $gen2
$g1 + $g2 + $g3 ~~ sequence $g1, $g2, $g3 # or more
$gen1 | $gen2 ~~ $gen1->zip($gen2)
$gen1 |'+'| $gen2 ~~ $gen1->zip('+', $gen2)
or $gen1->zip(sub {$_[0] + $_[1]}, $gen2)
$gen1 |sub{$_[0]+$_[1]}| $gen2 # same as above
$x | $y | $z ~~ $x->zip($y, $z)
$w | $x | $y | $z ~~ $w->zip($x, $y, $z) # or more
if the first argument to a "->zip" or "->cross" method is not an array
or generator, it is assumed to be a subroutine and the corresponding
"->(zip|cross)with" method is called:
$gen1->zipwith('+', $gen2) ~~ $gen1->zip('+', $gen2);
hyper operators:
not quite as elegant as perl6's hyper operators, but the same
idea. these are similar to "zipwith" but with more control over the
length of the returned generator. all of perl's non-mutating binary
operators are available to use as strings, or you can use a subroutine.
$gen1 <<'.'>> $gen2 # longest list
$gen1 >>'+'<< $gen2 # equal length lists or error
$gen1 >>'-'>> $gen2 # length of $gen2
$gen1 <<'=='<< $gen2 # length of $gen1
$gen1 <<sub{...}>> $gen2
$gen1 <<\&some_sub>> $gen2
my $x = <1..> <<'.'>> 'x';
$x->say(5); # '1x 2x 3x 4x 5x'
in the last example, a bare string is the final element, and
precedence rules keep everything working. however, if you want to use a non
generator as the first element, a few parens are needed to force the
evaluation properly:
my $y = 'y' <<('.'>> <1..>);
$y->say(5); # 'y1 y2 y3 y4 y5'
otherwise 'y' << '.' will run first without overloading, which will be
an error. since that is a bit awkward, where you can specify an operator
string, you can prefix "R" or "r" to indicate that the arguments to the
operator should be reversed.
my $y = <1..> <<'R.'>> 'y';
$y->say(5); # 'y1 y2 y3 y4 y5'
just like in perl6, hyper operators are recursively defined for
multi dimensional generators.
say +(list(<1..>, <2..>, <3..>) >>'*'>> -1)->perl(4, '...')
# [[-1, -2, -3, -4, ...], [-2, -3, -4, -5, ...], [-3, -4, -5, -6, ...]]
hyper operators currently do not work with mutable generators.
this will be addressed in a future update.
you can also specify the operator in a hyper-operator as a
typeglob:
my $xs = <1..> >>*.>> 'x'; # *. is equivalent to '.'
$xs->say(5); # 1x 2x 3x 4x 5x
my $negs = <0..> >>*-; # same as: <0..> >>'-'
$negs->say(5); # 0 -1 -2 -3 -4
hyper also works as a method:
<1..>->hyper('<<.>>', 'x')->say(5); # '1x 2x 3x 4x 5x'
# defaults to '<<...>>'
<1..>->hyper('.', 'x')->say(5); # '1x 2x 3x 4x 5x'
hyper negation can be done directly with the prefix minus
operator:
-$gen ~~ $gen >>'-' ~~ $gen->hyper('-')
mutable generators
mutable generators (those returned from mutable, filter,
While, Until, and iterate_multi) are generators with variable length. in
addition to all normal methods, mutable generators have the following methods:
$gen->when_done(sub {...}) # schedule a method to be called when the
# generator is exhausted
# when_done can be called multiple times to
# schedule multiple end actions
$gen->apply; # causes the generator to evaluate all of its elements in
# order to find out its true size. it is a bad idea to call
# ->apply on an infinite generator
due to the way perl processes list operations, when perl sees an
expression like:
print "@$gen\n"; # or
print join ' ' => @$gen;
it calls the internal "FETCHSIZE" method only once, before it starts
getting elements from the array. this is fine for immutable generators.
however, since mutable generators do not know their true size, perl will
think the array is bigger than it really is, and will most likely run
off the end of the list, returning many undefined elements, or throwing
an exception.
the solution to this is to call "$gen->apply" first, or to use the
"$gen->all" method with mutable generators instead of @$gen, since the
"->all" method understands how to deal with arrays that can change size
while being read.
perl's "for/foreach" loop is a bit smarter, so just like immutable
generators, the mutable ones can be dereferenced as the loop argument
with no problem:
... foreach @$mutable_generator; # works fine
stream generators
the generators "filter", "scan", and "iterate" (all of its flavors)
have internal caches that allow random access within the generator.
some algorithms only need monotonically increasing access to the
generator (all access via repeated calls to "$gen->next" for example),
and the cache could become a performance/memory problem.
the *_stream family of generators do not
maintain an internal cache, and are consequently unable to fulfill requests
for indices lower than or equal to the last accessed index. they will
however be faster and use less memory than their non-stream counterparts
when monotonically increasing access is all that an algorithm needs.
stream generators can be thought of as traditional subroutine
iterators that also have generator methods. it is up to you to ensure that
all operations and methods follow the monotonically increasing index rule.
you can determine the current position of the stream iterator with the
"$gen->index" method.
my $nums = iterate_stream{2*$_}->from(1);
say $nums->(); # 1
say $nums->(); # 2
say $nums->(); # 4
say $nums->index; # 3
say $nums->drop( $nums->index )->str(5); # '8 16 32 64 128'
say $nums->index; # 8
the "$gen->drop( $gen->index )->method" pattern can be shortened to
"$gen->idx->method"
say $nums->idx->str(5); # '256 512 1024 2048 4096'
the "$gen->index" method of stream generators is read only. calling
"$gen->reset" on a stream generator will throw an error.
stream generators are experimental and may change in future
versions.
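the forward-only rule can be pictured with a plain perl closure that
keeps nothing but the latest value. "make_stream" and its hash of code
refs are hypothetical, chosen for this sketch. they are not the
List::Gen interface, and the error message is made up.

```perl
use strict;
use warnings;

# hypothetical constructor, not List::Gen's implementation: keeps only
# the latest value, so indices can only move forward.
sub make_stream {
    my ($code, $seed) = @_;
    my ($index, $value) = (0, $seed);
    my %stream;
    $stream{index} = sub { $index };
    $stream{next}  = sub {
        my $current = $value;
        local $_ = $value;        # $code sees the value just returned
        $value = $code->();
        $index++;
        return $current;
    };
    $stream{get} = sub {
        my ($want) = @_;
        die "index $want was already consumed\n" if $want < $index;
        $stream{next}->() while $index < $want;   # skip ahead, storing nothing
        return $stream{next}->();
    };
    return \%stream;
}

my $nums = make_stream(sub { 2 * $_ }, 1);
print $nums->{next}->(), "\n";    # 1
print $nums->{next}->(), "\n";    # 2
print $nums->{next}->(), "\n";    # 4
print $nums->{index}->(), "\n";   # 3
print $nums->{get}->(5), "\n";    # 32
```

because no earlier values are kept, a request such as index 2 after
the calls above can only fail, which mirrors the restriction described
for the *_stream generators.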
threads
generators have the following multithreaded methods:
$gen->threads_blocksize(3) # sets size to divide work into
$gen->threads_cached; # implements a threads::shared cache
$gen->threads_cached(10) # as normal, then calls threads_start with arg
$gen->threads_start; # creates 4 worker threads
$gen->threads_start(2); # or however many you want
# if you don't call it, threads_slice will
my @list = $gen->threads_slice(0 .. 1000); # sends work to the threads
my @list = $gen->threads_all;
$gen->threads_stop; # or let the generator fall out of scope
all threads are local to a particular generator, they are not
shared. if the passed in generator was cached (at the top level) that cache
is shared and used automatically. this includes most generators with
implicit caches. threads_slice and threads_all can be called without
starting the threads explicitly. in that case, they will start with default
values.
the threaded methods only work in perl versions 5.10.1 to 5.12.x,
patches to support other versions are welcome.
- range "SIZE"
- returns a generator from 0 to "SIZE - 1"
my $range = range 10;
say $range->str; # 0 1 2 3 4 5 6 7 8 9
say $range->size; # 10
- range "START STOP [STEP]"
- returns a generator for values from "START" to "STOP" by "STEP",
inclusive.
"STEP" defaults to 1 but can be fractional and negative. depending on
your choice of "STEP", the last value returned may not always be
"STOP".
range(0, 3, 0.4) will return (0, 0.4, 0.8, 1.2, 1.6, 2, 2.4, 2.8)
print "$_ " for @{range 0, 1, 0.1};
# 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
print "$_ " for @{range 5, 0, -1};
# 5 4 3 2 1 0
my $nums = range 0, 1_000_000, 2;
print "@$nums[10, 100, 1000]";
# gets the elements at indices 10, 100, and 1000 of the range
# without calculating any other values
"range" also accepts character strings instead of numbers. it will
behave the same way as perl's internal ".." operator, except it will
be lazy.
say range('a', 'z')->str; # 'a b c d e f g ... x y z'
range('a', 'zzz', 2)->say; # 'a c e g i k m ... zzu zzw zzy'
say <A .. ZZ>->str; # 'A B C D E ... ZX ZY ZZ'
<1..>->zip(<a..>)->say(10); # '1 a 2 b 3 c 4 d 5 e'
to specify an infinite range, you can pass "range" an infinite value
("9**9**9" works well), or the glob "**", or the string '*'
range(1, 9**9**9) ~~ range(1, **) ~~ range(1, '*') ~~ <1..*> ~~ <1..>
ranges only store their endpoints, and ranges of all sizes
take up the same amount of memory.
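the reason a range needs only its endpoints is that any element can be
computed from its index. a minimal sketch in plain perl follows.
"range_size" and "range_get" are hypothetical helpers for numeric
ranges only, and they make no attempt to handle floating point rounding
in fractional steps.

```perl
use strict;
use warnings;

# hypothetical helpers: a range is just (start, stop, step), and every
# element is computed from its index.
sub range_size {
    my ($start, $stop, $step) = @_;
    my $span = ($stop - $start) / $step;
    return $span < 0 ? 0 : int($span) + 1;
}
sub range_get {
    my ($start, $stop, $step, $index) = @_;
    return undef if $index < 0 or $index >= range_size($start, $stop, $step);
    return $start + $index * $step;
}

my @evens = (0, 1_000_000, 2);
print join(' ', map { range_get(@evens, $_) } 10, 100, 1000), "\n";  # 20 200 2000
print range_size(5, 0, -1), "\n";                                     # 6
```

memory use is the same three numbers whether the range holds ten
elements or a billion.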
- gen "{CODE} GENERATOR"
- gen "{CODE} ARRAYREF"
- gen "{CODE} SIZE"
- gen "{CODE} [START STOP [STEP]]"
- gen "{CODE} GLOBSTRING"
- "gen" is the equivalent of "map" for generators. it returns a
generator that will apply the "CODE" block to its source when accessed.
"gen" takes a generator, array ref, glob-string, or suitable arguments
for "range" as its source. with no arguments, "gen" uses the range
"0 .. infinity".
my @result = map {slow($_)} @source; # slow() called @source times
my $result = gen {slow($_)} \@source; # slow() not called
my ($x, $y) = @$result[4, 7]; # slow() called twice
my $lazy = gen {slow($_)} range 1, 1_000_000_000;
# same: gen {slow($_)} 1, 1_000_000_000;
print $$lazy[1_000_000]; # slow() only called once
"gen {...} list LIST" is a replacement for "[ map {...} LIST ]".
"gen" provides the functionality of the identical "->gen(...)" and
"->map(...)" methods.
note that while effort has gone into making generators as fast
as possible there is overhead involved with lazy generation. simply
replacing all calls to "map" with "gen" will almost certainly slow
down your code. use these functions in situations where the time /
memory required to completely generate the list is unacceptable.
"gen" and other similarly argumented functions in this package can
also accept a string suitable for the "<glob>" syntax:
my $square_of_nats = gen {$_**2} '1..';
my $square_of_fibs = gen {$_**2} '0, 1, *+*'; # no need for '...' with '*'
which is the same as the following if "glob" is imported:
my $square_of_nats = gen {$_**2} <1..>;
my $square_of_fibs = gen {$_**2} <0, 1, *+* ...>; # still need dots here
- makegen "ARRAY"
- "makegen" converts an array to a generator. this is normally not
needed as most generator functions will call it automatically if passed
an array reference.
"makegen" considers the length of "ARRAY" to be immutable. changing
the length of an array after passing it to "makegen" (or to "gen" and
like argumented subroutines) will result in undefined behavior. this is
done for performance reasons. if you need a length mutable array, use
the "array" function. changing the value of a cell in the array is
fine, and will be picked up by a generator (of course if the generator
uses a cache, the value won't change after being cached).
you can assign to the generator returned by "makegen", provided the
assignment does not lengthen the array.
my $gen = makegen @array;
$$gen[3] = 'some value'; # now $array[3] is 'some value'
- list "LIST"
- "list" converts a list to a generator. it is a thin wrapper around
"makegen" that simply passes its @_ to "makegen". that means the values
in the returned generator are aliases to "list"'s arguments.
list(2, 5, 8, 11)->map('*2')->say; # '4 10 16 22'
is the same as writing:
(gen {$_*2} cap 2, 5, 8, 11)->say;
in the above example, "list" can be used in place of "cap" and has
exactly the same functionality:
(gen {$_*2} list 2, 5, 8, 11)->say;
- array "[ARRAY]"
- "array" is similar to "makegen" except the array is considered a
mutable data source. because of this, certain optimizations are not
possible, and the generator returned will be a bit slower than the one
created by "makegen" in most conditions (increasing as generator
functions are stacked).
it is ok to modify "ARRAY" after creating the generator. it is also
possible to use normal array modification functions such as "push",
"pop", "shift", "unshift", and "splice" on the generator. all changes
will translate back to the source array.
you can think of "array" as converting an array to an array reference
that is also a generator.
my @src = 1..5;
my $gen = array @src;
push @$gen, 6;
$$gen[6] = 7; # assignment is ok too
say $gen->size; # 7
say shift @$gen; # 1
say $gen->size; # 6
say $gen->str; # 2 3 4 5 6 7
say "@src"; # 2 3 4 5 6 7
my $array = array; # no args creates an empty array
- file "FILE [OPTIONS]"
- "file" creates an "array" generator from a file name or file handle
using "Tie::File". "OPTIONS" are passed to "Tie::File"
my $gen = file 'some_file.txt';
my $uc_file = $gen->map('uc');
my $with_line_numbers = <1..>->zip('"$a: $b"', $gen);
- repeat "SCALAR [SIZE]"
- an infinite generator that returns "SCALAR" for every position. it
is equivalent to "gen {SCALAR}" but a little faster.
- iterate "{CODE} [LIMIT|GENERATOR]"
- "iterate" returns a generator that is created iteratively. "iterate"
implicitly caches its values, which allows random access normally not
possible with an iterative algorithm. LIMIT is an optional number of
times to iterate. normally, inside the CODE block, $_ is set to the
current iteration number. if passed a generator instead of a limit, $_
will be set to sequential values from that generator.
my $fib = do {
    my ($x, $y) = (0, 1);
    iterate {
        my $return = $x;
        ($x, $y) = ($y, $x + $y);
        $return
    }
};
generators produced by "iterate" have an extra method, "->from(LIST)". the
method must be called before values are accessed from the generator. the
passed "LIST" will be the first values returned by the generator. the
method also changes the behavior of $_ inside the block: $_ will contain
the previous value generated by the iterator. this allows "iterate" to
behave the same way as the like-named haskell function.
haskell: take 10 (iterate (2*) 1)
perl: iterate{2*$_}->from(1)->take(10)
<1, 2 * * ... 10>
<1,2**...10>
which all return "[1, 2, 4, 8, 16, 32, 64, 128, 256, 512]"
- iterate_stream " {CODE} [LIMIT] "
- " iterate_stream " is a version of
" iterate " that does not cache the
generated values. because of this, access to the returned generator must
be monotonically increasing (such as repeated calls to
"$gen->next").
- iterate_multi " {CODE} [LIMIT] "
- the same as "iterate", except CODE can
return a list of any size. inside CODE, $_ is set
to the position in the returned generator where the block's returned list
will be placed.
the returned generator from "
iterate_multi " can be modified with
"push",
"pop",
"shift",
"unshift", and
"splice" like a normal array. it is up
to you to ensure that the iterative algorithm will still work after
modifying the array.
the "->from(...)" method
can be called on the returned generator. see "
iterate " for the rules and effects of this.
- iterate_multi_stream " {CODE} [LIMIT] "
- " iterate_multi_stream " is a version of
" iterate_multi " that does not cache
the generated values. because of this, access to the returned generator
must be monotonically increasing (such as repeated calls to
"$gen->next").
keyword modification of a stream iterator (with
"push",
"shift", ...) is not supported.
- gather " {CODE} [LIMIT] "
- "gather" returns a generator that is created iteratively. rather than
returning a value, you call "take($return_value)" within the "CODE" block.
note that since perl5 does not have continuations, "take(...)" does not
pause execution of the block. rather, it stores the return value, the
block finishes, and then the generator returns the stored value.
you can not import the "take(...)" function from this module. "take(...)"
will be installed automatically into your namespace during the execution
of the "CODE" block. because of this, you must always call "take(...)"
with parentheses. "take" returns its argument unchanged.
gather implicitly caches its values, which allows random access normally
not possible with an iterative algorithm. the algorithm in "gather" is a
bit cleaner here, but "gather" is slower than "iterate", so benchmark if
speed is a concern.
my $fib = do {
    my ($x, $y) = (0, 1);
    gather {
        ($x, $y) = ($y, take($x) + $y)
    }
};
a non-cached version " gather_stream
" is also available, see "
iterate_stream "
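the store-then-return behavior of "take(...)" described above can be
sketched in plain perl. this is only an illustration of the idea, not the
module's implementation: "my_gather" and the package variable $taken are
hypothetical names, and the sketch has no caching or LIMIT handling.

```perl
use strict;
use warnings;

our $taken;                        # hypothetical slot for the taken value

# stores its argument and returns it unchanged; execution is not paused
sub take { $taken = $_[0] }

# hypothetical stand-in for gather: each call runs the whole block once,
# then hands back whatever the block passed to take(...)
sub my_gather (&) {
    my $code = shift;
    return sub {
        local $taken;
        $code->();
        $taken
    };
}

my ($x, $y) = (0, 1);
my $next = my_gather { ($x, $y) = ($y, take($x) + $y) };

my @fib = map { $next->() } 1 .. 10;
print "@fib\n";                    # 0 1 1 2 3 5 8 13 21 34
```

because "take" returns its argument, it can sit inside the expression that
computes the next state, which is what keeps the block to a single line.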
- gather_multi " {CODE} [LIMIT] "
- the same as " gather " except you can
" take(...) " multiple times, and each
can take a list. " gather_multi_stream "
is also available.
- stream " {CODE} "
- in the " CODE " block, calls to
functions or methods with stream versions will be replaced by those
versions. this applies also to functions that are called internally by
" List::Gen " (such as in the glob
syntax). " stream " returns what
" CODE " returns.
say iterate{}->type; # List::Gen::Iterate
say iterate_stream{}->type; # List::Gen::Iterate_Stream
stream {
say iterate{}->type; # List::Gen::Iterate_Stream
};
say stream{iterate{}}->type; # List::Gen::Iterate_Stream
say stream{<1.. if even>}->type; # List::Gen::Filter_Stream
placing code inside a " stream
" block is exactly the same as placing
" local $List::Gen::STREAM = 1; " at
the top of a block.
- glob " STRING "
- <list comprehension>
- by default, this module overrides perl's default "
glob " function. this is because the "
glob " function provides the behavior of the angle bracket
delimited "<*.ext>" operator,
which is a nice place for inserting list comprehensions into perl's
syntax. the override causes " glob() "
and the "<*.ext>" operator to have
a few special cases overridden, but any case that is not overridden will
be passed to perl's internal " glob "
function ("my @files = <*.txt>;"
works as normal).
- there are several types of overridden operations:
range: < [prefix,] low .. [high] [by step] >
iterate: < [prefix,] code ... [size] >
list comprehension: < [code for] (range|iterate) [if code] [while code] >
reduction: < \[op|name\] (range|iterate|list comprehension) >
- range strings match the following pattern:
(prefix,)? number .. number? ((by | += | -= | [-+]) number)?
here are a few examples of valid ranges:
<1 .. 10> ~~ range 1, 10
<0 .. > ~~ range 0, 9**9**9
<0 .. *> ~~ range 0, 9**9**9
<1 .. 10 by 2> ~~ range 1, 10, 2
<10 .. 1 -= 2> ~~ range 10, 1, -2
<a .. z> ~~ range 'a', 'z'
<A .. ZZ> ~~ range 'A', 'ZZ'
<a..> ~~ range 'a', 9**9**9
<a.. += b> ~~ range 'a', 9**9**9, 2
<0, 0..> ~~ [0] + range 0, 9**9**9
<'a','ab', 0..> ~~ ['a','ab'] + range 0, 9**9**9
<qw(a ab), 0..> ~~ [qw(a ab)] + range 0, 9**9**9
- iterate strings match the following pattern:
(.+? ,)+ (.*[*].* | \{ .+ }) ... number?
such as:
my $fib = <0, 1, * + * ... *>;
which means something like:
my $fib = do {
    my @pre = (0, 1);
    my $self;
    $self = iterate {
        @pre ? shift @pre : $self->get($_ - 2) + $self->get($_ - 1)
    } 9**9**9
};
a few more examples:
my $fib = <0, 1, {$^a + $^b} ... *>;
my $fac = <1, * * _ ... *>;
my $int = <0, * + 1 ... *>;
my $fib = <0,1,*+*...>; # ending star is optional
- list comprehension strings match:
( .+ (for | [:|]) )? (range | iterate) ( (if | unless | [?,]) .+ )?
( (while | until ) .+ )?
examples:
<**2: 1 .. 10> ~~ gen {$_**2} range 1, 10
<**2: 1 .. 10 ? %2> ~~ gen {$_**2} filter {$_ % 2} range 1, 10
<sin: 0 .. 3.14 += 0.01> ~~ gen {sin} range 0, 3.14, 0.01
<1 .. 10 if % 2> ~~ filter {$_ % 2} range 1, 10
<sin for 0 .. 10 by 3 if /5/> ~~ gen {sin} filter {/5/} range 0, 10, 3
<*3 for 0 .. 10 unless %3> ~~ gen {$_ * 3} filter {not $_ % 3} 0, 10
<0 .. 100 while \< 10> ~~ While {$_ < 10} range 0, 100
<*2 for 0.. if %2 while \<10> ~~ <0..>->grep('%2')->while('<10')->map('*2')
there are three delimiter types available for basic list
comprehensions:
terse: <*2: 1.. ?%3>
haskell: <*2| 1.., %3>
verbose: <*2 for 1.. if %3>
you can mix and match "<*2 for 1..,
%3>", "<*2| 1..
?%3>"
in the above examples, most of the code areas are using
abbreviated syntax. here are a few equivalencies:
<*2:1..?%3> ~~ <*2 for 1.. if %3> ~~ <\$_ * 2 for 1 .. * if \$_ % 3>
<1.. if even> ~~ <1.. if not %2> ~~ <1..?!%2> ~~ <1.. if not _ % 2>
~~ <1.. unless %2> ~~ <1..* if not \$_ % 2>
<1.. if %2> ~~ <1.. if _%2> ~~ <1..* ?odd> ~~ <1.. ? \$_ % 2>
- reduction strings match:
\[operator | function_name\] (range | iterate | list comp)
examples:
say <[+] 1..10>; # prints 55
pre/post fixing the operator with '..' uses the
" scan " function instead of
" reduce "
my $fac = <[..*] 1..>; # read as "a running product of one to infinity"
my $sum = <[+]>; # no argument returns the reduction function
say $sum->(1 .. 10); # 55
say $sum->(<1..10>); # 55
my $rev_cat = <[R.]>; # prefix the operator with `R` to reverse it
say $rev_cat->(1 .. 9); # 987654321
- all of these features can be used together:
<[+..] *2 for 0 .. 100 by 2 unless %3 >
which is the same as:
range(0, 100, 2)->grep('not %3')->map('*2')->scan('+')
when multiple features are used together, the following
construction order is used:
1. prefix
2. range or iterate
3. if / unless (grep)
4. while / until (while)
5. for (map)
6. reduce / scan
([prefix] + (range|iterate))->grep(...)->while(...)->map(...)->reduce(...)
- bignums
when run in perl 5.9.4+, glob strings will honor the lexical
pragmas " bignum ",
" bigint ", and
" bigrat ".
*factorial = do {use bigint; <[..*] 1, 1..>->code};
say factorial(25); # 15511210043330985984000000
- special characters
since the angle brackets
("<" and
">") are used as delimiters of the
glob string, they both must be escaped with " \
" if used in the
"<...>" construct.
<1..10 if \< 5>->say; # 1 2 3 4
due to "<...>" being a
" qq{} " string, in the code areas if
you need to write $_ write it without the
sigil as " _ "
<1 .. 10 if _**2 \> 40>->say; # 7 8 9 10
it can be escaped " \$_ " as
well.
neither of these issues apply to calling glob directly with a
single quoted string:
glob('1..10 if $_ < 5')->say; # 1 2 3 4
- List::Gen " ... "
- the subroutine " Gen " in the package
" List:: " is a dwimmy function that
produces a generator from a variety of sources. since
" List::Gen " is a fully qualified name,
it is available from all packages without the need to import it.
if given only one argument, the following table describes what
is done:
array ref: List::Gen \@array ~~ makegen @array
code ref: List::Gen sub {$_**2} ~~ <0..>->map(sub {$_**2})
scalar ref: List::Gen \'*2' ~~ <0..>->map('*2')
glob string: List::Gen '1.. by 2' ~~ <1.. by 2>
glob string: List::Gen '0, 1, *+*' ~~ <0, 1, *+*...>
file handle: List::Gen $fh ~~ file $fh
if the argument does not match the table, or the method is
given more than one argument, the list is converted to a generator with
" list(...) "
List::Gen(1, 2, 3)->map('2**')->say; # 2 4 8
since it results in longer code than any of the equivalent
constructs, it is mostly for if you have not imported anything:
" use List::Gen (); "
- vecgen " [BITS] [SIZE] [DATA] "
- " vecgen " wraps a bit vector in a
generator. BITS defaults to 8. SIZE defaults to infinite. DATA defaults to
an empty string.
cells of the generator can be assigned to using array
dereferencing:
my $vec = vecgen;
$$vec[3] = 5;
or with the "->set(...)"
method:
$vec->set(3, 5);
- primes
- utilizing the same mechanism as the
"<1..>->grep('prime')"
construct, the " primes " function
returns an equivalent, but more efficiently constructed generator.
prime numbers below 1e7 are tested with a sieve of
eratosthenes and should be reasonably efficient. beyond that, simple
trial division is used.
" primes " always returns
the same generator.
- slice " SOURCE_GEN RANGE_GEN "
- " slice " uses "
RANGE_GEN " to generate the indices used to take a lazy slice
of " SOURCE_GEN ".
my $gen = gen {$_ ** 2};
my $s1 = slice $gen, range 1, 9**9**9;
my $s2 = slice $gen, <1..>;
my $s3 = $gen->slice(<1..>);
my $s4 = $gen->(<1..>);
$s1 ~~ $s2 ~~ $s3 ~~ $s4 ~~ $gen->tail
" slice " will perform some
optimizations if it detects that " RANGE_GEN
" is sufficiently simple (something like
" range $x, $y, 1 "). also, stacked
simple slices will collapse into a single slice, which turns repeated
tailing of a generator into a relatively efficient operation.
$gen->(<1..>)->(<1..>)->(<1..>) ~~ $gen->(<3..>) ~~ $gen->tail->tail->tail
- test " {CODE} [ARGS_FOR_GEN] "
- " test " attaches a code block to a
generator. it takes arguments suitable for the " gen
" function. accessing an element of the returned generator
will call the code block first with the element in $_
, and if it returns true, the element is returned, otherwise an
empty list (undef in scalar context) is returned.
when accessing a slice of a tested generator, if you use the
"->(x .. y)" syntax, the empty lists will collapse and you may receive a
shorter slice. an array dereference slice will always be the size you ask
for, and will have undef in each failed slot.
the "$gen->nxt" method is
a version of "$gen->next" that
continues to call "->next" until a
call returns a value, or the generator is exhausted. this makes the
"->nxt" method the easiest way to
iterate over only the passing values of a tested generator.
- cache " {CODE} "
- cache " GENERATOR "
- cache "list => ..."
- " cache " will return a cached version
of the generators returned by functions in this package. when passed a
code reference, cache returns a memoized code ref (arguments joined with
$; ). when in 'list' mode, the source is in list
context, otherwise scalar context is used.
my $gen = cache gen {slow($_)} \@source; # calls = 0
print $gen->[123]; # calls += 1
...
print @$gen[123, 456] # calls += 1
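for the code reference case, a memoizing wrapper that joins its arguments
with $; might look like the sketch below. "memoize_code" is a hypothetical
helper written for illustration (scalar context only); it is not the code
behind "cache".

```perl
use strict;
use warnings;

# hypothetical helper: wraps a code ref so each distinct argument list
# is only computed once
sub memoize_code {
    my $code = shift;
    my %seen;
    return sub {
        my $key = join($;, @_);                  # arguments joined with $;
        $seen{$key} = $code->(@_) unless exists $seen{$key};
        $seen{$key}
    };
}

my $calls = 0;
my $slow_add = sub { $calls++; $_[0] + $_[1] };
my $fast_add = memoize_code($slow_add);

print $fast_add->(2, 3), "\n";   # 5
print $fast_add->(2, 3), "\n";   # 5 (served from the cache)
print "calls: $calls\n";         # calls: 1
```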
- flip " GENERATOR "
- " flip " is "
reverse " for generators. the
"->apply" method is called on
" GENERATOR ".
"$gen->flip" and
"$gen->reverse" do the same thing.
flip gen {$_**2} 0, 10 ~~ gen {$_**2} 10, 0, -1
- expand " GENERATOR "
- expand " SCALE GENERATOR "
- " expand " scales a generator with
elements that return equal sized lists. it can be passed a list length, or
will automatically determine it from the length of the list returned by
the first element of the generator. " expand
" implicitly caches its returned generator.
my $multigen = gen {$_, $_/2, $_/4} 1, 10; # each element returns a list
say join ' '=> $$multigen[0]; # 0.25 # only last element
say join ' '=> &$multigen(0); # 1 0.5 0.25 # works
say scalar @$multigen; # 10
say $multigen->size; # 10
my $expanded = expand $multigen;
say join ' '=> @$expanded[0 .. 2]; # 1 0.5 0.25
say join ' '=> &$expanded(0 .. 2); # 1 0.5 0.25
say scalar @$expanded; # 30
say $expanded->size; # 30
my $expanded = expand gen {$_, $_/2, $_/4} 1, 10; # in one line
" expand " can also scale a
generator that returns array references:
my $refs = gen {[$_, $_.$_]} 3;
say $refs->join(', '); # ARRAY(0x272514), ARRAY(0x272524), ARRAY(0x272544)
say $refs->expand->join(', '); # 0, 00, 1, 11, 2, 22
" expand " in array ref mode
is the same as calling the
"->deref" method.
- contract " SCALE GENERATOR "
- " contract " is the inverse of
" expand "
also called " collect "
- scan " {CODE} GENERATOR "
- scan " {CODE} LIST "
- " scan " is a "
reduce " that builds a list of all the intermediate values.
" scan " returns a generator, and is the
function behind the "<[..+]>"
globstring reduction operator.
(scan {$a * $b} <1, 1..>)->say(8); # 1 1 2 6 24 120 720 5040
say <[..*] 1, 1..>->str(8); # 1 1 2 6 24 120 720 5040
say <1, 1..>->scan('*')->str(8); # 1 1 2 6 24 120 720 5040
say <[..*]>->(1, 1 .. 7)->str; # 1 1 2 6 24 120 720 5040
you can even use the
"->code" method to tersely define a
factorial function:
*factorial = <[..*] 1, 1..>->code;
say factorial(5); # 120
a stream version " scan_stream
" is also available.
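to make the relationship between "reduce" and "scan" concrete, here is a
plain perl sketch that keeps every intermediate value. "running_reduce" is
a hypothetical, eager helper for finite lists; the real "scan" returns a
lazy generator.

```perl
use strict;
use warnings;

# hypothetical eager scan: like reduce, but every intermediate
# accumulator value is kept in the result list
sub running_reduce (&@) {
    my $code = shift;
    return unless @_;
    my @out = (shift);                   # first element seeds the result
    for my $next (@_) {
        local ($a, $b) = ($out[-1], $next);
        push @out, $code->();
    }
    return @out;
}

my @fac = running_reduce { $a * $b } 1, 1 .. 7;
print "@fac\n";   # 1 1 2 6 24 120 720 5040
```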
- overlay " GENERATOR PAIRS "
- overlay allows you to replace the values of specific generator cells. to
set the values, either pass the overlay constructor a list of pairs in the
form "index => value, ...", or assign
values to the returned generator using normal array ref syntax
my $fib; $fib = overlay gen {$$fib[$_ - 1] + $$fib[$_ - 2]};
@$fib[0, 1] = (0, 1);
# or
my $fib; $fib = gen {$$fib[$_ - 1] + $$fib[$_ - 2]}
->overlay( 0 => 0, 1 => 1 );
print "@$fib[0 .. 15]"; # '0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610'
- recursive " [NAME] GENERATOR "
- "recursive" defines a subroutine named "self(...)" or "NAME(...)" during
generator execution. when called with no arguments it returns the
generator. when called with one or more numeric arguments, it fetches
those indices from the generator. when called with a generator, it returns
a lazy slice from the source generator. since the subroutine created by
"recursive" is installed at runtime, you must call the subroutine with
parentheses.
my $fib = gen {self($_ - 1) + self($_ - 2)}
->overlay( 0 => 0, 1 => 1 )
->cache
->recursive;
print "@$fib[0 .. 15]"; # '0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610'
when used as a method,
"$gen->recursive" can be shortened
to "$gen->rec".
my $fib = ([0, 1] + iterate {sum fib($_, $_ + 1)})->rec('fib');
print "@$fib[0 .. 15]"; # '0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610'
of course the fibonacci sequence is better written with the
glob syntax as "<0, 1, *+*...>"
which is compiled into something similar to the example with
" iterate " above.
- filter " {CODE} [ARGS_FOR_GEN] "
- " filter " is a lazy version of
" grep " which attaches a code block to
a generator. it returns a generator that will test elements with the code
block on demand. " filter " processes
its argument list the same way " gen "
does.
" filter " provides the
functionality of the identical
"->filter(...)" and
"->grep(...)" methods.
normal generators, such as those produced by
" range " or "
gen ", have a fixed length, and that is used to allow random
access within the range. however, there is no way to know how many
elements will pass a filter. because of this, random access within the
filter is not always O(1) .
" filter " will attempt to be as lazy
as possible, but to access the 10th element of a filter, the first 9
passing elements must be found first. depending on the coderef and the
source, the filter may need to process significantly more elements from
its source than just 10.
in addition, since filters don't know their true size, entire
filter arrays do not expand to the correct number of elements in list
context. to correct this, call the
"->apply" method which will test
the filter on all of its source elements. after that, the filter will
return a properly sized array. calling
"->apply" on an infinite (or very
large) range wouldn't be a good idea. if you are using
"->apply" frequently, you should
probably just be using " grep ". you
can call "->apply" on any stack of
generator functions, it will start from the deepest filter and move
up.
the method "->all" will first call "->apply" on itself and then return the
complete list.
filters implicitly cache their values. accessing any element below the
highest element already accessed is O(1).
accessing individual elements or slices works as you would
expect.
my $filter = filter {$_ % 2} 0, 100;
say $#$filter; # incorrectly reports 100
say "@$filter[5 .. 10]"; # reads the source range up to element 23
# prints 11 13 15 17 19 21
say $#$filter; # reports 88, closer but still wrong
$filter->apply; # reads remaining elements from the source
say $#$filter; # 49 as it should be
note: " filter " now reads
one element past the last element accessed, this allows filters to
behave properly when dereferenced in a foreach loop (without having to
call "->apply"). if you prefer the
old behavior, set " $List::Gen::LOOKAHEAD = 0
" or use " filter_ ...
"
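the access cost described above can be illustrated with a small plain perl
model of a caching lazy filter. everything here ("lazy_filter", the
index-to-value source sub, the returned pair of closures) is a hypothetical
simplification with no lookahead, assuming an unbounded source; it is not
how "filter" is implemented.

```perl
use strict;
use warnings;

# returns ($get, $reads): $get->($n) fetches the n-th passing element,
# $reads->() reports how many source elements have been tested so far
sub lazy_filter {
    my ($test, $source) = @_;
    my @passed;                      # cache of elements that passed
    my $pos = 0;                     # next source index to test
    my $get = sub {
        my $n = shift;
        while (@passed <= $n) {      # only read as far as needed
            my $value = $source->($pos++);
            push @passed, $value if $test->($value);
        }
        $passed[$n]
    };
    return ($get, sub { $pos });
}

my ($odd, $reads) = lazy_filter(sub { $_[0] % 2 }, sub { $_[0] });

print $odd->(5), "\n";    # 11
print $reads->(), "\n";   # 12 source elements (0 .. 11) were tested
print $odd->(2), "\n";    # 5, straight from the cache
print $reads->(), "\n";   # still 12
```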
- filter_stream " {CODE} ... "
- as " filter " runs, it builds up a cache
of the elements that pass the filter. this enables efficient random access
in the returned generator. sometimes this caching behavior causes certain
algorithms to use too much memory. " filter_stream
" is a version of " filter "
that does not maintain a cache.
normally, access to *_stream iterators must be monotonically increasing
since their source can only produce values in one direction. filtering is
a reversible algorithm, so filter streams are able to rewind themselves to
any previous index. however, unlike "filter", the "filter_stream"
generator must test previously tested elements to rewind. things probably
won't end well if the test code is non-deterministic or if the source
values are changing.
when used as a method, it can be spelled
"$gen->filter_stream(...)" or
"$gen->grep_stream(...)"
- While "{CODE} GENERATOR"
- Until "{CODE} GENERATOR"
- "While / ->while(...)" returns a new
generator that will end when its passed in subroutine returns false. the
" until " pair ends when the subroutine
returns true.
if $List::Gen::LOOKAHEAD is true (the default), each reads one element
past its requested element, and for efficiency saves this value only until
the next call. no other values are saved. each supports random access, but
is optimized for sequential access.
these functions have all of the caveats of "filter", should be considered
experimental, and may change in future versions. the generator returned
should only be dereferenced in a "foreach" loop, otherwise, just like a
"filter", perl will expand it to the wrong size.
the generator will return undef the first time an access is made and the
check code indicates it is past the end.
the generator will throw an error if accessed beyond its dynamically found
limit subsequent times.
my $pow = While {$_ < 20} gen {$_**2};
# or: <0..>->map('**2')->while('< 20')
say for @$pow;
prints:
0
1
4
9
16
in general, it is faster to write it this way:
my $pow = gen {$_**2};
$pow->do(sub {
    last if $_ > 20;
    say;
});
- mutable " GENERATOR "
- "$gen->mutable"
- " mutable " takes a single fixed size
(immutable) generator, such as those produced by "
gen " and converts it into a variable size (mutable)
generator, such as those returned by " filter
".
as with filter, it is important to not use full array
dereferencing ( @$gen ) with mutable generators,
since perl will expand the generator to the wrong size. to access all of
the elements, use the "$gen->all"
method, or call "$gen->apply"
before @$gen . using a slice
@$gen[5 .. 10] is always ok, and does not require calling
"->apply".
mutable generators respond to the "
List::Gen::Done " exception, which can be produced with
either " done ",
" done_if ", or
" done_unless ". when the exception is
caught, it causes the generator to set its size, and it also triggers
any "->when_done" actions.
my $gen = mutable gen {done if $_ > 5; $_**2};
say $gen->size; # inf
say $gen->str; # 0 1 4 9 16 25
say $gen->size; # 6
generators returned from " mutable
" have a
"->set_size(int)" method that will
set the generator's size and then trigger any
"->when_done(sub{...})"
methods.
- done " [LAST_RETURN_VALUE] "
- throws an exception that will be caught by a mutable generator indicating
that the generator should set its size. if a value is passed to done, that
will be the final value returned by the generator, otherwise, the final
value will be the value returned on the previous call.
- done_if " COND VALUE "
- done_unless " COND VALUE "
- these are convenience functions for throwing " done
" exceptions. if the condition does not indicate
" done " then the function returns
" VALUE "
- strict " {CODE} "
- in the " CODE " block, calls to
functions or methods are subject to the following localizations:
- " local $List::Gen::LOOKAHEAD = 0; "
the functions " filter ",
" While " and their various forms
normally stay an element ahead of the last requested element so that an
array dereference in a " foreach "
loop ends properly. this localization disables this behavior, which
might be needed for certain algorithms. it is therefore important to
never write code like: "
for(@$strict_filtered){...} ", instead write
"$strict_filtered->do(sub{...})"
which is faster as well. the following code illustrates the difference
in behavior:
my $test = sub {
my $loud = filter {print "$_, "; $_ % 2};
print "($_:", $loud->next, '), ' for 0 .. 2;
print $/;
};
print 'normal: '; $test->();
print 'strict: '; strict {$test->()};
normal: 0, 1, 2, 3, (0:1), 4, 5, (1:3), 6, 7, (2:5),
strict: 0, 1, (0:1), 2, 3, (1:3), 4, 5, (2:5),
- " local $List::Gen::DWIM_CODE_STRINGS = 0;
"
in the dwim "$gen->(...)"
code deref syntax, if $DWIM_CODE_STRINGS has
been set to a true value, bare strings that look like code will be
interpreted as code and passed to " gen
" (string refs to " filter
"). since this behavior is fun for golf, but potentially
error prone, it is off by default. " strict
" turns it back off if it had been turned on.
" strict " returns what
" CODE " returns. "
strict " may have additional restrictions added to it in the
future.
- sequence " LIST "
- string generators, arrays, and scalars together.
" sequence " provides the
functionality of the overloaded " + "
operator on generators:
my $seq = <1 .. 10> + <20 .. 30> + <40 .. 50>;
is exactly the same as:
my $seq = sequence <1 .. 10>, <20 .. 30>, <40 .. 50>;
you can even write things like:
my $fib; $fib = [0, 1] + iterate {sum $fib->($_, $_ + 1)};
say "@$fib[0 .. 10]"; # 0 1 1 2 3 5 8 13 21 34 55
- zipgen " LIST "
- " zipgen " is a lazy version of
" zip ". it takes any combination of
generators and array refs and returns a generator. it is called
automatically when " zip " is used in
scalar context.
" zipgen " can be spelled
" genzip "
- unzip " LIST "
- " unzip " is the opposite of
" zip src1, src2 ". unzip returns 2
generators, the first returning src1, the second, src2. if
" LIST " is a single element, and is a
generator, that generator will be unzipped.
- unzipn " NUMBER LIST "
- "unzipn" is the n-dimensional precursor of "unzip". assuming a zipped
list produced by "zip" with "n" elements, "unzipn n list" returns "n"
lists corresponding to the lists originally passed to "zip". if "LIST" is
a single element, and is a generator, that generator will be unzipped. if
only passed 1 argument, "unzipn" will return a curried version of itself:
*unzip3 = unzipn 3;
my $zip3 = zip <1..>, <2..>, <3..>;
my ($x, $y, $z) = unzip3($zip3);
# $x == <1..>, $y == <2..>, $z == <3..>;
- zipgenmax " LIST "
- " zipgenmax " is a lazy version of
" zipmax ". it takes any combination of
generators and array refs and returns a generator.
- zipwith " {CODE} LIST"
- "zipwith" takes a code block and a list.
the "LIST" is zipped together and each
sub-list is passed to "CODE" when
requested. "zipwith" produces a
generator with the same length as its shortest source list.
my $triples = zipwith {\@_} <1..>, <20..>, <300..>;
say "@$_" for @$triples[0 .. 3];
1 20 300 # the first element of each list
2 21 301 # the second
3 22 302 # the third
4 23 303 # the fourth
- zipwithab "{AB_CODE} $gen1, $gen2"
- the "zipwithab" function takes a function which uses $a and $b, as well
as two lists, and returns a list analogous to "zipwith".
- zipwithmax " {CODE} LIST "
- " zipwithmax " is a version of
" zipwith " that has the ending
conditions of " zipgenmax ".
- transpose " MULTI_DIMENSIONAL_ARRAY "
- transpose " LIST "
- " transpose " computes the 90 degree
rotation of its arguments, which must be a single multidimensional array
or generator, or a list of 1+ dimensional structures.
say transpose([[1, 2, 3]])->perl; # [[1], [2], [3]]
say transpose([[1, 1], [2, 2], [3, 3]])->perl; # [[1, 2, 3], [1, 2, 3]]
say transpose(<1..>, <2..>, <3..>)->take(5)->perl;
# [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6], [5, 6, 7]]
- cartesian " {CODE} LIST "
- " cartesian " computes the cartesian
product of any number of array refs or generators, each which can be any
size. returns a generator
my $product = cartesian {$_[0] . $_[1]} [qw/a b/], [1, 2];
@$product == qw( a1 a2 b1 b2 );
- mapkey " {CODE} KEY LIST "
- this function is syntactic sugar for the following idiom
my @cartesian_product =
    map {
        my $first = $_;
        map {
            my $second = $_;
            map {
                $first . $second . $_
            } 1 .. 3
        } qw/x y z/
    } qw/a b c/;
# the same thing written with mapkey:
my @cartesian_product =
    mapkey {
        mapkey {
            mapkey {
                $_{first} . $_{second} . $_{third}
            } third => 1 .. 3
        } second => qw/x y z/
    } first => qw/a b c/;
- mapab " {CODE} PAIRS "
- this function works like the builtin " map
" but consumes a list in pairs, rather than one element at a
time. inside the " CODE " block, the
variables $a and $b
are aliased to the elements of the list. if " mapab
" is called in void context, the " CODE
" block will be executed in void context for efficiency. if
" mapab " is passed an uneven length
list, in the final iteration, $b will be
" undef "
my %hash = (a => 1, b => 2, c => 3);
my %reverse = mapab {$b, $a} %hash;
- slide " {CODE} WINDOW LIST "
- slides a " WINDOW " sized slice over
" LIST ", calling
" CODE " for each slice and collecting
the result
as the window reaches the end, the passed in slice will
shrink
print slide {"@_\n"} 2 => 1 .. 4
# 1 2
# 2 3
# 3 4
# 4 # only one element here
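the shrinking window can be reproduced in plain perl with array slices.
"my_slide" below is a hypothetical stand-in that only mirrors the behavior
shown in the example above.

```perl
use strict;
use warnings;

# hypothetical stand-in for slide: calls the block on each window,
# clamping the window's end to the last index of the list
sub my_slide (&$@) {
    my ($code, $window, @list) = @_;
    map {
        my $end = $_ + $window - 1;
        $end = $#list if $end > $#list;   # window shrinks at the end
        $code->(@list[$_ .. $end])
    } 0 .. $#list;
}

my @rows = my_slide { "@_" } 2 => 1 .. 4;
print "$_\n" for @rows;
# 1 2
# 2 3
# 3 4
# 4
```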
- remove " {CODE} ARRAY|HASH "
- "remove" removes and returns elements from its source when "CODE"
returns true. in the code block, if the source is an array, $_ is aliased
to its elements. if the source is a hash, $_ is aliased to its keys (and a
list of the removed "key => value" pairs is returned).
my @array = (1, 7, 6, 3, 8, 4);
my @removed = remove {$_ > 5} @array;
say "@array"; # 1 3 4
say "@removed"; # 7 6 8
in list context, " remove "
returns the list of removed elements/pairs. in scalar context, it
returns the number of removals. " remove
" will not build a return list in void context for
efficiency.
- d " [SCALAR] "
- deref " [SCALAR] "
- dereference a " SCALAR ",
" ARRAY ", or "
HASH " reference. any other value is returned unchanged
print join " " => map deref, 1, [2, 3, 4], \5, {6 => 7}, 8, 9, 10;
# prints 1 2 3 4 5 6 7 8 9 10
- curse " HASHREF PACKAGE "
- many of the functions in this package utilize closure objects to avoid the
speed penalty of dereferencing fields in their object during each access.
" curse " is similar to
" bless " for these objects and while a
blessing makes a reference into a member of an existing package, a curse
conjures a new package to do the reference's bidding
package Closure::Object;
sub new {
    my ($class, $name, $value) = @_;
    curse {
        get => sub {$value},
        set => sub {$value = $_[1]},
        name => sub {$name},
    } => $class
}
"Closure::Object" is
functionally equivalent to the following normal perl object, but with
faster method calls since there are no hash lookups or other
dereferences (around 40-50% faster for short getter/setter type
methods)
package Normal::Object;
sub new {
    my ($class, $name, $value) = @_;
    bless {
        name => $name,
        value => $value,
    } => $class
}
sub get {$_[0]{value}}
sub set {$_[0]{value} = $_[1]}
sub name {$_[0]{name}}
the trade off is in creation time / memory, since any good
curse requires drawing at least a few pentagrams in the blood of an
innocent package.
the returned object is blessed into the conjured package,
which inherits from the provided " PACKAGE
". always use
"$obj->isa(...)" rather than
" ref $obj eq ... " due to this. the
conjured package name matches
"/${PACKAGE}::_\d+/"
special keys:
-bless => $reference # returned instead of HASHREF
-overload => [fallback => 1, '""' => sub {...}]
when fast just isn't fast enough, since most cursed methods
don't need to be passed their object, the fastest way to call the method
is:
my $obj = Closure::Object->new('tim', 3);
my $set = $obj->{set}; # fetch the closure
# or $obj->can('set')
$set->(undef, $_) for 1 .. 1_000_000; # call without first arg
which is around 70% faster than pre-caching a method from a
normal object for short getter/setter methods, and is the method used
internally in this module.
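the idea of conjuring a package per object can be sketched in plain perl.
"my_curse" and "Closure::Counter" below are hypothetical names made up for
this illustration; the sketch ignores the "-bless" and "-overload" keys and
is an assumption about the general technique, not the module's source.

```perl
use strict;
use warnings;

my $conjured = 0;    # counter used to build unique package names

# hypothetical curse-like helper: installs each closure in the hash as a
# method of a freshly named package that inherits from $class
sub my_curse {
    my ($methods, $class) = @_;
    my $package = "${class}::_" . ++$conjured;
    no strict 'refs';
    @{"${package}::ISA"} = ($class);
    *{"${package}::$_"} = $methods->{$_} for keys %$methods;
    return bless $methods, $package;
}

package Closure::Counter;

sub new {
    my ($class, $value) = @_;
    main::my_curse({
        get => sub { $value },
        set => sub { $value = $_[1] },
    }, $class);
}

package main;

my $obj = Closure::Counter->new(3);
$obj->set(10);
print $obj->get, "\n";                                    # 10
print $obj->isa('Closure::Counter') ? "yes\n" : "no\n";   # yes
print ref($obj), "\n";                                    # Closure::Counter::_1
```

since the methods are closures over $value, none of them needs to look
anything up in the object, which is why "ref $obj" names the conjured
package and "->isa(...)" is the reliable class check.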
- see List::Gen::Cookbook for usage tips.
- see List::Gen::Benchmark for performance tips.
- see List::Gen::Haskell for an experimental implementation of haskell's
lazy list behavior.
- see List::Gen::Lazy for the tools used to create List::Gen::Haskell.
- see List::Gen::Lazy::Ops for some of perl's operators implemented as lazy
haskell like functions.
- see List::Gen::Lazy::Builtins for most of perl's builtin functions
implemented as lazy haskell like functions.
- see List::Gen::Perl6 for a source filter that adds perl6's meta operators
to use with generators, rather than the default overloaded operators
version 0.90 added "glob" to the default export list (which gives you
syntactic ranges "<1 .. 10>" and list comprehensions). version 0.90 also
adds many new features and bug-fixes. as usual, if anything is broken,
please send in a bug report. the ending conditions of "zip" and "zipgen"
have changed, see the documentation above. "test" has been removed from
the default export list. setting $List::Gen::LIST true to enable list
context generators is no longer supported and will now throw an error.
"list" has been added to the default export list. "genzip" has been
renamed "zipgen".
version 0.70 comes with a bunch of new features. if anything is broken,
please let me know. see "filter" for a minor behavior change.
versions 0.50 and 0.60 break some of the syntax from previous
versions, for the better.
- code generation
- a number of the syntactic shortcuts that List::Gen provides will construct
and then evaluate code behind the scenes. Normally this is transparent,
but if you are trying to debug a problem, hidden code is never a good
thing. You can lexically enable the printing of evaled code with:
local $List::Gen::SAY_EVAL = 1;
my $fib = <0, 1, *+*...>;
# eval: ' @pre = (0, 1)' at (file.pl) line ##
# eval: 'List::Gen::iterate { if (@pre) {shift @pre}
# else { $fetch->(undef, $_ - 2) + $fetch->(undef, $_ - 1) }
# } 9**9**9' at (file.pl) line ##
my $gen = <1..10>->map('$_*2 + 1')->grep('some_predicate');
# eval: 'sub ($) {$_*2 + 1}' at (file.pl) line ##
# eval: 'sub ($) {some_predicate($_)}' at (file.pl) line ##
a given code string is only evaluated once and is then cached,
so you will not see any additional output when using the same code
strings in multiple places. in some cases (like the iterate example
above) the code is closing over external variables (@pre and $fetch)
so you will not be able to see everything, but $SAY_EVAL should
be a helpful debugging aid.
any time that code evaluation fails, an immediate fatal error
is thrown. the value of $SAY_EVAL does not
matter in that case.
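the caching and failure behavior described above can be sketched in core perl.
this is a minimal illustration, not the module's own code: the compile
function, the %compiled cache, the @log array, and this file's $SAY_EVAL are
all hypothetical stand-ins.

```perl
use strict;
use warnings;

our $SAY_EVAL = 0;      # hypothetical stand-in for $List::Gen::SAY_EVAL
my %compiled;           # code string => compiled sub
my @log;                # trace lines, collected so they can be inspected

# compile a code string into a sub at most once; later calls hit the cache
sub compile {
    my ($src) = @_;
    return $compiled{$src} ||= do {
        my $code = "sub {$src}";
        push @log, "eval: '$code'" if $SAY_EVAL;
        # a failed eval is fatal whether or not $SAY_EVAL is set
        eval $code or die "compile failed: $@";
    };
}

my ($double, $again);
{
    local $SAY_EVAL = 1;                # tracing on for this dynamic scope
    $double = compile('$_ * 2 + 1');
    $again  = compile('$_ * 2 + 1');    # cached: no second trace line
}
print "$_\n" for @log;                  # eval: 'sub {$_ * 2 + 1}'
print join(' ', map { $double->() } 1 .. 3), "\n";   # 3 5 7
```

the second compile call returns the same code reference as the first, which is
why only one trace line is recorded.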
- captures of compile time constructed lists
- the "cap" function and its twin operator "&\" are faster than the
"[...]" construct because they do not
copy their arguments. this is why the elements of the captures remain
aliased to their arguments. this is normally fine, but it has an
interesting effect with compile time constructed constant lists:
my $max = 1000;
my $range = & \(1 .. $max); # 57% faster than [1 .. $max]
my $nums = & \(1 .. 1000); # 366% faster than [1 .. 1000], but cheating
the first example shows the expected speed increase due to not
copying the values into a new empty array reference. the second example
is much faster at runtime than the "[...]" syntax, but this speed is
deceptive. the reason is that
the list being passed in as an argument is generated by the compiler
before runtime begins. so all perl has to do is place the values on the
stack, and call the function.
normally this is fine, but there is one catch to be aware of,
and that is that a capture of a compile time constant list in a loop or
subroutine (or any structure that can execute the same segment of code
repeatedly) will always return a reference to an array of the same
elements.
# two instances give two separate arrays
my ($a, $b) = (&\(1 .. 3), &\(1 .. 3));
$_ += 10 for @$a;
say "@$a : @$b"; # 11 12 13 : 1 2 3
# here the one instance returns the same elements twice
my ($x, $y) = map &\(1 .. 3), 1 .. 2;
$_ += 10 for @$x;
say "@$x : @$y"; # 11 12 13 : 11 12 13
this only applies to compile time constructed constant lists.
anything containing a variable or non constant function call will give
you separate array elements, as shown below:
my ($low, $high) = (1, 3);
my ($x, $y) = map &\($low .. $high), 1 .. 2; # non constant list
$_ += 10 for @$x;
say "@$x : @$y"; # 11 12 13 : 1 2 3
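the aliasing itself can be seen without List::Gen. a minimal sketch, assuming
that a capture behaves like a function returning a reference to its own @_
(the name capture is a hypothetical stand-in, not the module's function):

```perl
use strict;
use warnings;

# hypothetical stand-in for a capture: return a reference to @_ itself.
# no elements are copied, so each element stays aliased to its argument.
sub capture { \@_ }

my @source  = (1, 2, 3);
my $aliased = capture(@source);   # elements alias those of @source
my $copied  = [@source];          # elements are fresh copies

$_ += 10  for @$aliased;          # writes through to @source
$_ += 100 for @$copied;           # touches only the copy

print "@source\n";                # 11 12 13
print "@$copied\n";               # 101 102 103
```

skipping the copy is where the speed comes from, and it is also why two
captures of the same underlying elements see each other's changes.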
Eric Strom, "<asg at cpan.org>"
overloading has gotten fairly complicated and is probably in need of a rewrite.
if any edge cases do not work, please send in a bug report.
both threaded methods ("$gen->threads_slice(...)") and
function composition with overloaded operators (made with
"List::Gen::Lazy::fn {...}") do not work
properly in versions of perl before 5.10. patches welcome.
report any bugs / feature requests to
"bug-list-gen at rt.cpan.org", or through
the web interface at
<http://rt.cpan.org/NoAuth/ReportBug.html?Queue=List-Gen>.
comments / feedback / patches are also welcome.
copyright 2009-2011 Eric Strom.
this program is free software; you can redistribute it and/or
modify it under the terms of either: the GNU General Public License as
published by the Free Software Foundation; or the Artistic License.
see http://dev.perl.org/licenses/ for more information.