|
NAMEList::BinarySearch - Binary Search within a sorted array.SYNOPSISThis module performs a binary search on an array.Examples: use List::BinarySearch qw( :all ); # ... or ... use List::BinarySearch qw( binsearch binsearch_pos binsearch_range ); # Some ordered arrays to search within. @num_array = ( 100, 200, 300, 400, 500 ); @str_array = qw( Bach Beethoven Brahms Mozart Schubert ); # Find the lowest index of a matching element. $index = binsearch {$a <=> $b} 300, @num_array; $index = binsearch {$a cmp $b} 'Mozart', @str_array; # Stringy cmp. $index = binsearch {$a <=> $b} 42, @num_array; # not found: undef # Find the lowest index of a matching element, or best insert point. $index = binsearch_pos {$a cmp $b} 'Chopin', @str_array; # Insert at [3]. $index = binsearch_pos {$a <=> $b} 600, @num_array; # Insert at [5]. splice @num_array, $index, 0, 600 if( $num_array[$index] != 600 ); # Insertion at [5] $index = binsearch_pos { $a <=> $b } 200, @num_array; # Matched at [1]. # The following functions return an inclusive range. my( $low_ix, $high_ix ) = binsearch_range { $a cmp $b } 'Beethoven', 'Mozart', @str_array; # Returns ( 1, 3 ), meaning ( 1 .. 3 ). my( $low_ix, $high_ix ) = binsearch_range { $a <=> $b } 200, 400, @num_array; DESCRIPTIONA binary search searches sorted lists using a divide and conquer technique. On each iteration the search domain is cut in half, until the result is found. The computational complexity of a binary search is O(log n).The binary search algorithm implemented in this module is known as a Deferred Detection variant on the traditional Binary Search. Deferred Detection provides stable searches. Stable binary search algorithms have the following characteristics, contrasted with their unstable binary search cousins:
This module has a companion "XS" module: List::BinarySearch::XS which users are strongly encouraged to install as well. If List::BinarySearch::XS is also installed, "binsearch" and "binsearch_pos" will use XS code. This behavior may be overridden by setting $ENV{List_BinarySearch_PP} to a true value. Most CPAN installers will either automatically install the XS module, or prompt to automatically install it. See CONFIGURATION for details. RATIONALEA binary search is pretty simple, right? Why use a module for this?Quoting from Wikipedia <http://en.wikipedia.org/wiki/Binary_search_algorithm>: When Jon Bentley assigned it as a problem in a course for professional programmers, he found that an astounding ninety percent failed to code a binary search correctly after several hours of working on it, and another study shows that accurate code for it is only found in five out of twenty textbooks. Furthermore, Bentley's own implementation of binary search, published in his 1986 book Programming Pearls, contains an error that remained undetected for over twenty years. So to answer the question, you might use a module so that you don't have to write, test, and debug your own implementation. Perl has "grep", hashes, and other alternatives, right? Yes, before using this module the user should weigh the other options such as linear searches ( "grep" or "List::Util::first" ), or hash based searches. A binary search requires an ordered list, so one must weigh the cost of sorting or maintaining the list in sorted order. Ordered lists have O(n) time complexity for inserts. Binary Searches are best when the data set is already ordered, or will be searched enough times to justify the cost of an initial sort. There are cases where a binary search may be an excellent choice. Finding the first matching element in a list of 1,000,000 items with a linear search would have a worst-case of 1,000,000 iterations, whereas the worst case for a binary search of 1,000,000 elements is about 20 iterations. In fact, if many lookups will be performed on a seldom-changed list, the savings of O(log n) lookups may outweigh the cost of sorting or performing occasional linear time inserts. EXPORTNothing is exported by default. "binsearch", "binsearch_pos", and "binsearch_range" may be exported by listing them on the export list.Or import all functions by specifying ":all". SUBROUTINES/METHODSWHICH SEARCH ROUTINE TO USE
binsearch CODE NEEDLE ARRAY_HAYSTACK$first_found_ix = binsearch { $a cmp $b } $needle, @haystack; Pass a code block, a search target, and an array to search. Uses the supplied code block $needle to test the needle against elements in @haystack. See the section entitled The Callback Block, below, for an explanation of how the comparator works (hint: very similar to "sort { $a <=> $b } ..." ). Return value will be the lowest index of an element that matches target, or undef if target isn't found. binsearch_pos CODE NEEDLE ARRAY_HAYSTACK$first_found_ix = binsearch_pos { $a cmp $b } $needle, @haystack; The only difference between this function and "binsearch" is its return value upon failure. "binsearch" returns undef upon failure. "binsearch_pos" returns the index of a valid insert point for $needle. Pass a code block, a search target, and an array to search. Uses the code block to test $needle against elements in @haystack. Return value is the index of the first element equaling $needle. If no element is found, the best insert-point for $needle is returned. binsearch_range CODE LOW_NEEDLE HIGH_NEEDLE ARRAY_HAYSTACKmy( $low, $high ) = binsearch_range { $a <=> $b }, $low_needle, $high_needle, @haystack; Given $low_needle and $high_needle, returns a set of indices that represent the range of elements fitting within $low_needle and $high_needle's bounds. This might be useful, for example, in finding all transations that occurred between 02012013 and 02292013. This algorithm was adapted from Mastering Algorithms with Perl, page 172 and 173. The callback block (The comparator)Comparators in List::BinarySearch are used to compare the target (needle) with individual haystack elements, returning the result of the relational comparison of the two values. A good example would be the code block in a "sort" function.Basic comparators might be defined like this: # Numeric comparisons: binsearch { $a <=> $b } $needle, @haystack; # Stringwise comparisons: binsearch { $a cmp $b } $needle, @haystack; # Unicode Collation Algorithm comparisons $Collator = Unicode::Collate->new; binsearch { $Collator->( $a, $b ) } $needle, @haystack; $a represents the target. $b represents the contents of the haystack element being tested. This leads to an asymmetry that might be prone to "gotchas" when writing custom comparators for searching complex data structures. As an example, consider the following data structure: my @structure = ( [ 100, 'ape' ], [ 200, 'cat' ], [ 300, 'dog' ], [ 400, 'frog' ] ); A numeric custom comparator for such a data structure would look like this: sub{ $a <=> $b->[0] } In this regard, the callback is unlike "sort", because "sort" is always comparing to elements, whereas "binsearch" is comparing a target with an element. Just as with "sort", the comparator must return -1, 0, or 1 to signify "less than", "equal to", or "greater than". DATA SET REQUIREMENTSA well written general algorithm should place as few demands on its data as practical. The requirements that these Binary Search algorithms impose are:
UNICODE SUPPORTLists sorted according to the Unicode Collation Algorithm must be searched using the same Unicode Collation Algorithm, Here's an example using Unicode::Collate's "$Collator->cmp($a,$b)":my $found_index = binsearch { $Collator->cmp($a, $b) } $needle, @haystack; CONFIGURATION AND ENVIRONMENTList::BinarySearch is a thin layer that will attempt to load List::BinarySearch::XS first, and if that module is unavailable, will fall back to List::BinarySearch::PP which is provided with this distribution.Most CPAN installers will automatically pull in and install the XS plugin for this module. If in interactive mode, they will prompt first. To override default behavior, forcing CPAN installers to not pull in the XS module, set the environment variable LBS_NO_XS true prior to installation. There will be no prompt, and the XS module won't be pulled in and installed. However, the List::BinarySearch::XS plugin is strongly recommended, and should only be skipped in environments where XS modules cannot be compiled. If one wishes to install the XS module beforhand, or at any time later on, just installing it in the usual fashion is sufficient for List::BinarySearch to recognize and start using it. By installing List::BinarySearch::XS, the pure-Perl versions of "binsearch" and "binsearch_pos" will be automatically replaced with XS versions for markedly improved performance. "binsearch_range" also benefits from the XS plug-in, since internally it makes calls to "binsearch_pos". Users are strongly advised to install List::BinarySearch::XS. If, after installing List::BinarySearch::XS, one wishes to disable the XS plugin, setting $ENV{List_BinarySearch_PP} to a true value will prevent the XS module from being used by List::BinarySearch. This setting will have no effect on users who use List::BinarySearch::XS directly. For the sake of code portability, it is recommended to use List::BinarySearch as the front-end, as it will automatically and portably downgrade to the pure-Perl version if the XS module can't be loaded. DEPENDENCIESThis module uses Exporter, and automatically makes use of List::BinarySearch::XS if it's installed on the user's system.This module will attempt to install List::BinarySearch::XS unless the environment variable "LBS_NO_XS" is set prior to install, or if in interactive mode, the user opts to skip this recommended step. This module supports Perl versions 5.8 and newer. The optional XS extension also supports Perl 5.8 and newer. INCOMPATIBILITIESThis module is incompatible with Perl versions prior to 5.8. In particular, its use of prototypes isn't compatible with Perl 5.6 or older. It would be easy to eliminate the use of prototypes, but doing so would change calling syntax.DIAGNOSTICSSEE ALSOList::BinarySearch::XS: An XS plugin for this module; install it, and this module will use it automatically for a nice performance improvement. May also be used on its own.AUTHORDavid Oswald, "<davido at cpan.org>"If the documentation fails to answer a question, or if you have a comment or suggestion, send me an email. BUGS AND LIMITATIONSPlease report any bugs or feature requests to <https://github.com/daoswald/List-BinarySearch/issues>. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.SUPPORTYou can find documentation for this module with the perldoc command.perldoc List::BinarySearch This module is maintained in a public repo at Github. You may look for information at:
ACKNOWLEDGEMENTSThank-you to those who provided advice on user interface and XS interoperability.Mastering Algorithms with Perl <http://shop.oreilly.com/product/9781565923980.do>, from O'Reilly <http://www.oreilly.com>: for the inspiration (and much of the code) behind the positional and ranged searches. Quoting MAwP: "...the binary search was first documented in 1946 but the first algorithm that worked for all sizes of array was not published until 1962." (A summary of a passage from Knuth: Sorting and Searching, 6.2.1.) Although the basic idea of binary search is comparatively straightforward, the details can be surprisingly tricky... -- Donald Knuth LICENSE AND COPYRIGHTCopyright 2012 David Oswald.This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License. See http://dev.perl.org/licenses/ for more information.
Visit the GSP FreeBSD Man Page Interface. |