NAME

Getopt::Tabular - table-driven argument parsing for Perl 5

SYNOPSIS

    use Getopt::Tabular;

(or)

    use Getopt::Tabular qw/GetOptions SetHelp SetHelpOption
                           SetError GetError/;

    ...

    &Getopt::Tabular::SetHelp (long_help, usage_string);

    @opt_table = (
                  [section_description, "section"],
                  [option, type, num_values, option_data, help_string],
                  ...
                 );
    &GetOptions (\@opt_table, \@ARGV [, \@newARGV]) || exit 1;

DESCRIPTION

Getopt::Tabular is a Perl 5 module for table-driven argument parsing, vaguely inspired by John Ousterhout's Tk_ParseArgv. All you really need to do to use the package is set up a table describing all your command-line options, and call &GetOptions with three arguments: a reference to your option table, a reference to @ARGV (or something like it), and an optional third array reference (say, to @newARGV). &GetOptions will process all arguments in @ARGV, and copy any leftover arguments (i.e. those that are not options or arguments to some option) to the @newARGV array. (If the @newARGV argument is not supplied, "GetOptions" will replace @ARGV with the stripped-down argument list.) If there are any invalid options, "GetOptions" will print an error message and return 0.

Before I tell you all about why Getopt::Tabular is a wonderful thing, let me explain some of the terminology that will keep popping up here.

FEATURES

Now for the advertising, i.e. why Getopt::Tabular is a good thing.
In general, I have found that Getopt::Tabular tends to encourage programs with long lists of sophisticated options, leading to great flexibility, intelligent operation, and the potential for insanely long command lines.

BASIC OPERATION

The basic operation of Getopt::Tabular is driven by an option table, which is just a list of option descriptions (otherwise known as option table entries, or just entries). Each option description tells "GetOptions" everything it needs to know when it encounters a particular option on the command line. For instance,

    ["-foo", "integer", 2, \@Foo, "set the foo values"]

means that whenever "-foo" is seen on the command line, "GetOptions" is to make sure that the next two arguments are integers, and copy them into the caller's @Foo array. (Well, really into the @Foo array where the option table is defined. This is almost always the same as "GetOptions"' caller, though.)

Typically, you'll group a bunch of option descriptions together like this:

    @options =
        (["-range", "integer", 2, \@Range,
          "set the range of allowed values"],
         ["-file", "string", 1, \$File,
          "set the output file"],
         ["-clobber", "boolean", 0, \$Clobber,
          "clobber existing files"],
         ...
        );

and then call "GetOptions" like this:

    &GetOptions (\@options, \@ARGV) || exit 1;

which replaces @ARGV with a new array containing all the arguments left over after options and their arguments have been removed. You can also call "GetOptions" with three arguments, like this:

    &GetOptions (\@options, \@ARGV, \@newARGV) || exit 1;

in which case @ARGV is untouched, and @newARGV gets the leftover arguments.

In case of error, "GetOptions" prints enough information for the user to figure out what's going wrong. If you supply one, it'll even print out a brief usage message in case of error. Thus, it's enough to just "exit 1" when "GetOptions" indicates an error by returning 0.
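
Putting the pieces above together, a complete script might look like the following minimal sketch. It assumes Getopt::Tabular is installed; the default values, the sample argument list in @args, and the file names are made up for illustration, and @args is used in place of @ARGV only so that the run is repeatable.

```perl
use strict;
use warnings;
use Getopt::Tabular qw(GetOptions);

# Defaults; GetOptions overwrites these through the references in the table.
my @Range   = (0, 100);
my $File    = 'out.dat';
my $Clobber = 0;

my @options = (
    ["-range",   "integer", 2, \@Range,   "set the range of allowed values"],
    ["-file",    "string",  1, \$File,    "set the output file"],
    ["-clobber", "boolean", 0, \$Clobber, "clobber existing files"],
);

# Hypothetical command line, standing in for @ARGV.
my @args = qw(-range 1 10 -file results.txt -clobber input.dat);
my @leftover;

# Three-argument form: @args is left alone, @leftover gets the non-options.
GetOptions(\@options, \@args, \@leftover) || exit 1;

print "range:    @Range\n";
print "file:     $File\n";
print "clobber:  $Clobber\n";
print "leftover: @leftover\n";
```

If the module behaves as described above, @Range should end up holding the two integers that follow "-range", $File the string that follows "-file", and @leftover the single non-option argument.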

Detailed descriptions of the contents of an option table entry are given next, followed by the complete run-down of available types, full details on error handling, and how help text is generated.

OPTION TABLE ENTRIES

The fields in the option table control how arguments are parsed, so it's important to understand each one in turn. First, the format of entries in the table is fairly rigid, even though this isn't really necessary with Perl. It's done that way to make the Getopt::Tabular code a little easier; the drawback is that some entries will have unused values (e.g. the "num_values" field is never used for boolean options, but you still have to put something there as a place-holder). The fields are as follows:

OPTION TYPES

The option type field is the single most important field in the table, as the type for an option "-foo" determines (along with num_values) what action "GetOptions" takes when it sees "-foo" on the command line: how many following arguments become "-foo"'s arguments, what regular expression those arguments must conform to, or whether some other action should be taken.

As mentioned above, there are three main classes of argument types:
Argument-driven option types
Constant-valued option types
Other option types

ERROR HANDLING

Generally, handling errors in the argument list is pretty transparent: "GetOptions" (or one of its minions) generates an error message and assigns an error class, "GetOptions" prints the message to the standard error, and returns 0. You can access the error class and error message using the "GetError" routine:

    ($err_class, $err_msg) = &Getopt::Tabular::GetError ();

(Like "SetError", "GetError" can also be exported from Getopt::Tabular.) The error message is pretty simple: it is an explanation for the end user of what went wrong, which is why "GetOptions" just prints it out and forgets about it. The error class is further information that might be useful for your program; the current values are:
Note that most of these are errors on the end user's part, such as bad or missing arguments. There are also errors that can be caused by you, the programmer, such as bad or missing values in the option table; these generally result in "GetOptions" croaking so that your program dies immediately with enough information that you can figure out where the mistake is. bad_eval is a borderline case; there are conceivably cases where the end user's input can result in bogus code to evaluate, so I grouped this one in the "user errors" class. Finally, asking for help isn't really an error, but the assumption is that you probably shouldn't continue normal processing after printing out the help, so "GetOptions" returns 0 in this case. You can always fetch the error class with "GetError" if you want to treat real errors differently from help requests.

HELP TEXT

One of Getopt::Tabular's niftier features is the ability to generate and format a pile of useful help text from the snippets of help you include in your option table. The best way to illustrate this is with a couple of brief examples.

First, it's helpful to know how the user can trigger a help display. This is quite simple: by default, "GetOptions" always has a "-help" option, presence of which on the command line triggers a help display. (Actually, the help option is really your preferred option prefix plus "help". So, if you make GNU-style options take precedence as follows:

    &Getopt::Tabular::SetOptionPatterns (qw|(--)([\w-]+) (-)(\w+)|);

then the help option will be "--help".) There is only one help option available, and you can set it by calling &SetHelpOption (another optional export).

Note that in addition to the option help embedded in the option table, "GetOptions" can optionally print out two other messages: a descriptive text (usually a short paragraph giving a rough overview of what your program does, possibly referring the user to the fine manual page), and a usage text.
These are both supplied by calling &SetHelp, e.g.

    $Help = <<HELP;
    This is the foo program.  It reads one file (specified by -infile),
    operates on it some unspecified way (possibly modified by
    -threshold), and does absolutely nothing with the results.
    (The utility of the -clobber option has yet to be established.)
    HELP

    $Usage = <<USAGE;
    usage: foo [options]
           foo -help to list options
    USAGE

    &Getopt::Tabular::SetHelp ($Help, $Usage);

Note that either of the long help or usage strings may be empty, in which case "GetOptions" simply won't print them. In the case where both are supplied, the long help message is printed first, followed by the option help summary, followed by the usage. "GetOptions" inserts enough blank lines to make the output look just fine on its own, so you shouldn't pad either the long help or usage message with blanks. (It looks best if each ends with a newline, though, so setting the help strings with here-documents, as in this example, is the recommended approach.)

As an example of the help display generated by a typical option table, let's take a look at the following:

    $Verbose = 1;
    $Clobber = 0;
    undef $InFile;
    @Threshold = (0, 1);

    @argtbl = (["-verbose|-quiet", "boolean", 0, \$Verbose,
                "be noisy"],
               ["-clobber", "boolean", 0, \$Clobber,
                "overwrite existing files"],
               ["-infile", "string", 1, \$InFile,
                "specify the input file from which to read a large " .
                "and sundry variety of data, to which many " .
                "interesting operations will be applied", "<f>"],
               ["-threshold", "float", 2, \@Threshold,
                "only consider values between <v1> and <v2>",
                "<v1> <v2>"]);

Assuming you haven't supplied long help or usage strings, then when "GetOptions" encounters the help option, it will immediately stop parsing arguments and print out the following option summary:

    Summary of options:
       -verbose    be noisy [default]
       -quiet      opposite of -verbose
       -clobber    overwrite existing files
       -noclobber  opposite of -clobber [default]
       -infile <f> specify the input file from which to read a large and
                   sundry variety of data, to which many interesting
                   operations will be applied
       -threshold <v1> <v2>
                   only consider values between <v1> and <v2> [default: 0 1]

There are a number of interesting things to note here. First, there are three option table fields that affect the generation of help text: option, help_string, and argdesc. Note how the argdesc strings are simply option placeholders, usually used to 1) indicate how many values are expected to follow an option, 2) (possibly) imply what form they take (although that's not really shown here), and 3) explain the exact meaning of the values in the help text. argdesc is just a string like the help string; you can put whatever you like in it. What I've shown above is just my personal preference (which may well evolve).

A new feature with version 0.3 of Getopt::Tabular is the inclusion of default values with the help for certain options. A number of conditions must be fulfilled for this to happen for a given option: first, the option type must be one of the "argument-driven" types, such as "integer", "float", "string", or a user-defined type. Second, the option data field must refer either to a defined scalar value (for scalar-valued options) or to a list of one or more defined values (for vector-valued options). Thus, in the above example, the "-infile" option doesn't have its default printed because the $InFile scalar is undefined.
Likewise, if the @Threshold array were the empty list "()", or a list of undefined values "(undef,undef)", then the default value for "-threshold" also would not have been printed.

The formatting is done as follows: enough room is made on the left-hand side for the longest option name, initially omitting the argument placeholders. Then, if an option has placeholders, and there is room for them in between the option and the help string, everything (option, placeholders, help string) is printed together. An example of this is the "-infile" option: here, "-infile <f>" is just small enough to fit in the 12-character column (10 characters because that is the length of the longest option, and 2 blanks), so the help text is placed right after it on the same line. However, the "-threshold" option becomes too long when its argument placeholders are appended to it, so the help text is pushed onto the next line.

In any event, the help string supplied by the caller starts at the same column, and is filled to make a nice paragraph of help. "GetOptions" will fill to the width of the terminal (or 80 columns if it fails to find the terminal width).

Finally, you can have pseudo entries of type section, which are important to make long option lists readable (and one consequence of using Getopt::Tabular is programs with ridiculously long option lists -- not altogether a bad thing, I suppose). For example, this table fragment:

    @argtbl = (...,
               ["-foo", "integer", 1, \$Foo,
                "set the foo value", "f"],
               ["-enterfoomode", "call", 0, \&enter_foo_mode,
                "enter foo mode"],
               ["Non-foo related options", "section"],
               ["-bar", "string", 2, \@Bar,
                "set the bar strings (which have nothing whatsoever " .
                "to do with foo)", "b1 b2"],
               ...);

results in the following chunk of help text:

       -foo f         set the foo value
       -enterfoomode  enter foo mode

    -- Non-foo related options ---------------------------------
       -bar b1 b2     set the bar strings (which have nothing whatsoever to do
                      with foo)

(This example also illustrates a slightly different style of argument placeholder. Take your pick, or invent your own!)

SPOOF MODE

Since callbacks from the command line ("call" and "eval" options) can do anything, they might be quite expensive. In certain cases, then, you might want to make an initial pass over the command line to ensure that everything is OK before parsing it "for real" and incurring all those expensive callbacks. Thus, "Getopt::Tabular" provides a "spoof" mode for parsing a command line without side-effects. In the simplest case, you can access spoof mode like this:

    use Getopt::Tabular qw(SpoofGetOptions GetOptions);
       .
       .
       .
    &SpoofGetOptions (\@options, \@ARGV, \@newARGV) || exit 1;

and then later on, you would call "GetOptions" with the original @ARGV (so it can do what "SpoofGetOptions" merely pretended to do):

    &GetOptions (\@options, \@ARGV, \@newARGV) || exit 1;

For most option types, any errors that "GetOptions" would catch should also be caught by "SpoofGetOptions" -- so you might initially think that you can get away without that "|| exit 1" after calling "GetOptions". However, it's a good idea for a couple of reasons. First, you might have inadvertently changed @ARGV -- this is usually a bug and a silly thing to do, so you'd probably want your program to crash loudly rather than fail mysteriously later on. Second, and more likely, some of those expensive operations that you're initially avoiding by using "SpoofGetOptions" might themselves fail -- which would cause "GetOptions" to return false where "SpoofGetOptions" completes without a problem.
(Finally, there's the faint possibility of bugs in "Getopt::Tabular" that would cause different behaviour in spoof mode and real mode -- this really shouldn't happen, though.)

In reality, using spoof mode requires a bit more work. In particular, the whole reason for spoof argument parsing is to avoid expensive callbacks, but since callbacks can eat any number of command line arguments, you have to emulate them in some way. It's not possible for "SpoofGetOptions" to do this for you, so you have to help out by supplying "spoof" callbacks. As an example, let's say you have a callback option that eats one argument (a filename) and immediately reads that file:

    @filedata = ();

    sub read_file
    {
       my ($opt, $args) = @_;

       warn ("$opt option requires an argument\n"), return 0
          unless @$args;
       my $file = shift @$args;
       open (FILE, $file) ||
          (warn ("$file: $!\n"), return 0);
       push (@filedata, <FILE>);
       close (FILE);
       return 1;
    }

    @options =
       (['-read_file', 'call', undef, \&read_file]);

Since "-read_file" could occur any number of times on the command line, we might end up reading an awful lot of files, and thus it might be a long time before we catch errors late in the command line. Thus, we'd like to do a "spoof" pass over the command line to catch all errors. A simplistic approach would be to supply a spoof callback that just eats one argument and returns success:

    sub spoof_read_file
    {
       my ($opt, $args) = @_;

       (warn ("$opt option requires an argument\n"), return 0)
          unless @$args;
       shift @$args;
       return 1;
    }

Then, you have to tell "Getopt::Tabular" about this alternate callback with no side-effects (apart from eating that one argument):

    &Getopt::Tabular::SetSpoofCodes (-read_file => \&spoof_read_file);

("SetSpoofCodes" just takes a list of key/value pairs, where the keys are "call" or "eval" options, and the values are the "no side-effects" callbacks. Naturally, the replacement callback for an "eval" option should be a string, and for a "call" option it should be a code reference.
This is not actually checked, however, until you call "SpoofGetOptions", because "SetSpoofCodes" doesn't know whether options are "call" or "eval" or what.)

A more useful "spoof_read_file", however, would actually check if the requested file exists -- i.e., we should try to catch as many errors as possible, as early as possible:

    sub spoof_read_file
    {
       my ($opt, $args) = @_;

       warn ("$opt option requires an argument\n"), return 0
          unless @$args;
       my $file = shift @$args;
       warn ("$file does not exist or is not readable\n"), return 0
          unless -r $file;
       return 1;
    }

Finally, you can frequently merge the "real" and "spoof" callback into one subroutine:

    sub read_file
    {
       my ($opt, $args, $spoof) = @_;

       warn ("$opt option requires an argument\n"), return 0
          unless @$args;
       my $file = shift @$args;
       warn ("$file does not exist or is not readable\n"), return 0
          unless -r $file;
       return 1 if $spoof;
       open (FILE, $file) ||
          (warn ("$file: $!\n"), return 0);
       push (@filedata, <FILE>);
       close (FILE);
       return 1;
    }

And then, when specifying the replacement callback to "SetSpoofCodes", just create an anonymous sub that calls "read_file" with $spoof true:

    &Getopt::Tabular::SetSpoofCodes
       (-read_file => sub { &read_file (@_[0,1], 1) });

Even though this means a bigger and more complicated callback, you only need one such callback -- the alternative is to carry around both "read_file" and "spoof_read_file", which might do redundant processing of the argument list.

AUTHOR

Greg Ward <greg@bic.mni.mcgill.ca>

Started in July, 1995 as ParseArgs.pm, with John Ousterhout's Tk_ParseArgv.c as a loose inspiration. Many many features added over the ensuing months; documentation written in a mad frenzy 16-18 April, 1996. Renamed to Getopt::Tabular, revamped, reorganized, and documentation expanded 8-11 November, 1996.

Copyright (c) 1995-97 Greg Ward. All rights reserved. This is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

BUGS

The documentation is bigger than the code, and I still haven't covered option patterns or extending the type system (apart from pattern types). Yow!

No support for list-valued options, although you can roll your own with call options. (See the demo program included with the distribution for an example.)

Error messages are hard-coded to English.
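
As a rough illustration of rolling your own list-valued option with a "call" option, the sketch below uses the same ($opt, $args) callback convention as the "read_file" example in SPOOF MODE. It is not the demo program from the distribution: the option name "-include", the @Include array, and the rule that a list ends at the next argument starting with "-" and a letter are all assumptions made for this example. The callback is invoked directly here, so the sketch runs without the module.

```perl
use strict;
use warnings;

my @Include;    # hypothetical destination for the list-valued option

# $args refers to the remaining command-line arguments; the callback
# removes whatever it consumes and returns 1 on success, 0 on failure.
sub collect_include {
    my ($opt, $args) = @_;
    my @values;

    # Assumption: an argument starting with '-' and a letter is the next option.
    push @values, shift @$args
        while @$args && $args->[0] !~ /^-[A-Za-z]/;

    unless (@values) {
        warn "$opt option requires at least one argument\n";
        return 0;
    }
    push @Include, @values;
    return 1;
}

# The table entry has the same shape as the -read_file entry shown earlier.
my @options = (['-include', 'call', undef, \&collect_include,
                "add one or more directories to the include list"]);

# Direct call, standing in for what GetOptions would do on seeing -include.
my @rest = qw(lib t/lib -verbose input.dat);
collect_include('-include', \@rest) or die "callback failed\n";

print "include: @Include\n";
print "rest:    @rest\n";
```

Because the callback stops at the first option-like argument, whatever follows the list stays in the argument array for normal processing.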