LW2(3)                User Contributed Perl Documentation                LW2(3)
LW2 - Perl HTTP library version 2.5
use LW2;
require 'LW2.pm';
Libwhisker is a Perl library useful for HTTP testing scripts. It contains a
pure-Perl reimplementation of functionality found in the "LWP", "URI",
"Digest::MD5", "Digest::MD4", "Data::Dumper", "Authen::NTLM", "HTML::Parser",
"HTML::FormParser", "CGI::Upload", "MIME::Base64", and "Getopt::Std" modules.
Libwhisker is designed to be portable (a single perl file), fast
(general benchmarks show libwhisker is faster than LWP), and flexible (great
care was taken to ensure the library does exactly what you want to do, even
if it means breaking the protocol).
The following are the functions contained in Libwhisker:
- auth_brute_force
- Params: $auth_method, \%req,
$user, \@passwords [,
$domain, $fail_code ]
Return: $first_valid_password, undef
if error/none found
Perform an HTTP authentication brute force against a server
(host and URI defined in %req). It will try
every password in the password array for the given user. The first
password (in conjunction with the given user) that doesn't return HTTP
401 is returned (and the brute force is stopped at that point). You
should retry the request with the given password and double-check that
you got a useful HTTP return code that indicates successful
authentication (200, 302), and not something a bit more abnormal (407,
500, etc). $domain is optional, and is only used
for NTLM auth.
Note: set up any proxy settings and proxy auth in
%req before calling this function.
You can brute-force proxy authentication by setting up the
target proxy as proxy_host and proxy_port in
%req, using an arbitrary host and uri
(preferably one that is reachable upon successful proxy authorization),
and setting the $fail_code to 407. The
$auth_method passed to this function should be a
proxy-based one ('proxy-basic', 'proxy-ntlm', etc).
If your server returns something other than 401 upon auth
failure, then set $fail_code to whatever is
returned (and it needs to be something *different* from what is received
on auth success, or this function won't be able to tell the
difference).
- auth_unset
- Params: \%req
Return: nothing (modifies %req)
Modifies %req to disable all
authentication (regular and proxy).
Note: it only removes the values set by auth_set().
Manually-defined [Proxy-]Authorization headers will also be deleted (but
you shouldn't be using the auth_* functions if you're manually handling
your own auth...)
- auth_set
- Params: $auth_method, \%req,
$user, $password [,
$domain]
Return: nothing (modifies %req)
Modifies %req to use the indicated
authentication info.
Auth_method can be: 'basic', 'proxy-basic', 'ntlm',
'proxy-ntlm'.
Note: this function may not necessarily set any headers after
being called. Also, proxy-ntlm with SSL is not currently supported.
- cookie_new_jar
- Params: none
Return: $jar
Create a new cookie jar, for use with the other functions.
Even though the jar is technically just a hash, you should still use
this function in order to be future-compatible (should the jar format
change).
- cookie_read
- Params: $jar, \%response [, \%request,
$reject ]
Return: $num_of_cookies_read
Read in cookies from a %response
hash, and put them in $jar.
Notice: cookie_read uses internal magic done by
http_do_request in order to read cookies regardless of 'Set-Cookie[2]'
header appearance.
If the optional %request hash is
supplied, then it will be used to calculate default host and path
values, in case the cookie doesn't specify them explicitly. If
$reject is set to 1, then the
%request hash values are used to calculate and
reject cookies which are not appropriate for the path and domains of the
given request.
- cookie_parse
- Params: $jar, $cookie [,
$default_domain,
$default_path, $reject ]
Return: nothing
Parses the cookie into the various parts and then sets the
appropriate values in the cookie $jar. If the
cookie value is blank, it will delete it from the
$jar. See the 'docs/cookies.txt' document for a
full explanation of how Libwhisker parses cookies and what RFC aspects
are supported.
The optional $default_domain value is
taken literally. Values with no leading dot (e.g. 'www.host.com') are
considered to be strict hostnames and will only match the identical
hostname. Values with leading dots (e.g. '.host.com') are treated as
sub-domain matches for a single domain level. If the cookie does not
indicate a domain, and a $default_domain is not
provided, then the cookie is considered to match all domains/hosts.
The optional $default_path is used
when the cookie does not specify a path.
$default_path must be absolute (start with '/'),
or it will be ignored. If the cookie does not specify a path, and
$default_path is not provided, then the default
value '/' will be used.
Set $reject to 1 if you wish to reject
cookies based upon the provided $default_domain
and $default_path. Note that
$default_domain and
$default_path must be specified for
$reject to actually do something meaningful.
- cookie_write
- Params: $jar, \%request,
$override
Return: nothing
Goes through the given $jar and sets
the Cookie header in %request for the cookies that match its
domain and path. If $override is true, then the
secure, domain and path restrictions of the cookies are ignored and all
cookies are essentially included.
Notice: cookie expiration is currently not implemented. URL
restriction comparison is also case-insensitive.
- cookie_get
- Params: $jar, $name
Return: @elements
Fetch the named cookie from the $jar,
and return the components. The returned items will be an array in the
following order:
value, domain, path, expire, secure
value = cookie value, should always be a non-empty string
domain = domain root for cookie, can be undefined
path = URL path for cookie, should always be a non-empty string
expire = undefined (deprecated, but exists for backwards-compatibility)
secure = whether or not the cookie is limited to HTTPS; value is 0 or 1
- cookie_get_names
- Params: $jar
Return: @names
Fetch all the cookie names from the jar, which then lets you
cookie_get() them individually.
- cookie_get_valid_names
- Params: $jar, $domain,
$url, $ssl
Return: @names
Fetch all the cookie names from the jar which are valid for
the given $domain, $url,
and $ssl values. $domain
should be string scalar of the target host domain ('www.example.com',
etc.). $url should be the absolute URL for the
page ('/index.html', '/cgi-bin/foo.cgi', etc.).
$ssl should be 0 for non-secure cookies, or 1
for all (secure and normal) cookies. The return value is an array of
names compatible with cookie_get().
- cookie_set
- Params: $jar, $name,
$value, $domain,
$path, $expire,
$secure
Return: nothing
Set the named cookie with the provided values into the
$jar. $name is required
to be a non-empty string. $value is required,
and will delete the named cookie from the $jar
if it is an empty string. $domain and
$path can be strings or undefined.
$expire is ignored (but exists for
backwards-compatibility). $secure should be the
numeric value of 0 or 1.
- crawl_new
- Params: $START,
$MAX_DEPTH, \%request_hash [, \%tracking_hash ]
Return: $crawl_object
The crawl_new() function initializes a crawl object
(hash) to the default values, and then returns it for later use by
crawl(). $START is the starting URL (in
the form of 'http://www.host.com/url'), and MAX_DEPTH is the maximum
number of levels to crawl (the START URL counts as 1, so a value of 2
will crawl the START URL and all URLs found on that page). The
request_hash is a standard initialized request hash to be used for
requests; you should set any authentication information or headers in
this hash in order for the crawler to use them. The optional
tracking_hash lets you supply a hash for use in tracking URL results
(otherwise crawl_new() will allocate a new anon hash).
- crawl
- Params: $crawl_object [,
$START, $MAX_DEPTH ]
Return: $count [ undef on error ]
The heart of the crawl package. Will perform an HTTP crawl on
the specified HOST, starting at START URI, proceeding up to
MAX_DEPTH.
Crawl_object needs to be the variable returned by
crawl_new(). You can also indirectly call crawl() via the
crawl_object itself:
$crawl_object->{crawl}->($START,$MAX_DEPTH)
Returns the number of URLs actually crawled (not including
those skipped).
- dump
- Params: $name, \@array [,
$name, \%hash, $name,
\$scalar ]
Return: $code [ undef on error ]
The dump function will take the given
$name and data reference, and will create an
ASCII perl code representation suitable for eval'ing later to recreate
the same structure. $name is the name of the
variable that it will be saved as. Example:
$output = LW2::dump('request',\%request);
NOTE: dump() creates anonymous structures under the
name given. For example, if you dump the hash
%hin under the name 'hin', then when you eval
the dumped code you will need to use %$hin, since
$hin is now a *reference* to a hash.
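The reference behaviour noted above can be sketched without LW2 itself. The string in $code below is a hand-written stand-in with an assumed shape; it is not text produced by LW2::dump(), whose exact output format may differ.

```perl
use strict;
use warnings;

# Stand-in for dumped output: an assumed shape, not real LW2::dump() output.
my $code = q{$hin = { 'host' => 'localhost', 'port' => 80 };};

our $hin;    # the eval'd code assigns to this package variable
eval $code;
die "eval failed: $@" if $@;

# $hin now holds a reference, so dereference it to get the hash back.
my %restored = %$hin;
print "$restored{host}:$restored{port}\n";
```

The point is only that the restored data lives behind a reference and must be accessed as %$hin or $hin->{...}.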
- dump_writefile
- Params: $file, $name,
\@array [, $name, \%hash,
$name, \$scalar ]
Return: 0 if success; 1 if error
This calls dump() and saves the output to the specified
$file.
Note: LW does no checking on the validity of the file name,
its creation, or anything of the sort. Files are opened in overwrite
mode.
- encode_base64
- Params: $data [, $eol]
Return: $b64_encoded_data
This function does Base64 encoding. If the binary MIME::Base64
module is available, it will use that; otherwise, it falls back to an
internal perl version. The perl version carries the following
copyright:
Copyright 1995-1999 Gisle Aas <gisle@aas.no>
NOTE: the $eol parameter will be
inserted every 76 characters. This is used to format the data for output
on an 80-character-wide terminal.
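The effect of $eol can be seen with MIME::Base64 directly, which is the module this function uses when it is available. This sketch assumes LW2's encode_base64 wraps lines the same way; it does not call LW2.

```perl
use strict;
use warnings;
use MIME::Base64 ();

my $data    = 'A' x 100;                                     # 100 bytes -> 136 base64 characters
my $wrapped = MIME::Base64::encode_base64( $data, "\n" );    # $eol after every 76 characters
my $oneline = MIME::Base64::encode_base64( $data, '' );      # empty $eol: no line breaks

my @lines = split /\n/, $wrapped;
printf "%d lines, lengths: %s\n", scalar @lines, join( ',', map { length } @lines );
```

Passing an empty string as $eol is the usual choice when the result goes into an HTTP header, where embedded newlines would be unwelcome.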
- decode_base64
- Params: $data
Return: $b64_decoded_data
A perl implementation of base64 decoding. The perl code for
this function was actually taken from an older MIME::Base64 perl module,
and bears the following copyright:
Copyright 1995-1999 Gisle Aas <gisle@aas.no>
- encode_uri_hex
- Params: $data
Return: $result
This function encodes every character (except the / character)
with normal URL hex encoding.
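A minimal sketch of this kind of encoding follows. sketch_encode_uri_hex is a hypothetical stand-in, not the LW2 implementation, and the use of lowercase hex digits is an assumption.

```perl
use strict;
use warnings;

# Hypothetical stand-in, not LW2 code: hex-encode every character except '/'.
sub sketch_encode_uri_hex {
    my ($data) = @_;
    $data =~ s{([^/])}{ sprintf '%%%02x', ord $1 }ge;
    return $data;
}

print sketch_encode_uri_hex('/cgi-bin/a.pl'), "\n";
```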
- encode_uri_randomhex
- Params: $data
Return: $result
This function randomly encodes characters (except the /
character) with normal URL hex encoding.
- encode_uri_randomcase
- Params: $data
Return: $result
This function randomly changes the case of characters in the
string.
- encode_unicode
- Params: $data
Return: $result
This function converts a normal string into Windows unicode
format (non-overlong or anything fancy).
- decode_unicode
- Params: $unicode_string
Return: $decoded_string
This function attempts to decode a unicode (UTF-8) string by
converting it into a single-byte-character string. Overlong characters
are converted to their standard characters in place; non-overlong (aka
multi-byte) characters are substituted with 0xff; invalid encoding
characters are left as-is.
Note: this function is useful for dealing with the various
unicode exploits/vulnerabilities found in web servers; it is *not* good
for doing actual UTF-8 parsing, since characters over a single byte are
basically dropped/replaced with a placeholder.
- encode_anti_ids
- Params: \%request, $modes
Return: nothing
encode_anti_ids computes the proper anti-ids encoding/tricks
specified by $modes, and sets up
%hin in order to use those tricks. Valid modes
are (the mode numbers are the same as those found in whisker 1.4):
- 1 Encode some of the characters via normal URL encoding
- 2 Insert directory self-references (/./)
- 3 Premature URL ending (make it appear the request line is done)
- 4 Prepend a long random string in the form of
"/string/../URL"
- 5 Add a fake URL parameter
- 6 Use a tab instead of a space as a request spacer
- 7 Change the case of the URL (works against Windows and Novell)
- 8 Change normal separators ('/') to Windows version ('\')
- 9 Session splicing [NOTE: not currently available]
- A Use a carriage return (0x0d) as a request spacer
- B Use binary value 0x0b as a request spacer
You can set multiple modes by setting the string to contain all
the modes desired; i.e. $modes="146" will
use modes 1, 4, and 6.
- FORMS FUNCTIONS
- The goal is to parse the variable, human-readable HTML into concrete
structures usable by your program. The forms functions do a good job
of making these structures, but I will admit: they are not exactly simple,
and thus not a cinch to work with. But then again, representing something
as complex as an HTML form is not a simple thing either. I think the
results are acceptable for what's trying to be done. Anyways...
Forms are stored in perl hashes, with elements in the
following format:
$form{'element_name'}=@([ 'type', 'value', @params ])
Thus every element in the hash is an array of anonymous
arrays. The first array value contains the element type (which is
'select', 'textarea', 'button', or an 'input' value of the form
'input-text', 'input-hidden', 'input-radio', etc).
The second value is the value, if applicable (it could be
undef if no value was specified). Note that select elements will always
have an undef value--the actual values are in the subsequent options
elements.
The third value, if defined, is an anonymous array of
additional tag parameters found in the element (like
'onchange="blah"', 'size="20"',
'maxlength="40"', 'selected', etc).
The hash does contain one special element, which is stored
under a NULL character ("\0") key. This element is of
the format:
$form{"\0"}=['name', 'method', 'action', @parameters];
The element is an anonymous array that contains strings of the
form's name, method, and action (values can be undef), and a
@parameters array similar to that found in
normal elements (above).
Accessing individual values stored in the form hash becomes a
test of your perl referencing skills. Hint: to access the 'value' of the
third element named 'choices', you would need to do:
$form{'choices'}->[2]->[1];
The '[2]' is the third element (normal array starts with 0),
and the actual value is '[1]' (the type is '[0]', and the parameter
array is '[2]').
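The layout described above can be illustrated with a hand-built hash. This is a sketch of one reading of the description (each named entry is an array of [ type, value, \@params ] records); the field values are invented and nothing here is produced by forms_read().

```perl
use strict;
use warnings;

# Hand-built form hash following the layout described above.
my %form = (
    # Special entry under the "\0" key: name, method, action (no extra parameters here).
    "\0" => [ 'login', 'POST', '/login.cgi' ],

    # Each named entry is an array of [ type, value, \@params ] records.
    'user'    => [ [ 'input-text', undef, ['size="20"'] ] ],
    'choices' => [
        [ 'input-radio', 'red',   [] ],
        [ 'input-radio', 'green', ['checked'] ],
        [ 'input-radio', 'blue',  [] ],
    ],
);

my ( $name, $method, $action ) = @{ $form{"\0"} };
my $third_type  = $form{'choices'}->[2]->[0];    # type of the third 'choices' element
my $third_value = $form{'choices'}->[2]->[1];    # value of the third 'choices' element

print "$method $action\n";
print "third choice: $third_type = $third_value\n";
```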
- forms_read
- Params: \$html_data
Return: \@found_forms
This function parses the given
$html_data into libwhisker form hashes. It
returns a reference to an array of hash references to the found
forms.
- forms_write
- Params: \%form_hash
Return: $html_of_form [undef on
error]
This function will take the given
%form hash and compose a generic HTML
representation of it, formatted with tabs and newlines in order to make
it neat and tidy for printing.
Note: this function does *not* escape any special characters
that were embedded in the element values.
- html_find_tags
- Params: \$data, \&callback_function [,
$xml_flag, $funcref,
\%tag_map]
Return: nothing
html_find_tags parses a piece of HTML and 'extracts' all found
tags, passing the info to the given callback function. The callback
function must accept two parameters: the current tag (as a scalar), and
a hash ref of all the tag's elements. For example, the tag <a
href="/file"> will pass 'a' as the current tag, and a hash
reference which contains {'href'=>"/file"}.
The xml_flag, when set, causes the parser to do some extra
processing and checks to accommodate XML style tags such as <tag
foo="bar"/>.
The optional %tag_map is a hash of
lowercase tag names. If a tag map is supplied, then the parser will only
call the callback function if the tag name exists in the tag map.
The optional $funcref variable is
passed straight to the callback function, allowing you to pass flags or
references to more complex structures to your callback function.
- html_find_tags_rewrite
- Params: $position,
$length, $replacement
Return: nothing
html_find_tags_rewrite() is used to 'rewrite' an HTML
stream from within an html_find_tags() callback function. In
general, you can think of html_find_tags_rewrite working as:
substr(DATA, $position,
$length) =
$replacement
Where DATA is the current HTML string the html parser is
using. The reason you need to use this function and not substr()
is that a few internal parser pointers and counters need to be
adjusted to accommodate the changes.
If you want to remove a piece of the string, just set the
replacement to an empty string (''). If you wish to insert a string
instead of overwrite, just set $length to 0;
your string will be inserted at the indicated
$position.
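The three cases (overwrite, insert, remove) behave like plain substr() assignments. The sketch below uses substr() on an ordinary string only to show the position/length semantics; inside a real html_find_tags() callback you would call html_find_tags_rewrite() instead, for the reason given above.

```perl
use strict;
use warnings;

my $data = '<a href="/old">link</a>';

# Overwrite: replace the 4 characters at offset 9 ("/old") with "/new".
substr( $data, 9, 4 ) = '/new';
my $overwritten = $data;

# Insert: a length of 0 inserts at the position without removing anything.
substr( $data, 2, 0 ) = ' id="x"';
my $inserted = $data;

# Remove: an empty replacement deletes the 7-character span again.
substr( $data, 2, 7 ) = '';
my $removed = $data;

print "$overwritten\n$inserted\n$removed\n";
```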
- html_link_extractor
- Params: \$html_data
Return: @urls
The html_link_extractor() function uses the internal
crawl tests to extract all the HTML links from the given HTML data
stream.
Note: html_link_extractor() does not unique the
returned array of discovered links, nor does it attempt to remove
javascript links or make the links absolute. It just extracts every raw
link from the HTML stream and returns it. You'll have to do your own
post-processing.
- http_new_request
- Params: %parameters
Return: \%request_hash
This function basically 'objectifies' the creation of whisker
request hash objects. You would call it like:
$req = http_new_request( host=>'www.example.com', uri=>'/' )
where 'host' and 'uri' can be any number of {whisker} hash
control values (see http_init_request for default list).
- http_new_response
- Params: [none]
Return: \%response_hash
This function basically 'objectifies' the creation of whisker
response hash objects. You would call it like:
$resp = http_new_response()
- http_init_request
- Params: \%request_hash_to_initialize
Return: Nothing (modifies input hash)
Sets default values to the input hash for use. Sets the host
to 'localhost', port 80, request URI '/', using HTTP 1.1 with GET
method. The timeout is set to 10 seconds, no proxies are defined, and
all URI formatting is set to standard HTTP syntax. It also sets the
Connection (Keep-Alive) and User-Agent headers.
NOTICE!! It's important to use http_init_request before
calling http_do_request, or http_do_request might puke. Thus, a special
magic value is placed in the hash to let http_do_request know that the
hash has been properly initialized. If you really must 'roll your own'
and not use http_init_request before you call http_do_request, you will
at least need to set the MAGIC value (amongst other things).
- http_do_request
- Params: \%request, \%response [, \%configs]
Return: >=1 if error; 0 if no error (also modifies response
hash)
*THE* core function of libwhisker. http_do_request actually
performs the HTTP request, using the values submitted in
%request, and placing result values in
%response. This allows you to resubmit
%request in subsequent requests (%response is
automatically cleared upon execution). You can submit 'runtime' config
directives as %configs, which will be spliced
into $hin{whisker}->{} before anything else.
That means you can do:
LW2::http_do_request(\%req,\%resp,{'uri'=>'/cgi-bin/'});
This will set
$req{whisker}->{'uri'}='/cgi-bin/' before
execution, and provides a simple shortcut (note: it does modify
%req).
This function will also retry any requests that bomb out
during the transaction (but not during the connecting phase). This is
controlled by the {whisker}->{retry} value. Also note that the
returned error message in hout is the *last* error received. All retry
errors are put into {whisker}->{retry_errors}, which is an anonymous
array.
Also note that all NTLM auth logic is implemented in
http_do_request(). NTLM requires multiple requests in order to
work correctly, and so this function attempts to wrap that and make it
all transparent, so that the final end result is what's passed to the
application.
This function will return 0 on success, 1 on HTTP protocol
error, and 2 on non-recoverable network connection error (you can retry
error 1, but error 2 means that the server is totally unreachable and
there's no point in retrying).
- http_req2line
- Params: \%request, $uri_only_switch
Return: $request
http_req2line is used internally by http_do_request, and also
provides a convenient way to turn a %request
configuration into an actual HTTP request line. If
$uri_only_switch is set to 1, then the returned
$request will be the URI only
('/requested/page.html'), versus the entire HTTP request ('GET
/requested/page.html HTTP/1.0\n\n'). Also, if the
'full_request_override' whisker config variable is set in
%hin, then it will be returned instead of the
constructed URI.
- http_resp2line
- Params: \%response
Return: $response
http_resp2line provides a convenient way to turn a
%response hash back into the original HTTP
response line.
- http_fixup_request
- Params: $hash_ref
Return: Nothing
This function takes a %hin hash
reference and makes sure the proper headers exist (for example, it will
add the Host: header, calculate the Content-Length: header for POST
requests, etc). For standard requests (i.e. you want the request to be
HTTP RFC-compliant), you should call this function right before you call
http_do_request.
- http_reset
- Params: Nothing
Return: Nothing
The http_reset function will walk through the
%http_host_cache, closing all open sockets and
freeing SSL resources. It also clears out the host cache in case you
need to rerun everything fresh.
Note: if you just want to close a single connection, and you
have a copy of the %request hash you used, you
should use the http_close() function instead.
- ssl_is_available
- Params: Nothing
Return: $boolean [,
$lib_name, $version]
The ssl_is_available() function will inform you whether
SSL requests are allowed, which is dependent on whether the appropriate
SSL libraries are installed on the machine. In scalar context, the
function will return 1 or 0. In array context, the second element will
be the SSL library name that is currently being used by LW2, and the
third element will be the SSL library version number. Elements two and
three (name and version) will be undefined if called in array context
and no SSL libraries are available.
- http_read_headers
- Params: $stream, \%in, \%out
Return: $result_code,
$encoding, $length,
$connection
Read HTTP headers from the given stream, storing the results
in %out. On success,
$result_code will be 1 and
$encoding, $length, and
$connection will hold the values of the
Transfer-Encoding, Content-Length, and Connection headers, respectively.
If any of those headers is not present, the corresponding return value
will be undef. On an error, the $result_code will be 0
and $encoding will contain an error message.
This function can be used to parse both request and response
headers.
Note: if there are multiple Transfer-Encoding, Content-Length,
or Connection headers, then only the last header value is the one
returned by the function.
- http_read_body
- Params: $stream, \%in, \%out,
$encoding, $length
Return: 1 on success, 0 on error (and sets
$hout->{whisker}->{error})
Read the body from the given stream, placing it in
$out->{whisker}->{data}. Handles chunked
encoding. Can be used to read HTTP (POST) request or HTTP response
bodies. $encoding parameter should be lowercase
encoding type.
NOTE: $out->{whisker}->{data} is
erased/cleared when this function is called, leaving {data} to just
contain this particular HTTP body.
- http_construct_headers
- Params: \%in
Return: $data
This function assembles the headers in the given hash into a
data string.
- http_close
- Params: \%request
Return: nothing
This function will close any open streams for the given
request.
Note: in order for http_close() to find the right
connection, all original host/proxy/port parameters in
%request must be the exact same as when the
original request was made.
- http_do_request_timeout
- Params: \%request, \%response, $timeout
Return: $result
This function is identical to http_do_request(), except
that it wraps the entire request in a timeout wrapper.
$timeout is the number of seconds to allow for
the entire request to be completed.
Note: this function uses alarm() and signals, and thus
will only work on Unix-ish platforms. It should be safe to call on any
platform though.
- md5
- Params: $data
Return: $hex_md5_string
This function takes a data scalar, and composes a MD5 hash of
it, and returns it in a hex ascii string. It will use the fastest MD5
function available.
- md4
- Params: $data
Return: $hex_md4_string
This function takes a data scalar, and composes a MD4 hash of
it, and returns it in a hex ascii string. It will use the fastest MD4
function available.
- multipart_set
- Params: \%multi_hash, $param_name,
$param_value
Return: nothing
This function sets the named parameter to the given value
within the supplied multipart hash.
- multipart_get
- Params: \%multi_hash, $param_name
Return: $param_value, undef on
error
This function retrieves the value of the named parameter
from the supplied multipart hash. There is a special case where the
named parameter is actually a file--in which case the resulting value
will be "\0FILE". In general, all special values will be
prefixed with a NULL character. In order to get a file's info, use
multipart_getfile().
- multipart_setfile
- Params: \%multi_hash, $param_name,
$file_path [, $filename]
Return: undef on error, 1 on success
NOTE: this function does not actually add the contents of
$file_path into the
%multi_hash; instead, multipart_write()
inserts the content when generating the final request.
- multipart_getfile
- Params: \%multi_hash, $file_param_name
Return: $path,
$name ($path=undef on error)
multipart_getfile is used to retrieve information for a file
parameter contained in %multi_hash. To use this
you would most likely do:
($path,$fname)=LW2::multipart_getfile(\%multi,"param_name");
- multipart_boundary
- Params: \%multi_hash [, $new_boundary_name]
Return: $current_boundary_name
multipart_boundary is used to retrieve, and optionally set,
the multipart boundary used for the request.
NOTE: the function does no checking on the supplied boundary,
so if you want things to work make sure it's a legit boundary.
Libwhisker does *not* prefix it with any '---' characters.
- multipart_write
- Params: \%multi_hash, \%request
Return: 1 if successful, undef on error
multipart_write is used to parse and construct the multipart
data contained in %multi_hash, and place it
ready to go in the given whisker hash (%request) structure, to be sent
to the server.
NOTE: file contents are read into the final
%request, so it's possible for the hash to get
*very* large if you have (a) large file(s).
- multipart_read
- Params: \%multi_hash, \%hout_response [, $filepath
]
Return: 1 if successful, undef on error
multipart_read will parse the data contents of the supplied
%hout_response hash, by passing the appropriate
info to multipart_read_data(). Please see
multipart_read_data() for more info on parameters and
behaviour.
NOTE: this function will return an error if the given
%hout_response Content-Type is not set to
"multipart/form-data".
- multipart_read_data
- Params: \%multi_hash, \$data, $boundary [,
$filepath ]
Return: 1 if successful, undef on error
multipart_read_data parses the contents of the supplied data
using the given boundary and puts the values in the supplied
%multi_hash. Embedded files will *not* be saved
unless a $filepath is given, which should be a
directory suitable for writing out temporary files.
NOTE: currently application/octet-stream is the only
supported file encoding. All other file encodings will not be
parsed/saved.
- multipart_files_list
- Params: \%multi_hash
Return: @files
multipart_files_list returns an array of parameter names for
all the files that are contained in
%multi_hash.
- multipart_params_list
- Params: \%multi_hash
Return: @params
multipart_params_list returns an array of parameter names for
all the regular parameters (non-file) that are contained in
%multi_hash.
- ntlm_new
- Params: $username,
$password [, $domain,
$ntlm_only]
Return: $ntlm_object
Returns a reference to an array (otherwise known as the 'ntlm
object') which contains the various pieces of information specific to a user/pass
combo. If $ntlm_only is set to 1, then only the
NTLM hash (and not the LanMan hash) will be generated. This results in a
speed boost, and is typically fine for using against IIS servers.
The array contains the following items, in order: username,
password, domain, lmhash(password), ntlmhash(password)
- ntlm_decode_challenge
- Params: $challenge
Return: @challenge_parts
Splits the supplied challenge into the various parts. The
returned array contains elements in the following order:
unicode_domain, ident, packet_type, domain_len, domain_maxlen,
domain_offset, flags, challenge_token, reserved, empty, raw_data
- ntlm_client
- Params: $ntlm_obj [,
$server_challenge]
Return: $response
ntlm_client() is responsible for generating the
base64-encoded text you include in the HTTP Authorization header. If you
call ntlm_client() without a
$server_challenge, the function will return the
initial NTLM request packet (message packet #1). You send this to the
server, and take the server's response (message packet #2) and pass that
as $server_challenge, causing
ntlm_client() to generate the final response packet (message
packet #3).
Note: $server_challenge is expected to
be base64 encoded.
- get_page
- Params: $url [, \%request]
Return: $code,
$data ($code will be set to undef on error,
$data will contain error message)
This function will fetch the page at the given URL, and return
the HTTP response code and page contents. Use this in the form of:
($code,$html)=LW2::get_page("http://host.com/page.html")
The optional %request will be used if
supplied. This allows you to set headers and other parameters.
- get_page_hash
- Params: $url [, \%request]
Return: $hash_ref (undef on no
URL)
This function will fetch the page at the given URL, and return
the whisker HTTP response hash. The return code of the function is set
to $hash_ref->{whisker}->{get_page_hash},
and uses the http_do_request() return values.
Note: undef is returned if no URL is supplied.
- get_page_to_file
- Params: $url, $filepath [,
\%request]
Return: $code ($code will be set to
undef on error)
This function will fetch the page at the given URL, place the
resulting HTML in the file specified, and return the HTTP response code.
The optional %request hash sets the default
parameters to be used in the request.
NOTE: libwhisker does not do any file checking; libwhisker
will open the supplied filepath for writing, overwriting any
previously-existing files. Libwhisker does not differentiate between a
bad request, and a bad file open. If you're having troubles making this
function work, make sure that your $filepath is
legal and valid, and that you have appropriate write permissions to
create/overwrite that file.
- time_mktime
- Params: $seconds,
$minutes, $hours,
$day_of_month, $month,
$year_minus_1900
Return: $seconds [ -1 on error ]
Performs a general mktime calculation with the given time
components. Note that the input parameter values are expected to be in
the format output by localtime/gmtime. Namely,
$seconds is 0-60 (yes, there can be a leap
second value of 60 occasionally), $minutes is
0-59, $hours is 0-23,
$day_of_month is 1-31, $month is
0-11, and $year_minus_1900 is 70-137. This function is
limited in that it will not process dates prior to 1970 or after 2037
(that way 32-bit time_t overflow calculations aren't required).
Additional parameters passed to the function are ignored, so
it is safe to use the full localtime/gmtime output, such as:
$seconds = LW2::time_mktime( localtime( time ) );
Note: this function does not adjust for time zone, daylight
savings time, etc. You must do that yourself.
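The calculation described above can be sketched in a few lines. sketch_mktime is a hypothetical stand-in, not the LW2 code; it assumes the 1970-2037 limit stated above (years 70 to 137) and, like the real function, makes no time zone or DST adjustment.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the calculation described above (no TZ/DST handling).
sub sketch_mktime {
    my ( $sec, $min, $hour, $mday, $mon, $year ) = @_;    # extra values are ignored
    for ( $sec, $min, $hour, $mday, $mon, $year ) {
        return -1 if !defined $_ || $_ < 0;
    }
    return -1
      if $sec > 60 || $min > 59 || $hour > 23 || $mday < 1 || $mday > 31
      || $mon > 11 || $year < 70 || $year > 137;

    # Days elapsed before the first of each month in a non-leap year.
    my @days_before = ( 0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334 );
    my $days = ( $year - 70 ) * 365 + $days_before[$mon] + ( $mday - 1 );

    # Leap days from completed years since 1970. Every fourth year is a leap
    # year across 1970-2037, so no century rule is needed.
    $days += int( ( $year - 69 ) / 4 );

    # Add this year's leap day once February has passed.
    $days++ if $year % 4 == 0 && $mon > 1;

    return ( ( $days * 24 + $hour ) * 60 + $min ) * 60 + $sec;
}

my $now = time;
printf "time=%d sketch=%d\n", $now, sketch_mktime( gmtime($now) );
```

Feeding it gmtime() output round-trips the epoch value; feeding it localtime() output gives a value shifted by the local UTC offset, which is what time_gmtolocal relies on.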
- time_gmtolocal
- Params: $seconds_gmt
Return: $seconds_local_timezone
Takes a seconds value in UTC/GMT time and adjusts it to
reflect the current timezone. This function is slightly expensive; it
takes the gmtime() and localtime() representations of the
current time, calculates the delta difference by turning them back into
seconds via time_mktime, and then applies this delta difference to
$seconds_gmt.
Note that if you give this function a time and subtract the
original time from the return value, you will get the delta value. At
that point, you can just apply the delta directly and skip calling this
function, which avoids repeating the gmtime/localtime/time_mktime work
on every call. However, this will cause
problems if you have a long running program which crosses daylight
saving time boundaries, as the DST adjustment will not be accounted for
unless you recalculate the new delta.
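The delta-caching idea can be sketched as follows. This is an assumption-laden illustration, not libwhisker's code: sketch_gmtolocal is a hypothetical stand-in that uses the core Time::Local module in place of time_mktime, and the timestamps are made up.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Hypothetical stand-in for LW2::time_gmtolocal (illustration only).
# It re-encodes the local broken-down time as if it were UTC; the
# difference from the real epoch value is the local offset in seconds.
sub sketch_gmtolocal {
    my ($seconds_gmt) = @_;
    my $now   = time;
    my $delta = timegm( ( localtime($now) )[ 0 .. 5 ] ) - $now;
    return $seconds_gmt + $delta;
}

# Pay for the conversion once and remember the delta...
my $now   = time;
my $delta = sketch_gmtolocal($now) - $now;

# ...then adjust many timestamps with plain addition.
my @stamps_gmt   = ( 0, 86_400, $now );
my @stamps_local = map { $_ + $delta } @stamps_gmt;
```

A long-running program would need to recompute $delta periodically, since the cached value goes stale when a DST transition occurs.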
- uri_split
- Params: $uri_string [, \%request_hash]
Return: @uri_parts
Return an array of the following values, in order: uri,
protocol, host, port, params, frag, user, password. Values not defined
are given an undef value. If a %request hash is
passed in, then uri_split() will also set the appropriate values
in the hash.
Note: uri_split() will only set the
%request hash if the protocol is HTTP or
HTTPS!
- uri_join
- Params: @vals
Return: $url
Takes the @vals array output from
uri_split, and returns a single scalar/string with them joined
again, in the form of:
protocol://user:pass@host:port/uri?params#frag
- uri_absolute
- Params: $uri, $base_uri [,
$normalize_flag ]
Return: $absolute_uri
Double checks that the given $uri is
in absolute form (that is, "http://host/file"), and if not
(it's in the form "/file"), then it will resolve it against the given
$base_uri to make it absolute. This provides
compatibility similar to that found in the URI subpackage.
If $normalize_flag is set to 1, then
the output will be passed through uri_normalize before being
returned.
- uri_normalize
- Params: $uri [,
$fix_windows_slashes ]
Return: $normalized_uri [ undef on
error ]
Takes the given $uri and does any /./
and /../ dereferencing in order to come up with the correct absolute
URL. If the $fix_windows_slashes parameter is
set to 1, all \ (back slashes) will be converted to / (forward
slashes).
Non-http/https URIs return an error.
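A minimal sketch of the /./ and /../ dereferencing described above, assuming the conventional dot-segment rules. sketch_normalize is a hypothetical stand-in written in core Perl; the library's own handling of edge cases (empty paths, query strings, trailing dot segments) may differ.

```perl
use strict;
use warnings;

# Hypothetical stand-in for LW2::uri_normalize (illustration only).
sub sketch_normalize {
    my ( $uri, $fix_windows_slashes ) = @_;
    return unless defined $uri;
    $uri =~ tr{\\}{/} if $fix_windows_slashes;

    # Only http/https URIs are handled; anything else yields undef.
    # $rest keeps any query string or fragment untouched.
    my ( $prefix, $path, $rest ) =
      $uri =~ m{^(https?://[^/?#]+)([^?#]*)(.*)$}is
      or return;

    my @segments = split m{/}, $path, -1;
    my @out;
    for my $i ( 0 .. $#segments ) {
        my $seg  = $segments[$i];
        my $last = $i == $#segments;
        if ( $seg eq '.' ) {
            push @out, '' if $last;     # "/a/." becomes "/a/"
        }
        elsif ( $seg eq '..' ) {
            pop @out if @out > 1;       # never climb above the root
            push @out, '' if $last;     # "/a/b/.." becomes "/a/"
        }
        else {
            push @out, $seg;
        }
    }
    return $prefix . join( '/', @out ) . $rest;
}

my $clean = sketch_normalize('http://host/a/./b/../c');
```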
- uri_get_dir
- Params: $uri
Return: $uri_directory
Will take a URI and return the directory base of it, e.g.
/rfp/page.php will return /rfp/.
- uri_strip_path_parameters
- Params: $uri [, \%param_hash]
Return: $stripped_uri
This function removes all URI path parameters of the form
/blah1;foo=bar/blah2;baz
and returns the stripped URI ('/blah1/blah2'). If the optional
parameter hash reference is provided, the stripped parameters are saved
in the form of 'blah1'=>'foo=bar', 'blah2'=>'baz'.
Note: only the last value of a duplicate name is saved into
the param_hash, if provided. So a $uri of
'/foo;A/foo;B/' will result in a single hash entry of 'foo'=>'B'.
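The stripping and the last-value-wins behavior can be illustrated with a short stand-in. sketch_strip_path_parameters is a hypothetical name and the code is only a sketch of the documented behavior in core Perl, not the library's implementation.

```perl
use strict;
use warnings;

# Hypothetical stand-in for LW2::uri_strip_path_parameters (illustration only).
sub sketch_strip_path_parameters {
    my ( $uri, $param_hash ) = @_;
    my @out;
    for my $segment ( split m{/}, $uri, -1 ) {
        # Everything after the first ';' in a segment is its parameter.
        my ( $name, $param ) = $segment =~ /^([^;]*)(?:;(.*))?$/s;
        push @out, $name;
        $param_hash->{$name} = $param
          if defined $param && ref $param_hash eq 'HASH';   # later duplicates overwrite
    }
    return join '/', @out;
}

my %params;
my $stripped = sketch_strip_path_parameters( '/blah1;foo=bar/blah2;baz', \%params );

my %dupes;
sketch_strip_path_parameters( '/foo;A/foo;B/', \%dupes );   # only 'B' survives
```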
- uri_parse_parameters
- Params: $parameter_string [,
$decode, $multi_flag ]
Return: \%parameter_hash
This function takes a string in the form of:
foo=1&bar=2&baz=3&foo=4
And parses it into a hash. In the above example, the element
'foo' has two values (1 and 4). If $multi_flag
is set to 1, then the 'foo' hash entry will hold an anonymous array of
both values. Otherwise, the default is to just contain the last value
(in this case, '4').
If $decode is set to 1, then normal
hex decoding is done on the characters, where needed (both the name and
value are decoded).
Note: if a URL parameter name appears without a value, then
the value will be set to undef. E.g. for the string
"foo=1&bar&baz=2", the 'bar' hash element will have an
undef value.
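The effect of $decode and $multi_flag can be seen in a small stand-in. sketch_parse_parameters is a hypothetical name; the sketch follows the description above in core Perl and assumes that "hex decoding" means %XX sequences only. It is not the library's code.

```perl
use strict;
use warnings;

# Hypothetical stand-in for LW2::uri_parse_parameters (illustration only).
sub sketch_parse_parameters {
    my ( $string, $decode, $multi_flag ) = @_;
    my %parsed;
    for my $pair ( split /&/, $string ) {
        my ( $name, $value ) = split /=/, $pair, 2;   # $value is undef for a bare name
        next unless defined $name && length $name;
        if ($decode) {
            for ( $name, $value ) {
                next unless defined $_;
                s/%([0-9a-fA-F]{2})/chr(hex($1))/ge;  # %XX to character
            }
        }
        if ( $multi_flag && exists $parsed{$name} ) {
            # Promote to an anonymous array on the second occurrence.
            $parsed{$name} = [ $parsed{$name} ] unless ref $parsed{$name};
            push @{ $parsed{$name} }, $value;
        }
        else {
            $parsed{$name} = $value;                  # default: last value wins
        }
    }
    return \%parsed;
}

my $last_wins = sketch_parse_parameters('foo=1&bar=2&baz=3&foo=4');
my $multi     = sketch_parse_parameters( 'foo=1&bar=2&baz=3&foo=4', 0, 1 );
my $bare      = sketch_parse_parameters('foo=1&bar&baz=2');
```

With $multi_flag set, callers should expect a value to be either a plain scalar or an array reference and check with ref() before using it.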
- uri_escape
- Params: $data
Return: $encoded_data
This function encodes the given $data
so it is safe to be used in URIs.
- uri_unescape
- Params: $encoded_data
Return: $data
This function decodes the given $encoded_data
out of URI format.
- utils_recperm
- Params: $uri, $depth,
\@dir_parts, \@valid, \&func, \%track, \%arrays, \&cfunc
Return: nothing
This is a special function which is used to
recursively permute through a given directory listing. This is really
only used by whisker, in order to traverse down directories, testing
them as it goes. See whisker 2.0 for exact usage examples.
- utils_array_shuffle
- Params: \@array
Return: nothing
This function will randomize the order of the elements in the
given array.
- utils_randstr
- Params: [ $size, $chars ]
Return: $random_string
This function generates a random string between 10 and 20
characters long, or of $size if specified. If
$chars is specified, then the random function
picks characters from the supplied string. For example, to have a random
string of 10 characters, composed of only the characters 'abcdef', then
you would run:
utils_randstr(10,'abcdef');
The default character string is alphanumeric.
- utils_port_open
- Params: $host, $port
Return: $result
Quick function to attempt to make a connection to the given
host and port. If a connection is successfully made, the function
returns true (1). Otherwise it returns false (0).
Note: this uses standard TCP connections, so it is extremely
slow and is not recommended for use in port-scanning type
applications.
- utils_lowercase_keys
- Params: \%hash
Return: $number_changed
Will lowercase all the header names (but not values) of the
given hash.
- utils_find_lowercase_key
- Params: \%hash, $key
Return: $value, undef on error or if the key does not
exist
Searches the given hash for the $key
(regardless of case), and returns the value. If the return value is
placed into an array, then it will dereference any multi-value references
and return an array of all values.
WARNING! In scalar context, $value can
either be a single-value scalar or an array reference for multiple
scalar values. That means you either need to check the return value and
act appropriately, or use an array context (even if you only want a
single value). This is very important, even if you know there are no
multi-value hash keys. This function may still return an array of
multiple values even if all hash keys are single value, since
lowercasing the keys could result in multiple keys matching. For
example, a hash with the values { 'Foo'=>'a', 'fOo'=>'b' }
technically has two keys with the lowercase name 'foo', and so this
function will either return an array or array reference with both 'a'
and 'b'.
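The context-dependent return described in the warning can be handled defensively as sketched below. sketch_find_lowercase_key is a hypothetical stand-in that mimics the documented behavior in core Perl (it sorts keys only to make the example deterministic); it is not the library's code.

```perl
use strict;
use warnings;

# Hypothetical stand-in for LW2::utils_find_lowercase_key (illustration only).
sub sketch_find_lowercase_key {
    my ( $hash, $key ) = @_;
    return unless ref $hash eq 'HASH' && defined $key;

    my @values;
    for my $k ( sort keys %$hash ) {
        next unless lc($k) eq lc($key);
        my $v = $hash->{$k};
        push @values, ( ref($v) eq 'ARRAY' ? @$v : $v );   # flatten multi-values
    }
    return unless @values;
    return @values if wantarray;                           # list context: all values
    return @values == 1 ? $values[0] : \@values;           # scalar: value or array ref
}

my %headers = ( 'Foo' => 'a', 'fOo' => 'b' );

# List context always yields a flat list.
my @all = sketch_find_lowercase_key( \%headers, 'foo' );

# Scalar context may yield an array reference, so check before use.
my $value = sketch_find_lowercase_key( \%headers, 'foo' );
my @safe  = ref($value) eq 'ARRAY' ? @$value : ($value);
```

Calling in list context and taking the first element is usually the simpler choice when only one value is wanted.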
- utils_find_key
- Params: \%hash, $key
Return: $value, undef on error or if the key does not
exist
Searches the given hash for the $key
(case-sensitive), and returns the value. If the return value is placed
into an array, then it will dereference any multi-value references and
return an array of all values.
- utils_delete_lowercase_key
- Params: \%hash, $key
Return: $number_found
Searches the given hash for the $key
(regardless of case), and deletes the key out of the hash if found. The
function returns the number of keys found and deleted (since multiple
keys can exist under the names 'Key', 'key', 'keY', 'KEY', etc.).
- utils_getline
- Params: \$data [, $resetpos ]
Return: $line (undef if no more
data)
Fetches the next \n terminated line from the given data. Use
the optional $resetpos to reset the internal
position pointer. Does *NOT* return trailing \n.
- utils_getline_crlf
- Params: \$data [, $resetpos ]
Return: $line (undef if no more
data)
Fetches the next \r\n terminated line from the given data. Use
the optional $resetpos to reset the internal
position pointer. Does *NOT* return trailing \r\n.
- utils_save_page
- Params: $file, \%response
Return: 0 on success, 1 on error
Saves the data portion of the given whisker
%response hash to the indicated file. Can
technically save the data portion of a %request
hash too. A file is not written if there is no data.
Note: LW does not do any special file checking; files are
opened in overwrite mode.
- utils_getopts
- Params: $opt_str, \%opt_results
Return: 0 on success, 1 on error
This function is a general implementation of Getopt::Std. It
will parse @ARGV, looking for the options
specified in $opt_str, and will put the results
in %opt_results. Behavior/parameter values are
similar to Getopt::Std's getopts().
Note: this function does *not* support long options
(--option), option grouping (-opq), or options with immediate values
(-ovalue). If an option is indicated as having a value, it will take the
next argument regardless.
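Since the behavior is described as similar to Getopt::Std's getopts(), the core module can serve as a reference point for the option string format. The sketch below calls Getopt::Std itself, not LW2, and the option letters and arguments are made up. Two documented differences to keep in mind: utils_getopts returns 0 on success (Getopt::Std returns a true value), and it does not accept grouped options or attached values, so each option is written separately here.

```perl
use strict;
use warnings;
use Getopt::Std qw(getopts);

# Simulated command line: a flag, an option with a value, then a target.
@ARGV = ( '-v', '-o', 'out.txt', 'target' );

# 'v' is a simple switch; 'o:' (trailing colon) takes the next argument.
my %opts;
my $ok = getopts( 'vo:', \%opts );

# After parsing, recognized options are removed from @ARGV.
my @remaining = @ARGV;
```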
- utils_text_wrapper
- Params: $long_text_string [,
$crlf, $width ]
Return: $formatted_text_string
This is a simple function used to format a long line of text
for display on a typical limited-character screen, such as a unix shell
console.
$crlf defaults to "\n", and
$width defaults to 76.
- utils_bruteurl
- Params: \%req, $pre,
$post, \@values_in, \@values_out
Return: Nothing (adds to @values_out)
Bruteurl will perform a brute force against the host/server
specified in %req. It makes one
request per entry in @values_in, taking the value and
setting $req{'whisker'}->{'uri'} =
$pre.value.$post. Any URI responding with an
HTTP 200 or 403 response is pushed into @values_out. An
example of this would be to brute force usernames, putting a list of
common usernames in @values_in, setting
$pre='/~' and
$post='/'.
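The loop described above can be sketched without touching the network. In this illustration sketch_bruteurl is a hypothetical stand-in, and the extra $fetch callback (not part of the documented parameter list) replaces the real HTTP request so the example can run offline; the response codes are invented.

```perl
use strict;
use warnings;

# Hypothetical stand-in for LW2::utils_bruteurl (illustration only).
# $fetch is an assumed callback that takes the request hash and returns
# an HTTP status code.
sub sketch_bruteurl {
    my ( $req, $pre, $post, $values_in, $values_out, $fetch ) = @_;
    for my $value (@$values_in) {
        $req->{whisker}->{uri} = $pre . $value . $post;
        my $code = $fetch->($req);
        push @$values_out, $req->{whisker}->{uri}
          if $code == 200 || $code == 403;    # both suggest the URI exists
    }
    return;
}

# Fake server: two home directories exist, everything else is a 404.
my %known      = ( '/~root/' => 403, '/~www/' => 200 );
my $fake_fetch = sub {
    my ($r) = @_;
    return $known{ $r->{whisker}->{uri} } || 404;
};

my %req = ( whisker => {} );
my @found;
sketch_bruteurl( \%req, '/~', '/', [qw(root guest www)], \@found, $fake_fetch );
```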
- utils_join_tag
- Params: $tag_name, \%attributes
Return: $tag_string [undef on
error]
This function takes the $tag_name
(like 'A') and a hash full of attributes (like {href=>'http://foo/'})
and returns the constructed HTML tag string (<A
href="http://foo/">).
- utils_request_clone
- Params: \%from_request, \%to_request
Return: 1 on success, 0 on error
This function takes the connection/request-specific values
from the given from_request hash, and copies them to the to_request
hash.
- utils_request_fingerprint
- Params: \%request [, $hash ]
Return: $fingerprint [undef on
error]
This function constructs a 'fingerprint' of the given request
by using a cryptographic hashing function on the constructed original
HTTP request.
Note: $hash can be 'md5' (default) or
'md4'.
- utils_flatten_lwhash
- Params: \%lwhash
Return: $flat_version [undef on
error]
This function takes a %request or
%response libwhisker hash, and creates an
approximate flat data string of the original request/response (i.e.
before it was parsed into components and placed into the libwhisker
hash).
- utils_carp
- Params: [ $package_name ]
Return: nothing
This function acts like Carp's carp function. It warns with
the file and line number of the user's code which causes a problem. It
traces up the call stack and reports the first function that is not in
the LW2 or optional $package_name
package.
- utils_croak
- Params: [ $package_name ]
Return: nothing
This function acts like Carp's croak function. It dies with
the file and line number of the user's code which causes a problem. It
traces up the call stack and reports the first function that is not in
the LW2 or optional $package_name
package.
Copyright 2009 Jeff Forristal