Finance::QuoteHist::Generic(3) | User Contributed Perl Documentation | Finance::QuoteHist::Generic(3)
Finance::QuoteHist::Generic - Base class for retrieving historical stock quotes.
package Finance::QuoteHist::MyFavoriteSite;
use strict;
use vars qw(@ISA);
use Finance::QuoteHist::Generic;
@ISA = qw(Finance::QuoteHist::Generic);
sub url_maker {
    # This method returns a code reference for a routine that, upon
    # repeated invocation, will provide however many URLs are necessary
    # to fully obtain the historical data for a given target mode and
    # parsing mode.
}
This is the base class for retrieving historical stock quotes. It is built
around LWP::UserAgent. Page results are currently parsed as either CSV or HTML
tables.
In order to actually retrieve historical stock quotes, this class
should be subclassed and tailored to a particular web site. In particular,
the "url_maker()" factory method should be overridden. It returns a code
reference to a routine that yields however many URLs are necessary to
retrieve the data for a list of symbols within the given date range, for a
particular target (quotes, dividends, or splits). Different sites have
different formats and different limitations on how many quotes are returned
for each query. See Finance::QuoteHist::Yahoo for an example of how to do
this.
For more complicated sites, such as Yahoo, overriding additional
methods might be necessary for dealing with things such as splits and
dividends.
- new()
- Returns a new Finance::QuoteHist::Generic object. Valid attributes
are:
- start_date
- end_date
- Specify the date range from which you would like historical quotes. These
dates get parsed by the "ParseDate()"
method in Date::Manip, so see Date::Manip(3) for more information
on valid date strings. They are quite flexible, and include such strings
as '1 year ago'. Date boundaries can also be dynamically set with methods
of the same name. The absence of a start date means go to the beginning of
the history. The absence of an end date means go up to the most recent
historical date. The absence of both means grab everything.
- symbols
- Indicates which ticker symbols to include in the search for historical
quotes. Passed either as a string (for single ticker) or an array ref for
multiple tickers.
- granularity
- Returns rows at 'daily', 'weekly', or 'monthly' levels of granularity.
Defaults to 'daily'.
- attempts
- Sets how persistently the module tries to retrieve the quotes. There are
two places this will manifest. First, if there are what appear to be
network errors, this many network connections are attempted for that URL.
Secondly, for quotes only, if pages were successfully retrieved, but they
contained no quotes, this number of attempts are made to retrieve a
document with data. Sometimes sites will report a temporary internal error
via HTML, and if it is truly transitory this will usually get around it.
The default is 3.
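The retry behavior described for attempts can be pictured with a small stand-alone sketch. This is not the module's own code: fetch_with_attempts() and the simulated fetcher are hypothetical names used only to illustrate "try up to N times, stop at the first usable result".

```perl
use strict;
use warnings;

# Hypothetical helper: call $fetch up to $attempts times and return the
# first defined result, or undef if every attempt fails.
sub fetch_with_attempts {
    my ($fetch, $attempts) = @_;
    for my $try (1 .. $attempts) {
        my $result = $fetch->($try);
        return $result if defined $result;
    }
    return;
}

# Simulated fetcher that fails twice before returning a document.
my $calls = 0;
my $flaky = sub {
    $calls++;
    return $calls < 3 ? undef : "Date,Open,High,Low,Close,Volume\n";
};

my $body = fetch_with_attempts($flaky, 3);
print defined $body ? "succeeded after $calls attempts\n" : "gave up\n";
```

With the default of 3 attempts, a source that fails twice in a row still produces a result on the third try.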
- lineup
- Passed as an array reference (or scalar for single class), this list
indicates which Finance::QuoteHist::Generic subclasses should be invoked
in the event this class fails in its attempt to retrieve historical
quotes. In the event of failure, the first class in this list is invoked
with the same parameters as the original class, and the remaining classes
are passed as the lineup to the new class. This sets up a daisy chain of
redundancy in the event a particular site is hosed. See
Finance::QuoteHist(3) to see an example of how this is done in a
top level invocation of these modules. This list is empty by default.
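The daisy chain can be sketched without any network access. The class names and the %fetchers table below are hypothetical stand-ins for site-specific subclasses; the point is only the hand-off: on failure, the first entry of the lineup is tried and the rest of the lineup is passed along.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for site-specific fetchers. Each returns an
# array reference of rows on success or undef on failure.
my %fetchers = (
    'My::SiteA' => sub { return },                                   # simulated outage
    'My::SiteB' => sub { return [ [ 'IBM', '2020/01/02', 135.42 ] ] }, # made-up row
);

# Try the given class; on failure, hand the remaining lineup to the next.
sub fetch_with_lineup {
    my ($class, @lineup) = @_;
    my $rows = $fetchers{$class}->();
    return ($class, $rows) if $rows;
    return unless @lineup;               # lineup exhausted
    return fetch_with_lineup(@lineup);
}

my ($source, $rows) = fetch_with_lineup('My::SiteA', 'My::SiteB');
print "rows came from $source\n";
```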
- quote_precision
- Sets the number of decimal places to which quote values are rounded. This
might be of particular significance if auto-adjustment is taking place
(which currently happens only under particular circumstances; see
Finance::QuoteHist::Yahoo). Setting this to 0 disables rounding, returning
the quote values as they appear on the sites (assuming no auto-adjustment
has taken place). The default is 4.
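As an illustration of the two cases, here is a minimal sketch that assumes sprintf-style rounding. The helper name round_quote() is hypothetical, and the module's own rounding routine may differ in detail.

```perl
use strict;
use warnings;

# Hypothetical helper: round to $precision decimal places, or return the
# value untouched when $precision is 0 (rounding disabled).
sub round_quote {
    my ($value, $precision) = @_;
    return $value unless $precision;
    return sprintf '%.*f', $precision, $value;
}

my $rounded   = round_quote('12.345678', 4);   # four decimal places
my $untouched = round_quote('12.345678', 0);   # returned as-is
print "$rounded $untouched\n";
```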
- row_filter
- When provided a subroutine reference, the routine is invoked with an array
reference for each raw row retrieved from the quote source. This allows
user-defined filtering or formatting for the items of each row. This
routine is invoked before any built-in routines are called on the row. The
array must be modified directly rather than returned as a value. Use
sparingly since the built-in filtering and normalizing routines do expect
each row to more or less look like historical stock data. Rearranging the
order of the columns in each row is contraindicated.
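A filter of this kind might look like the following sketch. The sample row is made up; the important detail is that the routine edits the array behind the reference and leaves the column order alone.

```perl
use strict;
use warnings;

# Filter that edits the row in place: trims whitespace and strips
# thousands separators. Column order is left untouched.
my $row_filter = sub {
    my $row = shift;                       # array reference for one raw row
    for my $cell (@$row) {                 # $cell aliases the real element
        next unless defined $cell;
        $cell =~ s/^\s+|\s+$//g;           # trim surrounding whitespace
        $cell =~ s/(?<=\d),(?=\d{3})//g;   # 3,148,600 becomes 3148600
    }
    return;                                # the return value is not used
};

# Made-up raw row for demonstration.
my @row = (' 2020-01-02 ', '135.00', '136.10', '134.77', '135.42', '3,148,600');
$row_filter->(\@row);
print join('|', @row), "\n";
```

Such a routine would be supplied as the row_filter attribute when the object is created.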
- env_proxy
- When set, instructs the underlying LWP::UserAgent to load proxy
configuration information from environment variables. See the
"ua()" method and LWP::UserAgent for
more information.
- auto_proxy
- Same as env_proxy, but tests first to see if
$ENV{http_proxy} is present.
- verbose
- When set, many status messages are printed to STDERR indicating
progression through URLs and lineup invocations.
- quiet
- When set, certain failure messages are suppressed from appearing on
STDERR. These messages would normally appear regardless of the setting of
the "verbose" flag.
The following methods are the primary user interface methods. Developers
wishing to make their own site-specific subclass of this module will find
information on overriding methods further below.
- quotes()
- Retrieves historical quotes for all provided symbols over the specified
date range. Depending on context, returns either a list of rows or an
array reference to the same list of rows.
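The context-sensitive return can be pictured with Perl's wantarray. The routine below is a hypothetical stand-in with made-up rows, not the module's implementation.

```perl
use strict;
use warnings;

# Hypothetical routine showing the list-or-reference convention.
sub rows_by_context {
    my @rows = ( [ 'IBM', '2020/01/02' ], [ 'IBM', '2020/01/03' ] );
    return wantarray ? @rows : \@rows;
}

my @list = rows_by_context();   # list context: the rows themselves
my $ref  = rows_by_context();   # scalar context: a reference to the rows
printf "%d rows, %d via reference\n", scalar @list, scalar @$ref;
```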
- dividends()
- splits()
- If available, retrieves dividend or split information for all provided
symbols over the specified date range. If there are no site-specific
subclassed modules in the lineup capable of getting dividends or
splits, the user will be notified on STDERR unless the quiet flag
was requested during object creation.
- start_date(date_string)
- end_date(date_string)
- Set the date boundaries of all queries. The date_string is
interpreted by the Date::Manip module. The absence of a start date means
retrieve back to the beginning of that ticker's history. The absence of an
end date means retrieve up to the latest date in the history.
- clear_cache()
- When results are gathered for a particular date range, whether they be via
direct query or incidental extraction, they are cached. This cache is
cleared by invoking this method directly, by resetting the boundary dates
of the query, or by changing the
"adjusted()" setting.
- quote_source(ticker_symbol)
- dividend_source(ticker_symbol)
- split_source(ticker_symbol)
- After query, these methods can be used to find out which particular
subclass in the lineup fulfilled the corresponding request for a
particular ticker symbol.
The following methods are the primary methods of interest for
developers wishing to make a site-specific subclass. The url_maker()
factory is typically all that is necessary.
- url_maker()
- Returns a subroutine reference that serves as an iterator for producing
URLs based on target and parse mode. Repeated calls to this routine
produce subsequent URLs in the sequence.
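The iterator idea can be sketched as a plain closure. Everything site-specific here is an assumption: the base URL, the query parameter names, and the convention that the iterator returns undef once the sequence is exhausted are invented for illustration, and make_url_iterator() is not part of this class.

```perl
use strict;
use warnings;

# Hypothetical factory: returns an iterator that yields one URL per symbol.
sub make_url_iterator {
    my (%args)  = @_;
    my @symbols = @{ $args{symbols} };
    return sub {
        my $symbol = shift @symbols;
        return unless defined $symbol;     # exhausted: no more URLs
        return sprintf '%s?s=%s&a=%s&b=%s',
            $args{base_url}, $symbol, $args{start}, $args{end};
    };
}

my $next_url = make_url_iterator(
    base_url => 'http://quotes.example.com/history.csv',  # made-up site
    symbols  => [qw(IBM MSFT)],
    start    => '2020-01-01',
    end      => '2020-12-31',
);

# Drain the iterator, as a caller would.
my @urls;
while (defined(my $url = $next_url->())) {
    push @urls, $url;
}
print "$_\n" for @urls;
```

A real subclass would build its URLs from the current symbols, date range, target mode and parse mode of the object.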
- extractors()
- For a particular target mode and parse mode, returns a hash containing
code references to extraction routines for the remaining targets. For
example, for target 'quote' in parse mode 'html' there might be extractor
routines for both 'dividend' and 'split'.
- ua()
- Accessor method for the LWP::UserAgent object used to process
HTTP::Request for individual URLs. This can be handy for such things as
configuring proxy access for the underlying user agent. Example:
# Manual configuration
$qh1->ua->proxy(['http'], 'http://proxy.sn.no:8001/');
# Load from environment variables
$qh2->ua->env_proxy();
See LWP::UserAgent for more information on the capabilities of
that module.
The following are potentially useful for calling within methods
overridden above:
- parse_mode($parse_mode)
- Set the current parsing mode. Currently parsers are available for html and
csv.
- target_mode($target_mode)
- Return the current target mode.
- dates($start_date, $end_date)
- Returns a list of business days between and including the provided
boundary dates. If no arguments are provided, start_date and
end_date default to the currently specified date range.
- labels(%parms)
- Used to override the default labels for a given target mode and parse
mode. Takes the following named parameters:
- target_mode
- Can currently be 'quote', 'dividend', or 'split'. Default is 'quote'.
- parse_mode
- Can currently be 'csv' or 'html'. The default is typically 'csv' but might
vary depending on the quote source.
- labels
- The following are the default labels. Text entries are converted to
case-insensitive regular expressions:
target_mode
-------------------------------------------------------
quote => ['date','open','high','low','close',qr(vol|shares)i]
dividend => ['date','div']
split => ['date','post','pre']
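To show what converting text entries to case-insensitive regular expressions amounts to, the sketch below maps the default quote labels onto a made-up header row. It is an approximation under stated assumptions, not the module's matching code.

```perl
use strict;
use warnings;

# Default quote labels from the table above: plain strings plus one regex.
my @labels = ('date', 'open', 'high', 'low', 'close', qr(vol|shares)i);

# Turn plain strings into case-insensitive patterns; leave regexes alone.
my @patterns = map { ref $_ eq 'Regexp' ? $_ : qr/\Q$_\E/i } @labels;

# Column headers as they might appear on a page (made-up sample).
my @headers = ('Date', 'Open', 'High', 'Low', 'Close', 'Volume');

# For each label, find the index of the first matching header.
my @columns;
for my $pat (@patterns) {
    my ($idx) = grep { $headers[$_] =~ $pat } 0 .. $#headers;
    push @columns, $idx;
}
print "@columns\n";
```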
The data returned from these modules is in no way guaranteed, nor are the
developers responsible in any way for how this data (or lack thereof) is used.
The interface is based on URLs and page layouts that might change at any time.
Even though these modules are designed to be adaptive under these
circumstances, they will at some point probably be unable to retrieve data
unless fixed or provided with new parameters. Furthermore, the data from these
web sites is usually not even guaranteed by the web sites themselves, and
oftentimes is acquired elsewhere. See the documentation for each site-specific
module for more information regarding the disclaimer for that site.
Above all, play nice.
Matthew P. Sisk, <sisk@mojotoad.com>
Copyright (c) 2000-2021 Matthew P. Sisk. All rights reserved. All wrongs
revenged. This program is free software; you can redistribute it and/or modify
it under the same terms as Perl itself.
Finance::QuoteHist(3), HTML::TableExtract(3),
Date::Manip(3), perlmodlib(1), perl(1).