Astro::SIMBAD::Client(3) | User Contributed Perl Documentation
Astro::SIMBAD::Client - Fetch astronomical data from SIMBAD 4.
use Astro::SIMBAD::Client;
my $simbad = Astro::SIMBAD::Client->new ();
print $simbad->query (id => 'Arcturus');
As of release 0.027_01 the SOAP interface is deprecated. The University of
Strasbourg has announced at
<https://cds.u-strasbg.fr/resources.gml?id=soap> that this interface
will not be maintained after April 1 2014, and that this interface will be
stopped on December 31 2018.
Because the SOAP interface is still sort of functional (except for
VO-format queries) as of June 4 2014, I have revised the transition plan
announced with the release of 0.027_01 on October 28 2014.
What I have done as of version 0.031_01 is to add attribute
"emulate_soap_queries". This was false by
default. If this attribute is true, the
"query()" method and friends, instead of
issuing a SOAP request to the SIMBAD server, will instead construct an
equivalent script query, and issue that. The deprecation warning will not be
issued if "emulate_soap_queries" is true,
since the SOAP interface is not being used.
As of March 22 2021, SOAP queries started returning 404. Because
of this, I have made the default of
"emulate_soap_queries" true. Well,
actually I have made it the Boolean inverse of environment variable
ASTRO_SIMBAD_CLIENT_USE_SOAP. This is mostly for my benefit, so I can see if
SOAP has come back.
If SOAP still has not come back after six months, SOAP queries
will become fatal, as will setting
"emulate_soap_queries" to a false
value.
Eventually the SOAP code will be removed. In the meantime all
tests are skipped unless
"ASTRO_SIMBAD_CLIENT_USE_SOAP" is true,
and are marked TODO. Support of SOAP by this module will be on a best-effort
basis; that is, if I can make it work without a huge amount of work I will
-- otherwise SOAP will become unsupported.
This package implements several query interfaces to version 4 of the SIMBAD
on-line astronomical database, as documented at
<http://simbad.u-strasbg.fr/simbad4.htx>. This package will not
work with SIMBAD version 3. Its primary purpose is to obtain SIMBAD
data, though some rudimentary parsing functionality also exists.
There are three ways to access these data.
- URL queries are essentially page scrapers, but their use is
documented, and output is available as HTML, text, or VOTable. URL queries
are implemented by the url_query() method.
- Scripts may be submitted using the script() or
script_file() methods. The former takes as its argument the text of
the script, the latter takes a file name.
- Queries may be made using the web services (SOAP) interface. The
query() method implements this, and queryObjectByBib,
queryObjectByCoord, and queryObjectById have been provided as convenience
methods. As of version 0.027_01, SOAP queries are deprecated. See the NOTICE
section above for the deprecation schedule.
Astro::SIMBAD::Client is object-oriented, with the object
supplying not only the URL scheme and SIMBAD server name, but the default
format and output type for URL and web service queries.
A simple command line client application is also provided, as are
various examples in the eg directory.
The following methods should be considered public:
- $simbad = Astro::SIMBAD::Client->new ();
- This method instantiates an Astro::SIMBAD::Client object. Any arguments
will be passed to the set() method once the object is
instantiated.
- $string = $simbad->agent ();
- This method retrieves the user agent string used to identify this package
in queries to SIMBAD. This string will be the default string for
LWP::UserAgent, with this package name and version number appended in
parentheses. This method is exposed for the curious.
- @attribs = $simbad->attributes ();
- This method retrieves the names of all public attributes, in alphabetical
order. It can be called as a static method, or even as a subroutine.
- $value = $simbad->get ($attrib);
- This method retrieves the current value of the named attribute. It can be
called as a static method to retrieve the default value.
- $result = Parse_TXT_Simple ($data);
- This subroutine (not method) parses the given text data under the
assumption that it was generated using FORMAT_TXT_SIMPLE_BASIC or
something similar. The data is expected to be formatted as follows:
A line consisting of exactly '---' separates objects.
Data appear on lines that look like
name: data
and are parsed into a hash keyed by the given name. If the
line ends with a comma, it is assumed to contain multiple items, and the
data portion of the line is split on the commas; the resultant hash
value is a list reference.
The user would normally not call this directly, but specify it
as the parser for 'txt'-type queries:
$simbad->set (parser => {txt => 'Parse_TXT_Simple'});
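The line format described above can be illustrated in a few lines of Perl. This is a minimal sketch of those rules only, not the code the module actually uses; the subroutine name parse_simple_sketch and the sample text are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in that follows only the rules described above;
# it is not the implementation inside Astro::SIMBAD::Client.
sub parse_simple_sketch {
    my ($text) = @_;
    my @objects;
    my $current;
    foreach my $line (split /\n/, $text) {
        if ($line eq '---') {               # exactly '---' starts a new object
            push @objects, ($current = {});
            next;
        }
        my ($name, $data) = $line =~ m/\A([^:]+):\s*(.*?)\s*\z/
            or next;                        # skip lines that are not "name: data"
        push @objects, ($current = {}) unless $current;
        if ($data =~ s/,\z//) {             # a trailing comma marks a list
            $current->{$name} = [ split /\s*,\s*/, $data ];
        } else {
            $current->{$name} = $data;
        }
    }
    return @objects;
}

# Invented sample input in the documented shape.
my $sample = <<'EOD';
---
name: NAME ARCTURUS
ident: HD 124897, HR 5340,
EOD

my ($star) = parse_simple_sketch($sample);
print "$star->{name} has ", scalar @{ $star->{ident} }, " listed identifiers\n";
```

Scalar fields come back as plain strings and comma-terminated fields as list references, so callers need to check ref() before dereferencing.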
- $result = Parse_VO_Table ($data);
- This subroutine (not method) parses the given VOTable data,
returning a list of anonymous hashes describing the data. The
$data value is split on '<?xml' before parsing,
so that you get multiple VOTables back (rather than a parse error) if that
is what the input contains.
This is not a full-grown VOTable parser capable of
handling the full spec (see
<https://www.ivoa.net/documents/latest/VOT.html>). It is oriented
toward returning <TABLEDATA> contents, and the metadata that can
reasonably be associated with those contents.
NOTE that as of version 0.026_01, the modules needed
to support VO format are not prerequisites of this package. If you need
VO format you will need to install XML::Parser or XML::Parser::Lite.
The return is a list of anonymous hashes, one per
<TABLE>. Each hash contains two keys:
{data} is the data contained in the table, and
{meta} is the metadata for the table.
The {meta} element for the table is a reference to a list of
data gathered from the <TABLE> tag. Element zero is the tag name
('TABLE'), and element 1 is a reference to a hash containing the
attributes of the <TABLE> tag. Subsequent elements if any
represent metadata tags in the order encountered in the parse.
The {data} contains an anonymous list, each element of which
is a row of data from the <TABLEDATA> element of the
<TABLE>, in the order encountered by the parse. Each row is a
reference to a list of anonymous hashes, which represent the individual
data of the row, in the order encountered by the parse. The data hashes
contain two keys:
{value} is the value of the datum with undef for '~', and
{meta} is a reference to the metadata for the datum.
The {meta} element for a datum is a reference to the metadata
tag that describes that datum. This will be an anonymous list, of which
element 0 is the tag ('FIELD'), element 1 is a reference to a hash
containing that tag's attributes, and subsequent elements will be the
contents of the tag (typically including a reference to the list
representing the <DESCRIPTION> tag for that FIELD).
All values are returned as provided by the XML parser; no
further decoding is done. Specifically, the datatype and arraysize
attributes are ignored.
This parser is based on XML::Parser.
The user would normally not call this directly, but specify it
as the parser for 'vo'-type queries:
$simbad->set (parser => {vo => 'Parse_VO_Table'});
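The nested return structure is easier to follow with a concrete walk through it. The sketch below builds a sample by hand in the documented shape (the values and the helper name rows_as_hashes are invented) and flattens each row into a hash keyed by the FIELD's name attribute. It assumes every FIELD carries a name attribute, as the VOTable standard requires.

```perl
use strict;
use warnings;

# Flatten each row into { FIELD name => value }, relying only on the
# layout described above.
sub rows_as_hashes {
    my (@tables) = @_;
    my @rows;
    foreach my $table (@tables) {
        foreach my $row (@{ $table->{data} }) {
            my %record;
            foreach my $datum (@{ $row }) {
                my $attributes = $datum->{meta}[1];   # element 1: tag attributes
                $record{ $attributes->{name} } = $datum->{value};
            }
            push @rows, \%record;
        }
    }
    return @rows;
}

# Hand-built sample in the documented shape; the values are invented.
my $id_field = [ FIELD => { name => 'MAIN_ID', datatype => 'char' } ];
my $ra_field = [ FIELD => { name => 'RA',      datatype => 'char' } ];
my @tables = ( {
    meta => [ TABLE => { name => 'sample' } ],
    data => [
        [
            { value => 'NAME ARCTURUS', meta => $id_field },
            { value => undef,           meta => $ra_field },   # '~' arrives as undef
        ],
    ],
} );

foreach my $record (rows_as_hashes(@tables)) {
    foreach my $name (sort keys %{ $record }) {
        my $value = defined $record->{$name} ? $record->{$name} : '(no value)';
        print "$name = $value\n";
    }
}
```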
- $result = $simbad->query ($query => @args);
- This method is deprecated, and will cease to work on December 31 2018.
Please choose a method that does not use SOAP. See the NOTICE above for
details.
This method issues a web services (SOAP) query to the SIMBAD
database. The $query specifies a SIMBAD query
method, and the @args are the arguments for that
method. Valid $query values and the
corresponding SIMBAD methods and arguments are:
bib => queryObjectByBib ($bibcode, $format, $type)
coo => queryObjectByCoord ($coord, $radius, $format, $type)
id => queryObjectById ($id, $format, $type)
where:
$bibcode is a SIMBAD bibliographic code
$coord is a set of coordinates
$radius is an angular radius around the coordinates
$type is the type of data to be returned
$format is a format appropriate to the data type.
The $type defaults to the value of the
type attribute, and the $format defaults to the
value of the format attribute for the given
$type.
The return value depends on a number of factors:
If the query found nothing, you get undef in scalar context,
and an empty list in list context.
If a parser is defined for the given type, the returned data
will be fed to the parser, and the output of the parser will be
returned. This is assumed to be a list, so a reference to the list will
be used in scalar context. Parser exceptions are not trapped, so the
caller will need to be prepared to deal with malformed data.
Otherwise, the result of the query is returned as-is.
NOTE that this functionality makes use of the
SOAP::Lite module. As of version 0.026_01 of
"Astro::SIMBAD::Client", SOAP::Lite is
not a prerequisite of this module. If you wish to use the
"query()" method, you will have to
install SOAP::Lite separately. This can be done after
"Astro::SIMBAD::Client" is
installed.
- $value = $simbad->queryObjectByBib ($bibcode, $format, $type);
- This method is deprecated, and will cease to work on December 31
2018. Please choose a method that does not use SOAP. See the NOTICE above
for details.
This method is just a convenience wrapper for
$value = $simbad->query (bib => $bibcode, $format, $type);
See the query() documentation for more information.
- $value = $simbad->queryObjectByCoord ($coord, $radius, $format,
$type);
- This method is deprecated, and will cease to work on December 31
2018. Please choose a method that does not use SOAP. See the NOTICE above
for details.
This method is just a convenience wrapper for
$value = $simbad->query (coo => $coord, $radius, $format, $type);
See the query() documentation for more information.
- $value = $simbad->queryObjectById ($id, $format, $type);
- This method is deprecated, and will cease to work on December 31
2018. Please choose a method that does not use SOAP. See the NOTICE above
for details.
This method is just a convenience wrapper for
$value = $simbad->query (id => $id, $format, $type);
See the query() documentation for more information.
- $release = $simbad->release ();
- This method returns the current SIMBAD4 release, as scraped from the
top-level web page. This will look something like 'SIMBAD4 1.045 -
27-Jul-2007'
If called in list context, it returns ($major,
$minor, $point,
$patch, $date). The
returned information corresponding to the scalar example above is:
$major => 4
$minor => 1
$point => 45
$patch => ''
$date => '27-Jul-2007'
The $patch will usually be empty, but
occasionally you get something like release '1.019a', in which case
$patch would be 'a'.
Please note that this method is not based on a
published interface, but is simply a web page scraper, and subject to
all the problems such software is heir to. What the algorithm attempts
to do is to find (and parse, if called in list context) the contents of
the next <td> after 'Release:' (case-insensitive).
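To make the list-context return concrete, here is a minimal sketch that splits a release string of the shape shown above into the same five pieces. The subroutine name split_release and its regular expression are illustrative assumptions, not the module's own scraping code.

```perl
use strict;
use warnings;

# Illustrative only: split a release string of the documented shape
# into (major, minor, point, patch, date).
sub split_release {
    my ($release) = @_;
    my ($major, $minor, $point, $patch, $date) =
        $release =~ m/ SIMBAD (\d+) \s+ (\d+) [.] (\d+) ([a-z]*) \s+ - \s+ (\S+) /smxi
        or return;      # empty list if the string does not look like a release
    # Adding zero drops leading zeros, so '045' becomes 45.
    return ($major + 0, $minor + 0, $point + 0, $patch, $date);
}

my @parts = split_release('SIMBAD4 1.045 - 27-Jul-2007');
print join(', ', map { "'$_'" } @parts), "\n";
```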
- $value = $simbad->script ($script);
- This method submits the given script to SIMBAD4. The
$script variable contains the text of the script;
if you want to submit a script file by name, use the script_file()
method.
If the verbatim attribute is false, the front matter of the
result (up to and including the '::data:::::' line) is stripped. If
there is no '::data:::::' line, the entire script output is raised as an
exception.
If a 'script' parser was specified, the output of the script
(after stripping front matter if that was specified) is passed to it.
The parser is presumed to return a list, so if script() was
called in scalar context you get a reference to that list back.
If no 'script' parser is specified, the output of the script
(after stripping front matter if that was specified) is simply returned
to the caller.
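The front-matter handling described above can be sketched as follows. This is an illustration under stated assumptions, not the module's code: the name strip_front_matter is hypothetical, the sample output is invented, and the marker is assumed to be any line beginning with '::data' followed by at least five colons.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the documented behavior of the
# verbatim attribute.
sub strip_front_matter {
    my ($output, $verbatim) = @_;
    return $output if $verbatim;        # verbatim: hand back everything
    # Remove everything up to and including the marker line; if there is
    # no marker, raise the whole output as the exception.
    $output =~ s/ \A .*? ^ ::data:{5,} [^\n]* (?: \n | \z ) //smx
        or die $output;
    return $output;
}

# Invented sample of raw script output.
my $raw = join "\n",
    '::script::::::',
    'query id Arcturus',
    '::data::::::',
    'name: NAME ARCTURUS',
    '';

print strip_front_matter($raw, 0);
```

Because a missing marker raises an exception, callers that cannot tolerate one would wrap the call in eval and inspect $@.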
- $value = $simbad->script_file ($filename);
- This method submits the given script file to SIMBAD, returning the result
of the script. Unlike script(), the argument is the name of the
file containing the script, not the text of the script. However, if a
parser for 'script' has been specified, it will be applied to the
output.
- $simbad->set ($name => $value ...);
- This method sets the value of the given attributes. More than one
name/value pair may be specified. If called as a static method, it sets
the default value of the attribute.
- $value = $simbad->url_query ($type => ...)
- This method performs a query by URL, returning the results. The type is
one of:
id = query by identifier,
coo = query by coordinates,
ref = query by references,
sam = query by criteria.
The arguments depend on the type, and are documented at
<http://simbad.u-strasbg.fr/guide/sim-url.htx>. They are specified
as name => value. For example:
$simbad->url_query (id =>
Ident => 'Arcturus',
NbIdent => 1
);
Note that in an id query you must specify 'Ident' explicitly.
This is true in general, because it is not always possible to derive the
first argument name from the query type, and consistency was chosen over
brevity.
The output.format argument can be defaulted based on the
object's type setting as follows:
txt becomes 'ASCII',
vo becomes 'VOTable'.
Any other value is passed verbatim.
If the query succeeds, the results will be passed to the
appropriate parser if any. The reverse of the above translation is done
to determine the appropriate parser, so the 'vo' parser (if any) is
called if output.format is 'VOTable', and the 'txt' parser (if any) is
called if output.format is 'ASCII'. If output.format is 'HTML', you will
need to explicitly set up a parser for that.
The type of HTTP interaction depends on the setting of the
post attribute: if true a POST is done; otherwise all arguments are
tacked onto the end of the URL and a GET is done.
The Astro::SIMBAD::Client attributes are documented below. The type of the
attribute is given after the attribute name, in parentheses. The types are:
boolean - a true/false value (in the Perl sense);
hash - a reference to one or more key/value pairs;
integer - an integer;
string - any characters.
Hash values may be specified either as hash references or as
strings. When a hash value is set, the given value updates the hash rather
than replacing it. For example, specifying
$simbad->set (format => {txt => '%MAIN_ID\n'});
does not affect the value of the vo format. If a key is set to the
null value, it deletes the key. All keys in the hash can be deleted by
setting key 'clear' to any true value.
When specifying a string for a hash-valued attribute, it must be
of the form 'key=value'. For example,
$simbad->set (format => 'txt=%MAIN_ID\n');
does the same thing as the previous example. Specifying the key
name without an = sign deletes the key (e.g. set (format => 'txt')).
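The update rules for hash-valued attributes can be summarized in a short sketch. It is not the module's implementation: the name update_hash_attribute is hypothetical, "null" is assumed to mean undef or the empty string, and 'clear' is assumed to be applied before the other keys in the same call.

```perl
use strict;
use warnings;

# Hypothetical model of how set() treats a hash-valued attribute.
sub update_hash_attribute {
    my ($current, $value) = @_;
    my %update;
    if (ref $value eq 'HASH') {
        %update = %{ $value };
    } else {
        my ($key, $new) = split /=/, $value, 2;   # 'key=value' or bare 'key'
        %update = ($key => $new);
    }
    %{ $current } = () if delete $update{clear};  # a true 'clear' empties the hash
    foreach my $key (keys %update) {
        if (defined $update{$key} && $update{$key} ne '') {
            $current->{$key} = $update{$key};     # update, do not replace
        } else {
            delete $current->{$key};              # null value deletes the key
        }
    }
    return $current;
}

# Placeholder values; only the keys matter for the illustration.
my %format = (txt => 'old txt format', vo => 'old vo format');
update_hash_attribute(\%format, { txt => '%MAIN_ID\n' });   # vo is untouched
update_hash_attribute(\%format, 'vo');                      # bare key deletes vo
print join(', ', sort keys %format), "\n";
```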
The Astro::SIMBAD::Client class has the following attributes:
- autoload
- This Boolean attribute determines whether setting the parser should
attempt to autoload its package.
The default is 1 (i.e. true).
- debug
- This integer attribute turns on debug output. It is unsupported in the
sense that the author makes no claim what will happen if it is non-zero.
The default value is 0.
- delay
- This numeric attribute sets the minimum delay in seconds between requests,
so as not to overload the SIMBAD server. If Time::HiRes can be loaded, you
can set delays in fractions of a second; otherwise the delays will be
rounded to the nearest second.
Delays are from the time of the last request to the server, no
matter which object issued the request. The delay can be set to 0, but
not to a negative number.
The default is 3.
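The delay behavior described above amounts to a shared timestamp plus a wait. The following is a minimal sketch of that idea, not the module's code; remaining_delay and throttle are hypothetical names, and a file-scoped variable stands in for the timestamp shared by all objects.

```perl
use strict;
use warnings;

# Use Time::HiRes when it can be loaded; otherwise fall back to whole seconds.
my $have_hires = eval { require Time::HiRes; 1 };
my $last_request;   # shared by every caller, whichever object made the request

# Seconds still to wait, given the delay, the last request time, and now.
sub remaining_delay {
    my ($delay, $last, $now) = @_;
    die "delay may not be negative\n" if $delay < 0;
    return 0 unless defined $last;
    my $wait = $last + $delay - $now;
    return $wait > 0 ? $wait : 0;
}

# Sleep as needed, record the request time, and report how long we waited.
sub throttle {
    my ($delay) = @_;
    my $now  = $have_hires ? Time::HiRes::time() : time;
    my $wait = remaining_delay($delay, $last_request, $now);
    if ($wait > 0) {
        $have_hires
            ? Time::HiRes::sleep($wait)
            : sleep(int($wait + 0.5));      # nearest whole second
    }
    $last_request = $have_hires ? Time::HiRes::time() : time;
    return $wait;
}

throttle(0.2);                  # first request: nothing to wait for
my $waited = throttle(0.2);     # second request: waits out the remainder
printf "waited %.2f seconds\n", $waited;
```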
- emulate_soap_queries
- If this Boolean attribute is true, the methods that would normally use the
SOAP interface (that is, "query()" and
friends) use the script interface instead.
The purpose of this attribute is to give the user a way to
manage the deprecation and ultimate removal of the SOAP interface from
the SIMBAD servers. It may go away once that interface disappears, but
it will be put through a deprecation cycle.
The default is the Boolean inverse of environment variable
ASTRO_SIMBAD_CLIENT_USE_SOAP; that is, true unless that variable is set
to a true value.
- format
- This attribute holds the default format for a given query() output
type. It is specified as a reference to a hash. See
<http://simbad.u-strasbg.fr/guide/sim-fscript.htx> for how to
specify formats for each output type. Output type 'script' is used to
specify a format for the script() method.
The format can be specified either literally, or as a
subroutine name or code reference. A string is assumed to be a
subroutine name if it looks like one (i.e. matches (\w+::)*\w+), and if
the given subroutine is actually defined. If no namespace is specified,
all namespaces in the call tree are checked. If a code reference or
subroutine name is specified, that code is executed, and the result
becomes the format.
The following formats are defined in this module:
FORMAT_TXT_SIMPLE_BASIC -
a simple-to-parse text format providing basic information;
FORMAT_TXT_YAML_BASIC -
pseudo-YAML (parsable by YAML::Load) providing basic info;
FORMAT_VO_BASIC -
VOTable field names providing basic information.
The FORMAT_TXT_YAML_BASIC format attempts to provide data
structured similarly to the output of Astro::SIMBAD, though
Astro::SIMBAD::Client does not bless the output into any class.
A simple way to examine these formats is (e.g.)
use Astro::SIMBAD::Client;
print Astro::SIMBAD::Client->FORMAT_TXT_YAML_BASIC;
Before a format is actually used it is preprocessed in a
manner depending on its intended output type. For 'vo' formats, leading
and trailing whitespace are stripped. For 'txt' and 'script' formats,
line breaks are stripped.
The default specifies formats for output types 'txt' and 'vo'.
The 'txt' default is FORMAT_TXT_YAML_BASIC; the 'vo' default is
FORMAT_VO_BASIC.
There is no way to specify a default format for the
'script_file' method.
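The rules above for resolving and preprocessing a format can be sketched as follows. This is a simplified illustration, not the module's logic: resolve_format and my_txt_format are hypothetical names, and unqualified subroutine names are looked up only in the current package rather than throughout the call tree.

```perl
use strict;
use warnings;

# Hypothetical model: turn a format specification into the text to send.
sub resolve_format {
    my ($format, $type) = @_;
    if (ref $format eq 'CODE') {
        $format = $format->();
    } elsif ($format =~ m/\A(?:\w+::)*\w+\z/) {
        # Simplification: unqualified names are searched for in this
        # package only.
        my $name = $format =~ m/::/ ? $format : __PACKAGE__ . "::$format";
        no strict 'refs';
        $format = $name->() if defined &{$name};
    }
    if ($type eq 'vo') {
        $format =~ s/\A\s+//;       # 'vo': trim leading whitespace
        $format =~ s/\s+\z//;       # ... and trailing whitespace
    } else {
        $format =~ s/\n//g;         # 'txt' and 'script': strip line breaks
    }
    return $format;
}

# A format subroutine laid out on several lines for readability. The \n
# sequences are literal backslash-n text, as in the set() examples above.
sub my_txt_format {
    return <<'EOD';
---\n
name: %MAIN_ID\n
EOD
}

print resolve_format('my_txt_format', 'txt'), "\n";
```

A string that looks like a subroutine name but names nothing defined falls through and is used literally, which matches the "actually defined" condition stated above.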
- parser
- This attribute specifies the parser for a given output type. The actual
value is a hash reference; the keys are valid output types, and the values
are as described below.
Parsers may be specified by either a code reference, or by the
text name of a subroutine. If specified as text and the name is not
qualified by a package name, the calling package is assumed. The parser
must be defined, and must take as its lone argument the text to be
parsed.
If the parser for a given output type is defined, query
results of that type will be passed to the parser, and the result
returned. Otherwise the query results will be returned verbatim.
The output types are anything legal for the query()
method (i.e. 'txt' and 'vo' at the moment), plus 'script' for a script
parser. All default to '', meaning no parser is used.
- post
- This Boolean attribute specifies that url_query() data should be
acquired using a POST request. If false, a GET request is used.
The default is 1 (i.e. true).
- scheme
- This string attribute specifies the server's URI scheme to be used. As of
January 27 2017, either 'http' or
'https' is valid.
The default is the value of environment variable
"ASTRO_SIMBAD_CLIENT_SCHEME", or
'http' if the environment variable is not set,
or if it contains a value other than 'http' or
'https', case-insensitive.
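The defaulting rule for the scheme can be expressed as a small function. This is a sketch only; the name default_scheme is hypothetical, and folding the value to lower case is an assumption drawn from the "case-insensitive" wording above.

```perl
use strict;
use warnings;

# Hypothetical model of how the default scheme is chosen from the
# environment (passed in as a hash reference so it can be exercised
# without touching %ENV).
sub default_scheme {
    my ($env) = @_;
    my $value = $env->{ASTRO_SIMBAD_CLIENT_SCHEME};
    my $scheme = lc( defined $value ? $value : '' );
    return $scheme =~ m/\Ahttps?\z/ ? $scheme : 'http';
}

print default_scheme(\%ENV), "\n";
```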
- server
- This string attribute specifies the server to be used. As of March 10
2010, either 'simbad.u-strasbg.fr' or
'simbad.cfa.harvard.edu' is valid.
The default is the value of environment variable
ASTRO_SIMBAD_CLIENT_SERVER, or
'simbad.u-strasbg.fr' if the environment
variable is not set.
- type
- This string attribute specifies the default output type. Note that
although SIMBAD only defines types 'txt' and 'vo', we do not validate
this, since the SIMBAD web site hints at more types to come. SIMBAD
appears to treat an unrecognized type as 'txt'.
The default is 'txt'.
- url_args
- This attribute specifies default arguments for the url_query() method as a
reference to a hash of argument name/value pairs. These will be applied
only if not specified in the method call. Any argument given in the SIMBAD
documentation may be specified. For example:
$simbad->set( url_args => { coodisp1 => 'd' } );
causes the query to return coordinates in degrees and decimals
rather than in sexagesimal (degrees, minutes, and seconds or hours,
minutes, and seconds, as the case may be). Note, however, that VOTable
output does not seem to be affected by this.
The initial default for this attribute is an empty hash; that
is, no arguments are defaulted by this mechanism.
- verbatim
- This Boolean attribute specifies whether
"script()" and
"script_file()" are to strip the front
matter from the script output. If false, everything up to and including
the '::data:::::' line is removed before passing the output to the parser
or returning it to the user. If true, the script output is passed to the
parser or returned to the user unmodified.
The default is 0 (i.e. false).
If assigned a true value, environment variable ASTRO_SIMBAD_CLIENT_SCHEME
specifies the default for the 'scheme' attribute. It is read when the module is
loaded. If you want to change the default after the module has been loaded,
make a static call to "set()".
If assigned a true value, environment variable ASTRO_SIMBAD_CLIENT_SERVER
specifies the default for the 'server' attribute. It is read when the module is
loaded. If you want to change the default after the module has been loaded,
make a static call to "set()".
The Boolean inverse of environment variable ASTRO_SIMBAD_CLIENT_USE_SOAP
specifies the default for the 'emulate_soap_queries' attribute. It is read
when the module is loaded. If you want to change the default after the module
has been loaded, make a static call to "set()".
The following environment variables control use of a proxy server. They are
implemented by LWP::UserAgent, but are documented fairly obscurely, so I have
chosen to say a few words about them here:
PERL_LWP_ENV_PROXY
If this environment variable is set to a true value,
LWP::UserAgent will take proxy settings for each URL scheme from environment
variables named "xxxx_proxy" (yes,
lower-case), where the 'xxxx' is the scheme name.
The content of each scheme-specific environment variable is the URL
(scheme, host, and port) of the proxy. The following are relevant to users
of this module:
http_proxy
This environment variable is set to the URL of the
"http:" proxy server.
https_proxy
This environment variable is set to the URL of the
"https:" proxy server.
Support is by the author. Please file bug reports at
<https://rt.cpan.org/Public/Dist/Display.html?Name=Astro-SIMBAD-Client>,
<https://github.com/trwyant/perl-Astro-SIMBAD-Client/issues>, or in
electronic mail to the author.
Thomas R. Wyant, III (wyant at cpan dot org)
Copyright (C) 2005-2021 by Thomas R. Wyant, III
This program is free software; you can redistribute it and/or
modify it under the same terms as Perl 5.10.0. For more details, see the
full text of the licenses in the directory LICENSES.
This program is distributed in the hope that it will be useful,
but without any warranty; without even the implied warranty of
merchantability or fitness for a particular purpose.