NAME
Bio::DB::SoapEUtilities - Interface to the NCBI Entrez web service *BETA*

SYNOPSIS
    use Bio::DB::SoapEUtilities;

    # factory construction
    my $fac = Bio::DB::SoapEUtilities->new();

    # executing a utility call

    # get an iterable adaptor
    my $links = $fac->elink(
        -dbfrom => 'protein',
        -db     => 'taxonomy',
        -id     => \@protein_ids )->run( -auto_adapt => 1 );

    # get a Bio::DB::SoapEUtilities::Result object
    my $result = $fac->esearch(
        -db   => 'gene',
        -term => 'sonic and human' )->run;

    # get the raw XML message
    my $xml = $fac->efetch(
        -db => 'gene',
        -id => \@gids )->run( -raw_xml => 1 );

    # change parameters
    my $new_result = $fac->efetch(
        -db => 'gene',
        -id => \@more_gids )->run;

    # reset parameters
    $fac->efetch->reset_parameters(
        -db => 'nucleotide',
        -id => $nucid );
    $result = $fac->efetch->run;

    # parsing and iterating the results
    my $count = $result->count;
    my @ids   = $result->ids;

    my $submitted;
    while ( my $linkset = $links->next_link ) {
        $submitted = $linkset->submitted_id;
    }

    my ($taxid) = $links->id_map($submitted_prot_id);
    my $species_io = $fac->efetch(
        -db => 'taxonomy',
        -id => $taxid )->run( -auto_adapt => 1 );
    my $species  = $species_io->next_species;
    my $linnaeus = $species->binomial;

DESCRIPTION
This module allows the user to query the NCBI Entrez database via its SOAP (Simple Object Access Protocol) web service (described at <http://eutils.ncbi.nlm.nih.gov/entrez/eutils/soap/v2.0/DOC/esoap_help.html>). The basic tools ("einfo, esearch, elink, efetch, espell, epost") are available as methods on a "SoapEUtilities" factory object. Parameters for each tool can be queried, set and reset for each method through the Bio::ParameterBaseI standard calls ("available_parameters(), set_parameters(), get_parameters(), reset_parameters()"). Returned data can be retrieved, accessed and parsed in several ways, according to user preference.
Adaptors and object iterators are available for "efetch", "egquery", "elink", and "esummary" results.

USAGE
The "SoapEU" system has been designed to be as easy (few includes, available parameter facilities, reasonable defaults, intuitive aliases, built-in pipelines) or as complex (accessors for underlying low-level objects, all parameters accessible, custom hooks for builder objects, facilities for providing local copies of WSDLs) as the user requires or desires. (To the extent that it does not succeed in either direction, it is up to the user to report this to the mailing list; see "FEEDBACK".)

Factory
To begin, make a factory:

    my $fac = Bio::DB::SoapEUtilities->new();

From the factory, utilities are called, parameters are set, and results or adaptors are retrieved.

If you have your own copy of the WSDL, use

    my $fac = Bio::DB::SoapEUtilities->new( -wsdl_file => $my_wsdl );

otherwise, the correct one will be obtained over the network (by Bio::DB::ESoap and friends).

Utilities and parameters
To run any of the standard NCBI EUtilities ("einfo, esearch, esummary, elink, efetch, egquery, epost, espell"), call the desired utility from the factory. To use a utility, you must set its parameters and run it to get a result.
TMTOWTDI:

    # verbose
    my $fetch = $fac->efetch();
    $fetch->set_parameters( -db => 'gene', -id => [828392, 790] );
    my $result = $fetch->run;

    # compact
    my $result = $fac->efetch( -db => 'gene', -id => [828392, 790] )->run;

    # change ids
    $fac->efetch->set_parameters( -id => 470338 );
    $result = $fac->efetch->run;

    # another util
    $result = $fac->esearch( -db => 'protein', -term => 'BRCA and human' )->run;

    # the utilities are kept separate
    my %search_params = $fac->esearch->get_parameters;
    my %fetch_params  = $fac->efetch->get_parameters;
    $search_params{db}; # is 'protein'
    $fetch_params{db};  # is 'gene'

The factory is Bio::ParameterBaseI compliant: that means you can find out what you can set with

    my @available_search  = $fac->esearch->available_parameters;
    my @available_egquery = $fac->egquery->available_parameters;

For more information on parameters, see <http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html>.

Results
The "intermediate" object for "SoapEU" query results is the Bio::DB::SoapEUtilities::Result. This is a BioPerl-style parse of the SOAP message sent by NCBI when a query is executed with "run()". This can be very useful on its own, but most users will likely want to proceed directly to "Adaptors", which take a "Result" and turn it into more intuitive and familiar BioPerl objects. Go there if the following details are too gory.

Results can be parsed to a greater or lesser degree, depending on the parameters passed to the factory "run()" method. To get the raw XML message with no parsing, do

    my $xml = $fac->$util->run( -raw_xml => 1 ); # $xml is a scalar string

To retrieve a Bio::DB::SoapEUtilities::Result object with limited parsing, but with accessors to the SOAP::SOM message (provided by SOAP::Lite), do

    my $result = $fac->$util->run( -no_parse => 1 );
    my $som = $result->som;
    my $method_hash = $som->method; # etc...

To retrieve a "Result" object with message elements parsed into accessors, including "count()" and "ids()", run without arguments:

    my $result = $fac->esearch->run();
    my $count = $result->count;
    my @Count = $result->Count;     # counts for each member of
                                    # the translation stack
    my @ids = $result->IdList_Id;   # from automatic message parsing
    @ids = $result->ids;            # a convenient alias

See Bio::DB::SoapEUtilities::Result for more, even gorier details.

Adaptors
Adaptors convert EUtility "Result" objects into convenient objects, via a handle that usually provides an iterator, in the spirit of Bio::SeqIO. These are probably more useful than the "Result" to the typical user, and so you can retrieve them automatically by setting the "run()" parameter "-auto_adapt => 1".

In general, retrieve an adaptor like so:

    $adp = $fac->$util->run( -auto_adapt => 1 );
    # iterate...
    while ( my $obj = $adp->next_obj ) {
        # do stuff with $obj
    }

The adaptor itself occasionally possesses useful methods besides the iterator. The method "next_obj" always works, but a natural alias is also always available:

    $seqio = $fac->esearch->run( -auto_adapt => 1 );
    while ( my $seq = $seqio->next_seq ) {
        # do stuff with $seq
    }

In the above example, "-auto_adapt => 1" also instructs the factory to perform an "efetch" based on the ids returned by the "esearch" (if any), so that the adaptor returned iterates over Bio::SeqI objects.

Here is a rundown of the different adaptor flavors:

Web environments and query keys
To make large or complex requests for data, or to share queries, it may be helpful to use the NCBI WebEnv system to manage your queries. Each EUtility accepts the following parameters for this purpose:

    -usehistory
    -WebEnv
    -QueryKey

These store the details of your queries server-side. "SoapEU" attempts to make using these relatively straightforward. Use "Result" objects to obtain the correct parameters, and don't forget "-usehistory":

    my $result1 = $fac->esearch(
        -term       => 'BRCA and human',
        -db         => 'nucleotide',
        -usehistory => 1 )->run( -no_parse => 1 );

    my $result2 = $fac->esearch(
        -term     => 'AND early onset',
        -QueryKey => $result1->query_key,
        -WebEnv   => $result1->webenv )->run( -no_parse => 1 );

    my $result = $fac->esearch(
        -db         => 'protein',
        -term       => 'sonic',
        -usehistory => 1 )->run( -no_parse => 1 );

    # later (but not more than 8 hours later) that day...

    $result = $fac->esearch(
        -WebEnv   => $result->webenv,
        -QueryKey => $result->query_key,
        -RetMax   => 800 # get 'em all
        )->run; # note we're parsing the result...
    my @all_ids = $result->ids;

Error checking
Two kinds of errors can ensue on an Entrez SOAP run. One is a SOAP fault, and the other is an error sent in a non-faulted SOAP message from the server. The distinction is probably systematic, and I would welcome an explanation of it. To check for result errors, try something like:

    my $result;
    unless ( $result = $fac->$util->run ) {
        die $fac->errstr; # this will catch a SOAP fault
    }
    # a valid result object was returned, but it may carry an error
    if ( $result->count == 0 ) {
        warn "No hits returned";
        if ( $result->ERROR ) {
            warn "Entrez error : " . $result->ERROR;
        }
    }

Error handling will be improved in the package eventually.

SEE ALSO
Bio::DB::EUtilities, Bio::DB::SoapEUtilities::Result, Bio::DB::ESoap.

FEEDBACK
Mailing Lists
User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to the Bioperl mailing list.
Your participation is much appreciated.

    bioperl-l@bioperl.org                  - General discussion
    http://bioperl.org/wiki/Mailing_lists  - About the mailing lists

Support
Please direct usage questions or support issues to the mailing list:

    bioperl-l@bioperl.org

rather than to the module maintainer directly. Many experienced and responsive experts will be able to look at the problem and quickly address it. Please include a thorough description of the problem with code and data examples if at all possible.

Reporting Bugs
Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via the web:

    http://redmine.open-bio.org/projects/bioperl/

AUTHOR - Mark A. Jensen
Email maj -at- fortinbras -dot- us

APPENDIX
The rest of the documentation details each of the object methods. Internal methods are usually preceded with a _

new
    Title   : new
    Usage   : my $eutil = Bio::DB::SoapEUtilities->new();
    Function: Builds a new Bio::DB::SoapEUtilities object
    Returns : an instance of Bio::DB::SoapEUtilities
    Args    :

run()
    Title   : run
    Usage   : $fac->$eutility->run(@args)
    Function: Execute the EUtility
    Returns : true on success, false on fault or error
              (reason in errstr(), for more detail check the SOAP message
              in last_result() )
    Args    : named params appropriate to utility
              -auto_adapt => boolean ( return an iterator over results as
                             appropriate to util if true )
              -raw_xml => boolean ( return raw xml result; no processing )
              Bio::DB::SoapEUtilities::Result constructor parameters

Useful Accessors
response_message()
    Title   : response_message
    Aliases : last_response, last_result
    Usage   : $som = $fac->response_message
    Function: get the last response message
    Returns : a SOAP::SOM object
    Args    : none

webenv()
    Title   : webenv
    Usage   :
    Function: contains WebEnv key referencing the session (set after run() )
    Returns : scalar
    Args    : none

errstr()
    Title   : errstr
    Usage   : $fac->errstr
    Function: get the last error, if any
    Example :
    Returns : value of errstr (a scalar)
    Args    : none

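As a rough sketch of how "errstr()" might be combined with the Result-level checks described under "Error checking", the helper below sorts the outcome of a "run()" into four cases. The name "check_run" and the stand-in classes "MockFac" and "MockResult" are hypothetical and are not part of Bio::DB::SoapEUtilities; the stand-ins exist only so the sketch runs without BioPerl or network access. It assumes only the documented behaviour: "run()" returns false on a SOAP fault, and a valid "Result" offers "count" and "ERROR".

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the module: classify the outcome of a
# run() call. Returns a two-element list: (status, message).
sub check_run {
    my ($fac, $result) = @_;

    # run() returns false on a SOAP fault; the reason is in errstr()
    return ( 'fault', $fac->errstr // 'unknown fault' ) unless $result;

    # a valid Result object may still carry an error sent by Entrez
    if ( ( $result->count // 0 ) == 0 ) {
        my $err = $result->ERROR;
        return ( 'entrez_error', $err ) if $err;
        return ( 'no_hits', 'No hits returned' );
    }
    return ( 'ok', '' );
}

# Stand-in for the factory (assumption: only errstr() is needed here)
package MockFac {
    sub new    { my ($class, $err) = @_; return bless { errstr => $err }, $class }
    sub errstr { return $_[0]{errstr} }
}

# Stand-in for a Result object (assumption: only count() and ERROR() are needed)
package MockResult {
    sub new   { my ($class, %args) = @_; return bless {%args}, $class }
    sub count { return $_[0]{count} }
    sub ERROR { return $_[0]{ERROR} }
}

# Simulate a SOAP fault: run() would have returned a false value
my $mock_fac = MockFac->new('SOAP fault: timeout');
my ($status, $message) = check_run( $mock_fac, undef );
print "$status: $message\n";
```

With a real factory, the call would presumably look like "check_run( $fac, $fac->$util->run )". Returning a status instead of calling "die" leaves it to the caller to decide which of the cases are fatal.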
Bio::ParameterBaseI compliance
available_parameters()
    Title   : available_parameters
    Usage   :
    Function: get available request parameters for calling utility
    Returns :
    Args    : -util => $desired_utility [optional, default is caller utility]

set_parameters()
    Title   : set_parameters
    Usage   :
    Function:
    Returns : none
    Args    : -util => $desired_utility [optional, default is caller utility],
              named utility arguments

get_parameters()
    Title   : get_parameters
    Usage   :
    Function:
    Returns : array of named parameters
    Args    : utility (scalar string) [optional]
              (default is caller utility)

reset_parameters()
    Title   : reset_parameters
    Usage   :
    Function:
    Returns : none
    Args    : -util => $desired_utility [optional, default is caller utility],
              named utility arguments

parameters_changed()
    Title   : parameters_changed
    Usage   :
    Function:
    Returns : boolean
    Args    : utility (scalar string) [optional]
              (default is caller utility)

_soap_facs()
    Title   : _soap_facs
    Usage   : $self->_soap_facs($util, $fac)
    Function: caches Bio::DB::ESoap factories for the eutils in use by
              this instance
    Example :
    Returns : Bio::DB::ESoap object
    Args    : $eutility, [optional on set] $esoap_factory_object

_caller_util()
    Title   : _caller_util
    Usage   : $self->_caller_util($newval)
    Function: the utility requested off the main SoapEUtilities object
    Example :
    Returns : value of _caller_util (a scalar string, a valid eutility)
    Args    : on set, new value (a scalar string [optional])