NAME
    Bio::GFF3::LowLevel - fast, low-level functions for parsing and
    formatting GFF3

SYNOPSIS
      use Bio::GFF3::LowLevel qw/ gff3_parse_feature /;

      open my $gff3_fh, '<', 'myfile.gff3'
          or die "cannot open myfile.gff3: $!";
      while( <$gff3_fh> ) {
          next if /^#/;   # skip comment and directive lines
          my $feat = gff3_parse_feature( $_ );
      }

DESCRIPTION
    These are low-level, fast functions for parsing and formatting GFF
    version 3 files. All they do is convert back and forth between
    low-level Perl data structures and GFF3 text.

    Sometimes this is what you need when you are just doing simple
    transformations on GFF3. I found myself writing these functions over
    and over again, until I finally got fed up enough to just package them
    up properly.

    These functions do no validation and do not reconstruct feature
    hierarchies or anything like that. If you want that, use
    Bio::FeatureIO.

    All of the functions in this module are EXPORT_OK, meaning that you
    can list their names in the "use" statement for this module to import
    them into your namespace.

FUNCTIONS
  gff3_parse_feature( $line )
    Given a string containing a GFF3 feature line (i.e. not a comment),
    parses it and returns a hashref of its information, of the form:

        {
            seq_id     => 'chr02',
            source     => 'AUGUSTUS',
            type       => 'transcript',
            start      => '23486',
            end        => '48209',
            score      => '0.02',
            strand     => '+',
            phase      => undef,
            attributes => {
                ID     => [ 'chr02.g3.t1' ],
                Parent => [ 'chr02.g3' ],
            },
        }

    Note that all values are simple scalars, except for "attributes",
    which is a hashref as returned by "gff3_parse_attributes" below.

    Unescaping is performed according to the GFF3 specification.

  gff3_parse_attributes( $attr_string )
    Given a GFF3 attribute string, parses it and returns a hashref of its
    data, of the form:

        {
            'attribute_name' => [ value, value, ... ],
            ...
        }

    Always returns a hashref. If the passed attribute string is undefined
    or ".", the returned hashref will be empty. Attribute values are
    always arrayrefs, even if they have only one value.

  gff3_parse_directive( $line )
    Parses a GFF3 directive/metadata line. Returns a hashref of the form:

        {
            directive => 'directive-name',
            value     => 'the contents of the directive',
        }

    Returns nothing if the line could not be parsed as a GFF3 directive.

    In addition, "sequence-region" and "genome-build" directives are
    parsed further. "sequence-region" hashrefs have additional "seq_id",
    "start", and "end" keys, and "genome-build" hashrefs have additional
    "source" and "buildname" keys.

  gff3_format_feature( \%fields )
    Given a hashref of feature information in the same format returned by
    "gff3_parse_feature" above, constructs a correctly escaped line of
    GFF3 encoding that information.

    The line ends with a single newline character (a UNIX-style line
    ending), regardless of the local operating system.

  gff3_format_attributes( \%attrs )
    Given a hashref of GFF3 attributes in the same format returned by
    "gff3_parse_attributes" above, returns a correctly formatted and
    escaped GFF3 attribute string (the 9th column of a GFF3 feature line)
    encoding those attributes.

    For convenience, single-valued attributes can have simple scalars as
    values in the passed hashref. For example, if a feature has only one
    "ID" attribute (as it should), you can pass "{ ID => 'foo' }" instead
    of "{ ID => ['foo'] }".

  gff3_escape( $string )
    Given a string, escapes special characters in that string according to
    the GFF3 specification.

  gff3_unescape( $string )
    Unescapes a GFF3-escaped string.

AUTHOR
    Robert Buels <rmb32@cornell.edu>

COPYRIGHT AND LICENSE
    This software is copyright (c) 2012 by Robert Buels.

    This is free software; you can redistribute it and/or modify it under
    the same terms as the Perl 5 programming language system itself.