|
|
| |
RNALALIFOLD(1) |
User Commands |
RNALALIFOLD(1) |
RNALalifold - manual page for RNALalifold 2.5.0
RNALalifold [options] <file1.aln>
RNALalifold 2.5.0
calculate locally stable secondary structures for a set of aligned
RNAs
reads aligned RNA sequences from stdin or file.aln and calculates
locally stable RNA secondary structure with a maximal base pair span. For a
sequence of length n and a base pair span of L the algorithm uses only
O(n+L*L) memory and O(n*L*L) CPU time. Thus it is practical to
"scan" very large genomes for short RNA
- structures.
- -h, --help
- Print help and exit
- --detailed-help
- Print help, including all details and hidden options, and exit
- --full-help
- Print help, including hidden options, and exit
- -V, --version
- Print version and exit
- Command line options which alter the general behavior of this program
- -v, --verbose
- Be verbose.
- (default=off)
- -q, --quiet
- Be quiet. (default=off)
- This option can be used to minimize the output of additional information
and non-severe warnings which otherwise might spam stdout/stderr.
- --noconv
- Do not automatically substitute nucleotide "T" with
"U"
- (default=off)
- -f, --input-format=C|S|F|M
- File format of the input multiple sequence alignment (MSA).
- If this parameter is set, the input is considered to be in a particular
file format. Otherwise, the program tries to determine the file format
automatically, if an input file was provided in the set of parameters. In
case the input MSA is provided in interactive mode, or from a terminal
(TTY), the programs default is to assume CLUSTALW format. Currently, the
following formats are available: ClustalW (C), Stockholm 1.0 (S),
FASTA/Pearson (F), and MAF (M).
- --csv
- Create comma separated output (csv)
- (default=off)
- --aln[=prefix]
- Produce output alignments and secondary structure plots for each hit
found.
- This option tells the program to produce, for each hit, a colored and
structure annotated (sub)alignment and secondary structure plot in
PostScript format. It also adds the subalignment hit into a
multi-Stockholm formatted file "RNALalifold_results.stk". The
postscript output file names are "aln_start_end.eps" and
"ss_start_end.eps". All files will be created in the current
directory. The optional argument string can be used to set a specific
prefix that is used to name the output files. The file names then become
"prefix_aln_start_end.eps", "prefix_ss_start_end.eps",
and "prefix.stk". Note: Any special characters in the prefix
will be replaced by the filename delimiter, hence there is no way to pass
an entire directory path through this option yet. (See also the
"--filename-delim" parameter)
- --aln-EPS[=prefix]
- Produce colored and structure annotated subalignment for each hit
- The default file name used for the output is "aln_start_end.eps"
where "start" and "end" denote the first and last
column of the subalignment relative to the input (1-based). Users may
change the filename to "prefix_aln_start_end.eps" by specifying
the prefix as optional argument. Files will be create in the current
directory. Note: Any special characters in the prefix will be replaced by
the filename delimiter, hence there is no way to pass an entire directory
path through this option yet. (See also the "--filename-delim"
parameter)
- --aln-EPS-cols=INT
- Number of columns in colored EPS alignment output.
- (default=`60')
- A value less than 1 indicates that the output should not be wrapped at
all.
- --aln-EPS-ss[=prefix]
- Produce colored consensus secondary structure plots in PostScript
format
- The default file name used for the output is "ss_start_end.eps"
where "start" and "end" denote the first and last
column of the subalignment relative to the input (1-based). Users may
change the filename to "prefix_ss_start_end.eps" by specifying
the prefix as optional argument. Files will be create in the current
directory. Note: Any special characters in the prefix will be replaced by
the filename delimiter, hence there is no way to pass an entire directory
path through this option yet. (See also the "--filename-delim"
parameter)
- --aln-stk[=prefix]
- Add hits to a multi-Stockholm formatted output file.
- (default=`RNALalifold_results')
- The default file name used for the output is
"RNALalifold_results.stk". Users may change the filename to
"prefix.stk" by specifying the prefix as optional argument. The
file will be create in the current directory if it does not already exist.
In case the file already exists, output will be appended to it. Note: Any
special characters in the prefix will be replaced by the filename
delimiter, hence there is no way to pass an entire directory path through
this option yet. (See also the "--filename-delim"
parameter)
- --auto-id
- Automatically generate an ID for each alignment.
- (default=off)
- The default mode of RNALalifold is to automatically determine an ID from
the input alignment if the input file format allows to do that. Alignment
IDs are, for instance, usually given in Stockholm 1.0 formatted input. If
this flag is active, RNALalifold ignores any IDs retrieved from the input
and automatically generates an ID for each alignment.
- --id-prefix=prefix
- Prefix for automatically generated IDs (as used in output file names)
- (default=`alignment')
- If this parameter is set, each alignment will be prefixed with the
provided string. Hence, the output files will obey the following naming
scheme: "prefix_xxxx_ss.ps" (secondary structure plot),
"prefix_xxxx_dp.ps" (dot-plot), "prefix_xxxx_aln.ps"
(annotated alignment), etc. where xxxx is the alignment number beginning
with the second alignment in the input. Use this setting in conjunction
with the --continuous-ids flag to assign IDs beginning with the
first input alignment.
- --id-delim=delimiter
- Change the delimiter between prefix and increasing number for
automatically generated IDs (as used in output file names)
- (default=`_')
- This parameter can be used to change the default delimiter "_"
between
- the prefix string and the increasing number for automatically generated
ID.
- --id-digits=INT
- Specify the number of digits of the counter in automatically generated
alignment IDs.
- (default=`4')
- When alignments IDs are automatically generated, they receive an
increasing number, starting with 1. This number will always be left-padded
by leading zeros, such that the number takes up a certain width. Using
this parameter, the width can be specified to the users need. We allow
numbers in the range [1:18].
- --id-start=LONG
- Specify the first number in automatically generated alignment IDs.
- (default=`1')
- When alignment IDs are automatically generated, they receive an increasing
number, usually starting with 1. Using this parameter, the first number
can be specified to the users requirements. Note: negative numbers are not
allowed. Note: Setting this parameter implies continuous alignment IDs,
i.e. it activates the --continuous-ids flag.
- --filename-delim=delimiter
- Change the delimiting character that is used
- for sanitized filenames
- (default=`ID-delimiter')
- This parameter can be used to change the delimiting character used while
sanitizing filenames, i.e. replacing invalid characters. Note, that the
default delimiter ALWAYS is the first character of the "ID
delimiter" as supplied through the --id-delim option. If the
delimiter is a whitespace character or empty, invalid characters will be
simply removed rather than substituted. Currently, we regard the following
characters as illegal for use in filenames: backslash '\', slash '/',
question mark '?', percent sign '%', asterisk '*', colon ':', pipe symbol
'|', double quote '"', triangular brackets '<' and '>'.
- --split-contributions
- Split the free energy contributions into separate parts
- (default=off)
- By default, only the total energy contribution for each hit is returned.
Using this option, this contribution is split into individual parts, i.e.
the Nearest Neighbor model energy, the covariance pseudo energy, and if
applicable, a remaining pseudo energy derived from special constraints,
such as probing signals like SHAPE.
- Command line options to interact with the structure constraints feature of
this program
- --shape=file1,file2
- Use SHAPE reactivity data to guide structure predictions
- Multiple shapefiles for the individual sequences in the alignment may be
specified as a comma separated list. An optional association of particular
shape files to a specific sequence in the alignment can be expressed by
prepending the sequence number to the filename, e.g.
"5=seq5.shape,3=seq3.shape" will assign the reactivity values
from file seq5.shape to the fifth sequence in the alignment, and the
values from file seq3.shape to sequence 3. If no assignment is specified,
the reactivity values are assigned to corresponding sequences in the order
they are given.
- --shapeMethod=D[mX][bY]
- Specify the method how to convert SHAPE reactivity data to pseudo energy
contributions
- (default=`D')
- Currently, the only data conversion method available is that of to Deigan
et al 2009. This method is the default and is recognized by a capital 'D'
in the provided parameter, i.e.: --shapeMethod="D" is the
default setting. The slope 'm' and the intercept 'b' can be set to a
non-default value if necessary. Otherwise m=1.8 and b=-0.6 as stated in
the paper mentionen before. To alter these parameters, e.g. m=1.9 and
b=-0.7, use a parameter string like this:
--shapeMethod="Dm1.9b-0.7". You may also provide only one
of the two parameters like: --shapeMethod="Dm1.9" or
--shapeMethod="Db-0.7".
- Select additional algorithms which should be included in the calculations.
The Minimum free energy (MFE) and a structure representative are
calculated in any case.
- -L, --maxBPspan=INT
- Set the maximum allowed separation of a base pair to span. I.e. no pairs
(i,j) with j-i>span will be allowed.
- (default=`70')
- --threshold=DOUBLE
- Energy threshold in kcal/mol per nucleotide above which secondary
structure hits are omitted in the output.
- (default=`-0.1')
- --mis
- Output "most informative sequence" instead of simple consensus:
For each column of the alignment output the set of nucleotides with
frequency greater than average in IUPAC notation.
- (default=off)
- -g, --gquad
- Incoorporate G-Quadruplex formation into the structure prediction
algorithm
- (default=off)
- -T, --temp=DOUBLE
- Rescale energy parameters to a temperature of temp C. Default is 37C.
- -4, --noTetra
- Do not include special tabulated stabilizing energies for tri-, tetra- and
hexaloop hairpins. Mostly for testing.
- (default=off)
- -d, --dangles=INT
- How to treat "dangling end" energies for bases adjacent to
helices in free ends and multi-loops
- (default=`2')
- With -d1 only unpaired bases can participate in at most one
dangling end. With -d2 this check is ignored, dangling energies
will be added for the bases adjacent to a helix on both sides in any case;
this is the default for mfe and partition function folding (-p).
The option -d0 ignores dangling ends altogether (mostly for
debugging). With -d3 mfe folding will allow coaxial stacking of
adjacent helices in multi-loops. At the moment the implementation will not
allow coaxial stacking of the two interior pairs in a loop of degree 3 and
works only for mfe folding.
- Note that with -d1 and -d3 only the MFE computations will be
using this setting while partition function uses -d2 setting, i.e.
dangling ends will be treated differently.
- --noLP
- Produce structures without lonely pairs (helices of length 1).
- (default=off)
- For partition function folding this only disallows pairs that can only
occur isolated. Other pairs may still occasionally occur as helices of
length 1.
- --noGU
- Do not allow GU pairs
- (default=off)
- --noClosingGU
- Do not allow GU pairs at the end of helices
- (default=off)
- -P, --paramFile=paramfile
- Read energy parameters from paramfile, instead of using the default
parameter set.
- Different sets of energy parameters for RNA and DNA should accompany your
distribution. See the RNAlib documentation for details on the file format.
When passing the placeholder file name "DNA", DNA parameters are
loaded without the need to actually specify any input file.
- --nsp=STRING
- Allow other pairs in addition to the usual AU,GC,and GU pairs.
- Its argument is a comma separated list of additionally allowed pairs. If
the first character is a "-" then AB will imply that AB and BA
are allowed pairs. e.g. RNAfold -nsp -GA will allow GA and
AG pairs. Nonstandard pairs are given 0 stacking energy.
- -e, --energyModel=INT
- Rarely used option to fold sequences from the artificial ABCD... alphabet,
where A pairs B, C-D etc. Use the energy parameters for GC (-e 1)
or AU (-e 2) pairs.
- --cfactor=DOUBLE
- Set the weight of the covariance term in the energy function
- (default=`1.0')
- --nfactor=DOUBLE
- Set the penalty for non-compatible sequences in the covariance term of the
energy function
- (default=`1.0')
- -R, --ribosum_file=ribosumfile
- use specified Ribosum Matrix instead of normal
- energy model. Matrixes to use should be 6x6
- matrices, the order of the terms is AU, CG, GC, GU, UA, UG.
- -r, --ribosum_scoring
- use ribosum scoring matrix. The matrix is chosen according to the minimal
and maximal pairwise identities of the sequences in the file.
- (default=off)
If you use this program in your work you might want to cite:
R. Lorenz, S.H. Bernhart, C. Hoener zu Siederdissen, H. Tafer, C.
Flamm, P.F. Stadler and I.L. Hofacker (2011), "ViennaRNA Package
2.0", Algorithms for Molecular Biology: 6:26
I.L. Hofacker, W. Fontana, P.F. Stadler, S. Bonhoeffer, M. Tacker,
P. Schuster (1994), "Fast Folding and Comparison of RNA Secondary
Structures", Monatshefte f. Chemie: 125, pp 167-188
R. Lorenz, I.L. Hofacker, P.F. Stadler (2016), "RNA folding
with hard and soft constraints", Algorithms for Molecular Biology 11:1
pp 1-13
I.L. Hofacker, B. Priwitzer, and P.F. Stadler (2004),
"Prediction of Locally Stable RNA Secondary Structures for Genome-Wide
Surveys", Bioinformatics: 20, pp 186-190
Stephan H. Bernhart, Ivo L. Hofacker, Sebastian Will, Andreas R.
Gruber, and Peter F. Stadler (2008), "RNAalifold: Improved consensus
structure prediction for RNA alignments", BMC Bioinformatics: 9, pp
474
The energy parameters are taken from:
D.H. Mathews, M.D. Disney, D. Matthew, J.L. Childs, S.J.
Schroeder, J. Susan, M. Zuker, D.H. Turner (2004), "Incorporating
chemical modification constraints into a dynamic programming algorithm for
prediction of RNA secondary structure", Proc. Natl. Acad. Sci. USA:
101, pp 7287-7292
D.H Turner, D.H. Mathews (2009), "NNDB: The nearest neighbor
parameter database for predicting stability of nucleic acid secondary
structure", Nucleic Acids Research: 38, pp 280-282
Ivo L Hofacker, Ronny Lorenz
If in doubt our program is right, nature is at fault. Comments should be sent to
rna@tbi.univie.ac.at.
Visit the GSP FreeBSD Man Page Interface. Output converted with ManDoc. |