MIGRATE(1)              FreeBSD General Commands Manual              MIGRATE(1)
MIGRATE - estimate population parameters: migration rate and population size
Migrate estimates population parameters (effective population sizes and migration
rates) from genetic data (electrophoretic markers, microsatellite markers,
sequence data, and single nucleotide polymorphism data). It is a maximum
likelihood or Bayesian estimator that uses a coalescent theory approach,
taking into account the history of mutations and the uncertainty of the
genealogy.
A copy of the manual in PDF format is available from
http://popgen.scs.fsu.edu
There are no options on the command line, but you can specify the options in a
parmfile or in the menu.
The parmfile options are split into Datatype, Input/Output, Start parameters,
and Search strategy.
- datatype=<Allele | Microsatellites | Brownian | Sequences |
Nucleotide-polymorphisms | Panel-SNP | Genealogies >
- specifies the datatype used for the analysis. If the data do not match the
chosen type, the program will crash.
Allele: infinite allele model, suitable for electrophoretic markers;
perhaps the "best" guess for codominant markers for which we do not know
the mutation model.
Microsatellite: a simple electrophoretic ladder model is used for the
change along the branches in the genealogy.
Brownian: a Brownian motion approximation to the stepwise mutation model
for microsatellites is used. This is MUCH faster than the exact model, but
it is not a good approximation if population sizes are small (say below 10).
Sequences: the data are DNA or RNA sequences and the mutation model used
is F84, first used by Felsenstein 1984 (the same model as in dnaml, Phylip
version 3.5). A description of this model can be found in Swofford et al.
1996.
Nucleotide-polymorphism: [SNP] the data likelihood is corrected for
sampling only variable sites. We assume that the data were used to find
the SNPs.
Panel-SNP: the data likelihood is corrected for using a panel of SNP
sites that were polymorphic. The panel has to be population 1.
Genealogies: reads the sumfile of a previous run. With this option the
genealogy sampling step is skipped and the genealogies provided in the
sumfile are analyzed. This datatype makes it easy to rerun the program for
a different likelihood ratio test or with different settings for the
profile likelihood printouts.
- freq-from-data=< Yes | No:freqA freqG freqC freqT>
- ttratio=< r1 r2 .....>
- interleaved=<Yes | No >
- categories=<Yes | No>
- If you specify Yes you need a file named catfile in the same directory, with
the following syntax for each locus: number_of_categories cat1 cat2 cat3 ..
categorylabel_for_each_site. A # in the first column starts a comment line.
The example below is for a data set with 2 loci of 20 base pairs each:
# Example catfile for two loci
# in migrate you can use # as comments
2 1 10 11111111112222222222
5 0.1 2 5 23 3 11111122223333445555
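The catfile syntax described above can be read with a few lines of Python. This is only a sketch for checking a catfile before a run, not part of migrate: the function name parse_catfile is hypothetical, and it assumes that the category labels are single digits counting from 1, as in the example.

```python
def parse_catfile(text):
    """Parse catfile text into one (rates, labels) pair per locus."""
    loci = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines that start with '#'.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        n_categories = int(fields[0])
        rates = [float(x) for x in fields[1:1 + n_categories]]
        if len(rates) != n_categories:
            raise ValueError("fewer rates than number_of_categories")
        # Everything after the rates is the per-site category string.
        labels = "".join(fields[1 + n_categories:])
        # Assumption: labels are single digits in the range 1..n_categories.
        for ch in labels:
            if not ch.isdigit() or not 1 <= int(ch) <= n_categories:
                raise ValueError("bad category label: " + ch)
        loci.append((rates, labels))
    return loci


EXAMPLE = """\
# Example catfile for two loci
# in migrate you can use # as comments
2 1 10 11111111112222222222
5 0.1 2 5 23 3 11111122223333445555
"""

for rates, labels in parse_catfile(EXAMPLE):
    print(len(labels), "sites, rates:", rates)
```

Checking that each label string has as many characters as the locus has sites catches the most common mistake before the program is started.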
- rates=< n : r1 r2 r3 ..rn>
- prob-rates=< n : p1 p2 p3 ... pn>
- autocorrelation=<Yes:value | No>
- weights=<Yes | No>
- If you specify Yes you need a file named weightfile with a weight for each
site. The weights can be the digits 0-9 and the letters A-Z, so you have
35 possible weights available.
# Example weightfile for two loci
11111111112222222222
1111112222AAAA445XXXX5
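A short Python sketch shows how such weight characters could be turned into numbers. It assumes the PHYLIP convention in which the letters continue after the digits (A=10 up to Z=35); this page does not state the mapping, so treat it as an assumption. The name decode_weights is hypothetical.

```python
import string

# Assumption (PHYLIP convention, not stated in this page):
# '0'..'9' mean 0..9 and 'A'..'Z' mean 10..35.
WEIGHT_SYMBOLS = string.digits + string.ascii_uppercase


def decode_weights(line):
    """Convert one weightfile line into a list of integer site weights."""
    weights = []
    for ch in line.strip():
        value = WEIGHT_SYMBOLS.find(ch.upper())
        if value < 0:
            raise ValueError("not a valid weight character: " + repr(ch))
        weights.append(value)
    return weights


print(decode_weights("11111111112222222222"))
```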
- distfile=<Yes | No>
- You can supply a distance file for each locus (using PHYLIP syntax). The
individuals must be in the same order as in the infile. This option appears
in the menu when you choose
0 Start genealogy is estimated using a UPGMA topology
The distance file is then used to create a UPGMA tree with a minimal number
of migration events. For large trees this option helps to get better starting
trees than the automatic tree generation, which uses a rather unsophisticated
distance method (differences).
- usertree=<Yes | No>
- If you specify Yes you need a file intree. In this file you have starting
trees for each locus. BUT these trees need to have migration events in
them!
- micro-threshold=value
- specifies the window in which probabilities of change are calculated. If we
have allele 34, then only probabilities of a change from 34 to 35-44 and
24-34 are considered. The higher this value is, the longer you wait for your
result; choosing it too small will produce wrong results. Default is
micro-threshold=10
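The window can be illustrated with a small Python sketch. It only reproduces the arithmetic of the example above (allele 34 with the default threshold of 10); the function name allele_window is hypothetical and the code is not taken from migrate.

```python
def allele_window(allele, micro_threshold=10):
    """Return the lowest and highest allele state considered for a change."""
    lowest = allele - micro_threshold
    highest = allele + micro_threshold
    return lowest, highest


low, high = allele_window(34)
print("states considered for allele 34:", low, "to", high)
# The number of states grows linearly with the threshold, which is one
# reason a larger micro-threshold makes a run take longer.
print("number of states:", high - low + 1)
```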
Similar to sequence data.
- infile=filename
- Default is infile
- random-seed=<Auto | Noauto | Own:seedvalue>
- The random number seed guarantees that you can reproduce a run exactly.
Good random number seeds have the form (value * 4) + 1. If you do not
specify the random number seed ( random-seed=Auto ) the program will use
the system clock. With random-seed=Noauto the program expects to find a
file named seedfile with the random number seed. With
random-seed=Own:seedvalue you can specify the seed value in the
parmfile (or in the menu).
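As a sketch of the (value * 4) + 1 rule, the following Python helper rounds any non-negative integer up to the next number of that form and prints the matching parmfile line. The helper names are hypothetical.

```python
def nearest_good_seed(value):
    """Return the smallest integer >= value of the form 4*n + 1."""
    return value + (1 - value) % 4


def seed_option(value):
    """Build a random-seed line that uses a seed of the form 4*n + 1."""
    return "random-seed=Own:" + str(nearest_good_seed(value))


print(seed_option(10))
```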
- title=titletext
- progress=<Yes|No|Verbose>
- The default is progress=Yes
- outfile=filename
- The default is outfile=outfile
- print-data=<Yes|No>
- Print the data in the outfile. Default is print-data=No.
- print-fst=<Yes|No>
- Print a table of an FST estimate for comparison (Beerli and Felsenstein
1999, Beerli 1998) [not recommended].
- plot=<No | Yes>[:<Outfile|Both>[:<std|log>:{mig-axis-start,mig-axis-end,theta-axis-start,theta-axis-end}<:printpos<M | Nm>>]]
- If plot=No, no plot of the parameter space is shown in the outfile. If
plot=Yes, you can specify whether you want only the ASCII-graphics plot in
the outfile, or also the accurate numbers in a separate file ( mathfile ),
calculated at printpos "pixels" in each direction. The last option ( M or
N ) lets you define whether you want the plot in M=m/mu or (default) 4Nm
units. Default is plot=Yes:Outfile. Example of a more complicated
statement: plot=Yes:Both:std:0,10,0,0.025:100N
For the syntax of the mathfile, see the documentation.
- profile=<No|Yes<:<Fast|Percentile|Spline|Discrete|Quick>><:M | Nm >
- Print profile likelihood. See section Likelihood ratio tests and profile
likelihood. Default is profile=Yes:Fast:N.
- l-ratio=<None | <Mean|Loci>:testparam> (N-POP)
- Likelihood ratio tests. See section Likelihood ratio tests and profile
likelihood. Default is l-ratio=None.
- print-trees=<All | None | Last | Best>
- Default is print-trees=None
- mathfile=filename
- sumfile=<No | Yes | Yes:filename >
- Intermediate results of the genealogy sampling process are saved into a
file named sumfile, or into the file whose name you specify. You can use
this sumfile to rerun the program for further analysis, e.g. calculating
likelihood ratios or profile likelihoods; see datatype=Genealogies.
- theta=<Fst | Own:{value1,value2 ,...}>
- With Fst the program tries to use an FST based measure (Maynard Smith
1970, Nei and Feldman 1972). Own:{value1,value2,...} defines arbitrary
start values.
- migration=<Fst|Own:Migration matrix > (N-POP)
- The migration matrix is an n by n table with - on the diagonal. For four
populations it can look like this:
migration=OWN:{ - 1.0 1.1 1.2 0.9 - 0.8 0.7 2.1 2.2 - 2.3 1.4 1.5 1.6 - }
or like this:
migration=OWN:{ - 1.0 1.1 1.2
0.9 - 0.8 0.7
2.1 2.2 - 2.3
1.4 1.5 1.6 - }
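For more than a few populations it can be easier to generate the matrix text than to type it. The Python sketch below writes the multi-line form shown above from a nested list; the function name format_migration is hypothetical, and diagonal entries of the input are ignored and written as -.

```python
def format_migration(matrix):
    """Format a square matrix of start values as a migration=OWN:{...} entry."""
    n = len(matrix)
    rows = []
    for i, row in enumerate(matrix):
        if len(row) != n:
            raise ValueError("the migration matrix must be n by n")
        # The diagonal is always written as '-', whatever the input holds.
        cells = ["-" if i == j else str(value) for j, value in enumerate(row)]
        rows.append(" ".join(cells))
    return "migration=OWN:{ " + "\n".join(rows) + " }"


start_values = [
    [None, 1.0, 1.1, 1.2],
    [0.9, None, 0.8, 0.7],
    [2.1, 2.2, None, 2.3],
    [1.4, 1.5, 1.6, None],
]
print(format_migration(start_values))
```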
- mutation=<Gamma | NoGamma>
- The default is mutation=Nogamma
- fst-type=<Theta | Migration >
- custom-migration=< NONE|migration - matrix >
- The migration matrix contains the migration rates from j to i on row i,
and the theta values are on the diagonal. The migration matrix can consist
of connections that are
*: no restriction
0: not estimated
m: mean value of either 4Nm or M.
s: symmetric migration [only for M]
c: constant value (together with migration=OWN.. or theta=OWN..)
The values can be separated by blanks or newlines. A few examples for 4
populations:
Full model: custom-migration={**** **** **** ****}
N-island model: custom-migration={mmmm mmmm mmmm mmmm}
Stepping Stone model with symmetric migrations and unrestricted
estimates: custom-migration={*s00 s*s0 0s*s 00s*}
Source-Sink (the first population is the source):
custom-migration={*000**000**0*000}
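The regular models above follow simple patterns, so their matrices can be generated for any number of populations. The Python sketch below builds the full, N-island, and symmetric stepping stone matrices; the function names are hypothetical, and the stepping stone version assumes the populations are arranged in a line, as in the 4-population example.

```python
def full_model(n):
    """Every theta and every migration rate is estimated without restriction."""
    return "{" + " ".join("*" * n for _ in range(n)) + "}"


def n_island_model(n):
    """Every entry is set to the mean value."""
    return "{" + " ".join("m" * n for _ in range(n)) + "}"


def stepping_stone_model(n):
    """Symmetric migration between neighbours only, populations in a line."""
    rows = []
    for i in range(n):
        row = ""
        for j in range(n):
            if i == j:
                row += "*"   # diagonal entry: no restriction
            elif abs(i - j) == 1:
                row += "s"   # neighbouring populations: symmetric migration
            else:
                row += "0"   # all other connections: not estimated
        rows.append(row)
    return "{" + " ".join(rows) + "}"


print("custom-migration=" + stepping_stone_model(4))
```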
Please read the documentation; these settings are important and will influence
the accuracy of your results.
- short-chains=value
- Default is 10.
- short-inc=value
- Default is 20.
- short-sample=value
- Default is 500.
- long-chains=value
- Default is 2.
- long-inc=value
- Default is 20.
- long-sample=value
- Default is 5000.
- burn-in=value
- Default is 10000.
- replicate=<NO | YES<:LONGCHAINS | number>>
- heating=<NO | YES<:{1,1.1,1.2,1.3}>>
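The search strategy defaults listed above can be collected in one place and written out as option lines. The Python sketch below does that; it assumes that the parmfile holds one option=value pair per line, which this page suggests but does not state, and the names SEARCH_DEFAULTS and search_strategy_lines are hypothetical.

```python
# Default values as listed in this man page.
SEARCH_DEFAULTS = {
    "short-chains": 10,
    "short-inc": 20,
    "short-sample": 500,
    "long-chains": 2,
    "long-inc": 20,
    "long-sample": 5000,
    "burn-in": 10000,
}


def search_strategy_lines(overrides=None):
    """Return option=value lines, with overrides replacing the defaults."""
    settings = dict(SEARCH_DEFAULTS)
    for key, value in (overrides or {}).items():
        if key not in settings:
            raise KeyError("unknown search strategy option: " + key)
        settings[key] = value
    return [key + "=" + str(value) for key, value in settings.items()]


# Assumption: one option per line in the parmfile.
print("\n".join(search_strategy_lines({"long-chains": 3})))
```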
This man page is not up to date and lacks the Bayesian inference section; see
the documentation.
http://popgen.csit.fsu.edu
coalesce, fluctuate, recombine, lamarc (the program) available from
http://evolution.gs.washington.edu/lamarc.html
Peter Beerli <beerli@csit.fsu.edu>
- [if you use this man page, please let me know]