samtools merge - merges multiple sorted files into a single file
samtools merge [options] -o out.bam [options] in1.bam ... inN.bam
samtools merge [options] out.bam in1.bam ... inN.bam
Merge multiple sorted alignment files, producing a single sorted output file
that contains all the input records and maintains the existing sort order.
The output file can be specified via -o as shown in the
first synopsis. Otherwise the first non-option filename argument is taken to
be out.bam rather than an input file, as in the second synopsis.
There is no default; to write to standard output (or to a pipe), use either
“-o -” or the equivalent using “-”
as the first filename argument.
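As an illustration of writing to standard output, the following sketch pipes the merged BAM straight into a second samtools command. The function name count_merged is hypothetical, and the sketch assumes the inputs are already sorted consistently.

```shell
# Count the records in the merge of the given sorted BAM files without
# writing an intermediate file. Fails if samtools or an input is missing.
count_merged() {
    command -v samtools >/dev/null 2>&1 || { echo "samtools not found" >&2; return 1; }
    for f in "$@"; do
        [ -r "$f" ] || { echo "cannot read $f" >&2; return 1; }
    done
    # "-o -" sends the merged BAM to standard output; the second samtools
    # reads it from standard input ("-") and prints the record count.
    samtools merge -o - "$@" | samtools view -c -
}
```

It would be called as, for example, count_merged in1.bam in2.bam.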
If -h is specified the @SQ headers of input files will be
merged into the specified header, otherwise they will be merged into a
composite header created from the input headers. If in the process of
merging @SQ lines for coordinate sorted input files, a conflict arises as to
the order (for example input1.bam has @SQ for a,b,c and input2.bam has
b,a,c) then the resulting output file will need to be re-sorted back into
coordinate order.
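One way to spot such a conflict before merging is to compare the @SQ order of the input headers. The sketch below is an assumption-laden helper, not part of samtools: the function names are hypothetical, and it requires identical @SQ lists, which is stricter than merge itself needs.

```shell
# Print the @SQ sequence names (SN field) from SAM header text on stdin,
# one per line, in header order.
sq_names() {
    awk -F'\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SN:/) print substr($i, 4)
    }'
}

# Succeed only if two header files list the same sequences in the same order.
same_sq_order() {
    [ "$(sq_names < "$1")" = "$(sq_names < "$2")" ]
}
```

The header files can be produced with "samtools view -H input1.bam > h1.sam" and compared with "same_sq_order h1.sam h2.sam".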
Unless the -c or -p flags are specified, any @RG or @PG ID that duplicates
an ID already present in the output header has a suffix appended when its
record is merged. The suffix differentiates it from similar header records
from other files, and the read records are updated to use the new ID.
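To see in advance which @RG IDs would collide, the headers of the inputs can be scanned for repeated IDs. This is a minimal sketch; duplicate_rg_ids is a hypothetical helper, and the header text in the demonstration is made up.

```shell
# Print @RG IDs that occur more than once in the SAM header text on stdin.
duplicate_rg_ids() {
    awk -F'\t' '$1 == "@RG" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^ID:/) count[substr($i, 4)]++
    }
    END { for (id in count) if (count[id] > 1) print id }' | sort
}

# Demonstration on invented header lines: lane1 appears twice.
printf '@RG\tID:lane1\tSM:a\n@RG\tID:lane1\tSM:b\n@RG\tID:lane2\tSM:a\n' |
    duplicate_rg_ids
```

With real files, the input would come from something like: for f in in1.bam in2.bam; do samtools view -H "$f"; done | duplicate_rg_ids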
The ordering of the records in the input files must match the
usage of the -n and -t command-line options. If they do not,
the output order will be undefined. See sort for information about
record ordering.
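The sort order that an input claims can be read from the SO field of its @HD header line. The sketch below maps that value to the matching merge flag. merge_order_flag is a hypothetical helper, tag-sorted input (-t) is not covered, and the header only records what the producing tool declared.

```shell
# Read SAM header text on stdin and print the merge flag suggested by the
# SO field of the @HD line: "-n" for queryname, an empty line for coordinate.
# Any other value (or a missing @HD line) is reported as an error.
merge_order_flag() {
    so=$(awk -F'\t' '$1 == "@HD" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SO:/) { print substr($i, 4); exit }
    }')
    case "$so" in
        coordinate) printf '\n' ;;
        queryname)  printf '%s\n' '-n' ;;
        *)          echo "unsupported or missing sort order: ${so:-none}" >&2
                    return 1 ;;
    esac
}
```

A typical call would be "samtools view -H in1.bam | merge_order_flag", repeated for each input to confirm that they all agree.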
- -1
- Use Deflate compression level 1 to compress the output.
- -b FILE
- List of input BAM files, one file per line.
- -f
- Overwrite the output file if it already exists.
- -h FILE
- Use the lines of FILE as `@' headers to be copied to
out.bam, replacing any header lines that would otherwise be copied
from in1.bam. (FILE is actually in SAM format, though any
alignment records it may contain are ignored.)
- -n
- The input alignments are sorted by read names rather than by chromosomal
coordinates.
- -o FILE
- Write merged output to FILE, specifying the filename via an option
rather than as the first filename argument. When -o is used, all
non-option filename arguments specify input files to be merged.
- -t TAG
- The input alignments have been sorted by the value of TAG, then by either
position or name (if -n is given).
- -R STR
- Merge files in the specified region indicated by STR [null]
- -r
- Attach an RG tag to each alignment. The tag value is inferred from file
names.
- -u
- Uncompressed BAM output
- -c
- When several input files contain @RG headers with the same ID, emit only
one of them (namely, the header line from the first file we find that ID
in) to the merged output file. Combining these similar headers is usually
the right thing to do when the files being merged originated from the same
file.
Without -c, all @RG headers appear in the output file,
with random suffixes added to their IDs where necessary to differentiate
them.
- -p
- Similarly, for each @PG ID in the set of files to merge, use the @PG line
of the first file we find that ID in rather than adding a suffix to
differentiate similar IDs.
- -X
- If this option is set, the user can specify custom index file
location(s) when the data folder does not contain any index file. See the
EXAMPLES section for sample usage.
- -L FILE
- BED file specifying multiple regions on which the merge will be
performed. This option extends the usage of the -R option and cannot be
used concurrently with it.
- --no-PG
- Do not add a @PG line to the header of the output file.
- -@, --threads INT
- Number of input/output compression threads to use in addition to the main
thread [0].
- Attach the RG tag while merging sorted alignments:
perl -e 'print "@RG\tID:ga\tSM:hs\tLB:ga\tPL:Illumina\n@RG\tID:454\tSM:hs\tLB:454\tPL:454\n"' > rg.txt
samtools merge -rh rg.txt merged.bam ga.bam 454.bam
The value of an RG tag is determined by the name of the file
the read comes from. In this example, in merged.bam,
reads from ga.bam are tagged RG:Z:ga, while reads
from 454.bam are tagged RG:Z:454.
- Include customized index file as a part of arguments:
samtools merge [options] -X <out.bam> </data_folder/in1.bam> [</data_folder/in2.bam> ... </data_folder/inN.bam>] </index_folder/index1.bai> [</index_folder/index2.bai> ... </index_folder/indexN.bai>]
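As a further illustration, the -b option can take its inputs from a list file. The sketch below writes such a list and prints the command that would use it. The file names are placeholders and the thread count is arbitrary.

```shell
# Write a list of inputs, one per line, for use with -b.
# The BAM file names are placeholders.
list=$(mktemp)
printf '%s\n' in1.bam in2.bam in3.bam > "$list"

# Show the command that would use the list; remove "echo" to run it.
# -f overwrites merged.bam if it exists, -@ 4 adds four compression threads.
echo samtools merge -@ 4 -f -o merged.bam -b "$list"
```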
Written by Heng Li from the Sanger Institute.
samtools(1), samtools-sort(1), sam(5)
Samtools website: <http://www.htslib.org/>