NAME
    PEAK-CLASSIFIER - Classify peaks in a BED file according to features in a GFF

SYNOPSIS
    peak-classifier [--upstream-boundaries pos[,pos...]] \
        [--min-peak-overlap x.y] [--min-gff-overlap x.y] [--midpoints] \
        peaks.bed features.gff3 overlaps.tsv

PURPOSE
    Peak-classifier classifies all peaks in the given BED file according to
    features found in the provided GFF. Peaks are typically called from
    ChIP/ATAC-Seq experiments using tools such as MACS2.

OPTIONS
DESCRIPTION
    Features include all those explicitly named in the GFF, as well as
    introns, which are computed as the regions between the given exons, and
    promoters, which are regions just upstream of the TSS.

    By default, promoter regions are generated for 1-1000 bases, 1001-10000
    bases, and 10001-100000 bases upstream from the TSS. After generating a
    BED file containing all GFF features plus the generated ones, bedtools
    intersect is used to determine the overlaps.

    All overlaps between peaks and GFF features are reported in the output
    TSV (tab-separated values) file. In many cases, a peak overlaps two or
    more adjacent features, in which case one line of output is generated
    for each overlap.

    The output file contains the location of the peak in the first three
    columns, followed by the location, name, and strand of the GFF feature,
    and finally the number of bases of overlap between the two. The file
    does not conform to any standard format, though the first three columns
    follow BED file format and the 4th and 5th columns use BED coordinates
    (0-based, end coordinate is 1 past the last base in the feature).

    #Chr  P-start  P-end    F-start  F-end    F-name          Strand  Overlap
    1     3119722  3120223  3043475  3133475  upstream100000  +       501
    1     3119722  3120223  3072238  3162238  upstream100000  +       501
    1     3121255  3121756  3043475  3133475  upstream100000  +       501
    1     3121255  3121756  3072238  3162238  upstream100000  +       501
    1     3167069  3167570  3162238  3171238  upstream10000   +       501
    1     3203860  3204361  -1       -1       intergenic      .       501
    1     3292373  3293369  3222979  3312979  upstream100000  +       996
    1     3292373  3293369  3276123  3741721  gene            -       996
    1     3292373  3293369  3284704  3741721  mRNA            -       996
    1     3292373  3293369  3287191  3491924  intron          -       996
    1     3297187  3297998  3222979  3312979  upstream100000  +       811

    Output can be further processed by filter-overlaps(1) to gather
    information on features of interest.

SEE ALSO
    filter-overlaps(1), feature-view(1), bedtools, MACS2, DESeq2

BUGS
    Please report bugs to the author and send patches in unified diff format.
    (man diff for more information)

AUTHOR
    J. Bacon
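
EXAMPLES
    The derived features described in DESCRIPTION (promoter bins upstream of
    the TSS, and introns as gaps between exons) can be illustrated with a
    short Python sketch. This is not peak-classifier's own code: the function
    names promoter_regions and introns are hypothetical, and the handling of
    the minus strand and of overlapping exons is an assumption. Coordinates
    are BED-style (0-based start, end 1 past the last base), and bins are
    named after their outer boundary, as in the sample output above.

```python
from typing import List, Sequence, Tuple

# BED-style interval: 0-based start, end is 1 past the last base.
Interval = Tuple[int, int]


def promoter_regions(
    tss: int,
    strand: str,
    boundaries: Sequence[int] = (1000, 10000, 100000),
) -> List[Tuple[str, int, int]]:
    """Return (name, start, end) bins upstream of a TSS.

    tss is the BED coordinate of the gene's 5' edge: the gene start on the
    "+" strand, the gene end on the "-" strand (an assumption of this sketch).
    """
    regions = []
    previous = 0  # inner edge of the current bin, as a distance from the TSS
    for boundary in boundaries:
        if strand == "+":
            # Upstream means lower coordinates on the forward strand.
            start, end = tss - boundary, tss - previous
        else:
            # Upstream means higher coordinates on the reverse strand.
            start, end = tss + previous, tss + boundary
        # Clip bins that would run off the start of the chromosome.
        start, end = max(start, 0), max(end, 0)
        if end > start:
            regions.append((f"upstream{boundary}", start, end))
        previous = boundary
    return regions


def introns(exons: Sequence[Interval]) -> List[Interval]:
    """Return the gaps between the exons of a single transcript."""
    ordered = sorted(exons)
    gaps: List[Interval] = []
    if not ordered:
        return gaps
    covered_end = ordered[0][1]  # furthest exon end seen so far
    for start, end in ordered[1:]:
        if start > covered_end:
            gaps.append((covered_end, start))
        covered_end = max(covered_end, end)
    return gaps


if __name__ == "__main__":
    # 3172238 is an inferred TSS, chosen so the bins line up with the
    # upstream10000 and upstream100000 rows in the sample output.
    for name, start, end in promoter_regions(3172238, "+"):
        print(name, start, end)
    print(introns([(0, 10), (20, 30), (50, 60)]))
```

    With the default boundaries, the bins are 1000, 9000, and 90000 bases
    wide, which matches the widths of the upstream10000 and upstream100000
    features in the sample output.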