#include <biolibc/fastq.h>
-lbiolibc -lxtend
size_t bl_fastq_find_3p_low_qual(const bl_fastq_t *read, unsigned min_qual,
unsigned phred_base)
read FASTQ read to be searched
min_qual Minimum quality of bases to keep
phred_base Offset into the ASCII character set used to encode PHRED scores
(usually 33 for modern data)
Locate the start of a low-quality 3' end in a FASTQ read. This function uses
the same algorithm as fastq and cutadapt as of the time of writing. Namely, it
starts at the 3' end of the quality string and sums (base quality - minimum
quality) while moving in the 5' direction. This sum will be < 0 as long as
the average quality of the bases examined so far is < minimum quality. The
function also keeps track of where the minimum of this sum occurs. When the
sum becomes > 0, we have reached a point where the average quality of the 3'
end is satisfactory, and it is assumed that it will remain that way if we
continue in the 5' direction. (Illumina reads tend to drop in quality near
the 3' end.) The location of the minimum sum is then returned, since the
average quality of everything in the 5' direction must be satisfactory.
Index of the first low-quality base at the 3' end if found, index of the null
terminator otherwise
bl_fastq_t read;
size_t index;

/* Assumes read has already been initialized and populated */
index = bl_fastq_find_3p_low_qual(&read, 20, 33);
bl_fastq_3p_trim(&read, index);
bl_fastq_find_adapter_smart(3), bl_fastq_find_adapter_exact(3), bl_fastq_trim(3)