#include <biolibc/vcf.h>
-lbiolibc -lxtend
void bl_vcf_get_sample_ids(FILE *vcf_stream, char *sample_ids[],
                           size_t first_col, size_t last_col)
vcf_stream FILE pointer to the VCF input stream
sample_ids Array of character pointers to receive sample IDs
first_col First column from which a sample ID should be saved
last_col Last column from which a sample ID should be saved
Extract sample IDs from a VCF input header line. This is typically done
following bl_vcf_skip_header(3), which will leave the FILE pointer pointing to
the beginning of the header line, if one is present.
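To illustrate what extracting sample IDs from a header line involves, here is a minimal standard-C sketch that works on a header line already held in a string. It is not the biolibc implementation: sketch_get_sample_ids() and VCF_FIXED_COLS are hypothetical names used only here, and the sketch assumes the usual tab-separated VCF header in which nine fixed columns (#CHROM through FORMAT) precede the sample columns.

```c
#include <stdlib.h>
#include <string.h>

/* Assumption: #CHROM through FORMAT (9 columns) precede the sample columns */
#define VCF_FIXED_COLS 9

/*
 * Hypothetical stand-in, not the biolibc implementation: copy the IDs in
 * sample columns first_col..last_col (1-based, inclusive) of a
 * tab-separated VCF header line into sample_ids[0], sample_ids[1], ...
 * Returns the number of IDs stored; the caller frees each one.
 */
size_t sketch_get_sample_ids(const char *header_line, char *sample_ids[],
                             size_t first_col, size_t last_col)
{
    size_t col = 1, stored = 0;
    const char *field = header_line;

    while (*field != '\0' && *field != '\n') {
        /* Length of the current field, up to the next tab or end of line */
        size_t len = strcspn(field, "\t\n");

        if (col > VCF_FIXED_COLS) {
            size_t sample_col = col - VCF_FIXED_COLS;

            if (sample_col > last_col)
                break;                  /* Past the requested range */
            if (sample_col >= first_col) {
                char *id = malloc(len + 1);

                if (id == NULL)
                    break;              /* Out of memory: stop early */
                memcpy(id, field, len);
                id[len] = '\0';
                sample_ids[stored++] = id;
            }
        }

        /* Advance past this field and its trailing tab, if any */
        field += len;
        if (*field == '\t')
            ++field;
        ++col;
    }
    return stored;
}
```

The sample_ids array must have room for at least last_col - first_col + 1 pointers. With a header line containing samples S1, S2 and S3, a call with first_col 2 and last_col 3 stores "S2" and "S3".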
The arguments first_col and last_col represent the first and last
sample columns, both inclusive, from which sample IDs should be extracted. A
value of 1 represents the first sample column. This feature allows a VCF
file with many columns to be processed in multiple stages. For example, the
vcf-split tool, based on biolibc, cannot efficiently process more than about
10,000 samples at once, since each sample requires an open output file. A
VCF with 150,000 samples can be processed in 15 separate passes.
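The arithmetic behind multi-pass processing can be sketched as follows. The helper sketch_pass_range() is a hypothetical name, not part of biolibc; it only computes the inclusive, 1-based first_col and last_col values for a given pass, under the assumption that every pass except possibly the last covers the same number of samples.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of biolibc: compute the inclusive, 1-based
 * sample-column range covered by one pass.  pass is 0-based.  Returns 1 and
 * sets *first_col and *last_col if the pass covers at least one sample,
 * or 0 if the pass lies beyond the last sample.
 */
int sketch_pass_range(size_t total_samples, size_t max_per_pass, size_t pass,
                      size_t *first_col, size_t *last_col)
{
    size_t passes;

    if (max_per_pass == 0)
        return 0;

    /* Number of passes needed, rounding up */
    passes = total_samples / max_per_pass
             + (total_samples % max_per_pass != 0);
    if (pass >= passes)
        return 0;

    *first_col = pass * max_per_pass + 1;
    *last_col = *first_col + max_per_pass - 1;

    /* The final pass may cover fewer than max_per_pass samples */
    if (*last_col > total_samples)
        *last_col = total_samples;
    return 1;
}
```

For 150,000 samples and a limit of 10,000 per pass, pass 0 covers columns 1 through 10,000 and pass 14 covers columns 140,001 through 150,000. Each pair could then be supplied as first_col and last_col on a fresh read of the VCF stream.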