Note: November 4, 2024
Description
The Genome in a Bottle (GIAB) Problematic Regions tracks provide stratifications of the
genome for evaluating variant calls in complex regions. They are designed for use with Global
Alliance for Genomics and Health (GA4GH) benchmarking tools such as hap.py and include regions
with low complexity, segmental duplications, functional regions, and difficult-to-sequence
areas. Developed in collaboration with GA4GH, the GIAB consortium, and the
Telomere-to-Telomere (T2T) Consortium, the dataset aims to standardize the analysis of genetic
variation by offering predefined BED files for stratifying true and false positives, so that
variant-calling accuracy can be assessed separately in complex areas of the genome.
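To illustrate what stratifying true and false positives by a BED file means in practice, here is a minimal Python sketch. It is not the hap.py implementation; the function names (parse_bed, in_regions, stratify_calls) and the input layout of (chromosome, 1-based position, label) tuples are assumptions made for this example. It relies only on the standard BED convention of 0-based, half-open intervals.

```python
from bisect import bisect_right
from collections import Counter


def parse_bed(lines):
    """Parse BED lines into {chrom: merged, sorted list of (start, end)}.

    BED coordinates are 0-based and half-open: [start, end).
    """
    raw = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and optional header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        raw.setdefault(chrom, []).append((int(start), int(end)))

    regions = {}
    for chrom, intervals in raw.items():
        merged = []
        for start, end in sorted(intervals):
            # Merge overlapping or touching intervals so lookups stay simple.
            if merged and start <= merged[-1][1]:
                merged[-1] = (merged[-1][0], max(merged[-1][1], end))
            else:
                merged.append((start, end))
        regions[chrom] = merged
    return regions


def in_regions(regions, chrom, pos):
    """Return True if a 1-based position (as in VCF) falls inside any interval."""
    intervals = regions.get(chrom, [])
    pos0 = pos - 1  # convert to 0-based to match BED
    # Index of the last interval whose start is <= pos0.
    i = bisect_right(intervals, (pos0, float("inf"))) - 1
    return i >= 0 and intervals[i][0] <= pos0 < intervals[i][1]


def stratify_calls(calls, regions):
    """Count labels (e.g. "TP", "FP") inside and outside the stratification.

    `calls` is an iterable of (chrom, pos, label) tuples; this layout is
    hypothetical and chosen only for the example.
    """
    counts = {"inside": Counter(), "outside": Counter()}
    for chrom, pos, label in calls:
        key = "inside" if in_regions(regions, chrom, pos) else "outside"
        counts[key][label] += 1
    return counts
```

Given a list of labeled calls and the lines of one stratification BED file, stratify_calls(calls, parse_bed(bed_lines)) returns separate label counts for calls inside and outside that stratification, which is the kind of per-region breakdown a benchmarking tool reports.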
Methods
The GIAB Problematic Regions tracks are built from stratification BED files generated by a
pipeline and its configuration. These files categorize genomic regions by the specific challenge
they pose, such as low complexity or difficult mapping, to support accurate benchmarking of
variant calls. For more information on the pipeline and configuration, see the README at:
https://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/giab/release/genome-stratifications/v3.5/README.md
Contact
If you have questions or comments, please write to Justin Zook ([email protected]).
References
Dwarshuis N, Kalra D, McDaniel J, Sanio P, Alvarez Jerez P, Jadhav B, Huang WE, Mondal R, Busby B,
Olson ND et al.
The GIAB genomic stratifications resource for human reference genomes.
Nat Commun. 2024 Oct 19;15(1):9029.
PMID: 39424793; PMC: PMC11489684
Krusche P, Trigg L, Boutros PC, Mason CE, De La Vega FM, Moore BL, Gonzalez-Porta M, Eberle MA,
Tezak Z, Lababidi S et al.
Best practices for benchmarking germline small-variant calls in human genomes.
Nat Biotechnol. 2019 May;37(5):555-560.
PMID: 30858580; PMC: PMC6699627