UCL Discovery Stage

Evaluation of methods for estimating coalescence times using ancestral recombination graphs

Brandt, Débora Y C; Wei, Xinzhu; Deng, Yun; Vaughn, Andrew H; Nielsen, Rasmus (2022) Evaluation of methods for estimating coalescence times using ancestral recombination graphs. Genetics, 221 (1), Article iyac044. 10.1093/genetics/iyac044. Green open access

final_Evaluation_of_methods_for_inference_of_ancestral_recombination_graphs.pdf - Accepted Version

Abstract

The ancestral recombination graph is a structure that describes the joint genealogies of sampled DNA sequences along the genome. Recent computational methods have made impressive progress toward scalably estimating whole-genome genealogies. In addition to inferring the ancestral recombination graph, some of these methods can also provide ancestral recombination graphs sampled from a defined posterior distribution. Obtaining good samples of ancestral recombination graphs is crucial for quantifying statistical uncertainty and for estimating population genetic parameters such as effective population size, mutation rate, and allele age. Here, we use standard neutral coalescent simulations to benchmark the estimates of pairwise coalescence times from 3 popular ancestral recombination graph inference programs: ARGweaver, Relate, and tsinfer+tsdate. We compare (1) the true coalescence times to the inferred times at each locus; (2) the distribution of coalescence times across all loci to the expected exponential distribution; (3) whether the sampled coalescence times have the properties expected of a valid posterior distribution. We find that inferred coalescence times at each locus are most accurate in ARGweaver, and often more accurate in Relate than in tsinfer+tsdate. However, all 3 methods tend to overestimate small coalescence times and underestimate large ones. Lastly, the posterior distribution of ARGweaver is closer to the expected posterior distribution than Relate's, but this higher accuracy comes at a substantial trade-off in scalability. The best choice of method will depend on the number and length of input sequences and on the goal of downstream analyses, and we provide guidelines for the best practices.
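Comparison (2) in the abstract can be illustrated with a minimal sketch. It assumes the textbook result that, under the standard neutral coalescent, pairwise coalescence times are exponentially distributed with mean 2N generations for a diploid effective population size N. The function names, the value of N, and the shrinkage used to imitate biased estimates are all hypothetical choices made for illustration; none of this is the authors' benchmarking code, and it does not call ARGweaver, Relate, or tsinfer+tsdate.

```python
import math
import random


def simulate_pairwise_times(n_loci, mean, seed=0):
    """Draw one pairwise coalescence time per locus from Exp(mean).

    Assumption: loci are treated as independent draws, which ignores
    the correlation between neighbouring genealogies in a real ARG.
    """
    rng = random.Random(seed)
    return [rng.expovariate(1.0 / mean) for _ in range(n_loci)]


def shrink_toward_mean(times, mean, weight=0.5):
    """Hypothetical stand-in for biased estimates: pull every time
    toward the mean, so small times are overestimated and large
    times are underestimated."""
    return [weight * t + (1.0 - weight) * mean for t in times]


def ks_statistic_exponential(times, mean):
    """Kolmogorov-Smirnov distance between the empirical CDF of
    `times` and the CDF of an exponential with the given mean."""
    xs = sorted(times)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        cdf = 1.0 - math.exp(-x / mean)
        # Compare against the empirical CDF just before and at x.
        d = max(d, abs(cdf - i / n), abs((i + 1) / n - cdf))
    return d


if __name__ == "__main__":
    N = 10_000                 # assumed diploid effective population size
    expected_mean = 2.0 * N    # expected pairwise coalescence time, in generations
    true_times = simulate_pairwise_times(5000, expected_mean, seed=1)
    biased_times = shrink_toward_mean(true_times, expected_mean)
    print("KS distance, simulated times:", ks_statistic_exponential(true_times, expected_mean))
    print("KS distance, shrunken times: ", ks_statistic_exponential(biased_times, expected_mean))
```

A small distance indicates that the times across loci are close to the expected exponential distribution; the shrunken times give a much larger distance because they cannot fall below a fixed fraction of the mean.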

Type: Article
Title: Evaluation of methods for estimating coalescence times using ancestral recombination graphs
Location: United States
Open access status: An open access version is available from UCL Discovery
DOI: 10.1093/genetics/iyac044
Publisher version: https://doi.org/10.1093/genetics/iyac044
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords: ARGweaver; Relate; ancestral recombination graph; calibration; simulation; tsdate; tsinfer; Algorithms; Alleles; Genetics, Population; Models, Genetic; Population Density; Recombination, Genetic
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Div of Biosciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Div of Biosciences > Genetics, Evolution and Environment
URI: https://discovery-pp.ucl.ac.uk/id/eprint/10164953
