HIV-1 Full-Genome Phylogenetics of Generalized Epidemics in Sub-Saharan Africa: Impact of Missing Nucleotide Characters in Next-Generation Sequences

Ratmann, O; Wymant, C; Colijn, C; Danaviah, S; Essex, M; Frost, S; Gall, A; ... Fraser, C (2017) HIV-1 Full-Genome Phylogenetics of Generalized Epidemics in Sub-Saharan Africa: Impact of Missing Nucleotide Characters in Next-Generation Sequences. AIDS Research and Human Retroviruses, 33 (11) pp. 1083-1098. 10.1089/aid.2017.0061. Green open access

aid.2017.0061.pdf - Published Version

Abstract

To characterize HIV-1 transmission dynamics in regions where the burden of HIV-1 is greatest, the “Phylogenetics and Networks for Generalised HIV Epidemics in Africa” consortium (PANGEA-HIV) is sequencing full-genome viral isolates from across sub-Saharan Africa. We report the first 3,985 PANGEA-HIV consensus sequences from four cohort sites (Rakai Community Cohort Study, n = 2,833; MRC/UVRI Uganda, n = 701; Mochudi Prevention Project, n = 359; Africa Health Research Institute Resistance Cohort, n = 92). Next-generation sequencing success rates varied: more than 80% of the viral genome from the gag to the nef genes could be determined for all sequences from South Africa, 75% of sequences from Mochudi, 60% of sequences from MRC/UVRI Uganda, and 22% of sequences from Rakai. Partial sequencing failure was primarily associated with low viral load, increased for amplicons closer to the 3′ end of the genome, was not associated with subtype diversity except HIV-1 subtype D, and remained significantly associated with sampling location after controlling for other factors. We assessed the impact of the missing data patterns in PANGEA-HIV sequences on phylogeny reconstruction in simulations. We found a threshold in terms of taxon sampling below which the patchy distribution of missing characters in next-generation sequences (NGS) has an excess negative impact on the accuracy of HIV-1 phylogeny reconstruction, which is attributable to tree reconstruction artifacts that accumulate when branches in viral trees are long. The large number of PANGEA-HIV sequences provides unprecedented opportunities for evaluating HIV-1 transmission dynamics across sub-Saharan Africa and identifying prevention opportunities. Molecular epidemiological analyses of these data must proceed cautiously because sequence sampling remains below the identified threshold and a considerable negative impact of missing characters on phylogeny reconstruction is expected.
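
The sequencing success criterion in the abstract (more than 80% of the genome from gag to nef determined) can be illustrated with a minimal sketch. It is not the authors' pipeline: the set of characters counted as missing, the region coordinates, the function names, and the toy sequences are all assumptions made for illustration.

```python
# Assumption: these characters mark undetermined positions in a consensus sequence.
MISSING = set("N?-")


def coverage_fraction(seq, start, end):
    """Fraction of positions in seq[start:end] that are resolved nucleotides."""
    region = seq[start:end].upper()
    if not region:
        raise ValueError("empty region")
    resolved = sum(1 for ch in region if ch not in MISSING)
    return resolved / len(region)


def share_above_threshold(seqs, start, end, threshold=0.8):
    """Share of sequences whose coverage of the region exceeds the threshold."""
    if not seqs:
        raise ValueError("no sequences given")
    passed = sum(1 for s in seqs if coverage_fraction(s, start, end) > threshold)
    return passed / len(seqs)


# Hypothetical toy sequences; real genomes span roughly 9,000 nucleotides.
toy = ["ACGTACGTAC", "ACGTNNNNAC", "NNNNNNNNAC"]
per_sequence = [coverage_fraction(s, 0, 10) for s in toy]
share = share_above_threshold(toy, 0, 10)
```

With real data, `start` and `end` would be the alignment coordinates of the gag start and nef end, which this sketch leaves to the caller.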

Type: Article
Title: HIV-1 Full-Genome Phylogenetics of Generalized Epidemics in Sub-Saharan Africa: Impact of Missing Nucleotide Characters in Next-Generation Sequences
Open access status: An open access version is available from UCL Discovery
DOI: 10.1089/aid.2017.0061
Publisher version: http://dx.doi.org/10.1089/aid.2017.0061
Language: English
Additional information: © Oliver Ratmann et al. 2017; Published by Mary Ann Liebert, Inc. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Keywords: Science & Technology, Life Sciences & Biomedicine, Immunology, Infectious Diseases, Virology, human immunodeficiency virus, phylogenomics, phylodynamics, HIV-1 molecular epidemiology, sub-Saharan Africa, PANGEA, MAXIMUM-LIKELIHOOD PHYLOGENIES, FISHING COMMUNITIES, INFECTIOUS-DISEASE, LAKE VICTORIA, UGANDA, TRANSMISSION, RISK, TREE, RECONSTRUCTION, INCONGRUENCE
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Medical Sciences > Div of Infection and Immunity
URI: https://discovery-pp.ucl.ac.uk/id/eprint/10038704