Phylogenetic approaches to identifying fragments of the same gene, with application to the wheat genome

Piližota, I; Train, C-M; Altenhoff, A; Redestig, H; Dessimoz, C (2018) Phylogenetic approaches to identifying fragments of the same gene, with application to the wheat genome. Bioinformatics, Article bty772. 10.1093/bioinformatics/bty772. Green open access.

bty772.pdf - Published Version

Abstract

Motivation: As the time and cost of sequencing decrease, the number of available genomes and transcriptomes rapidly increases. Yet the quality of the assemblies and the gene annotations varies considerably and often remains poor, affecting downstream analyses. This is particularly true when fragments of the same gene are annotated as distinct genes, which may cause them to be mistaken for paralogs.

Results: In this study, we introduce two novel phylogenetic tests to infer non-overlapping or partially overlapping genes that are in fact parts of the same gene. One approach collapses branches with low bootstrap support and the other computes a likelihood ratio test. We extensively validated these methods by 1) introducing and recovering fragmentation on chromosome 3B of the bread wheat, Triticum aestivum cv. Chinese Spring; 2) applying the methods to the low-quality 3B assembly and validating predictions against the high-quality 3B assembly; and 3) comparing the performance of the proposed methods to that of existing methods, namely Ensembl Compara and ESPRIT. Applying this combination of methods to a draft shotgun assembly of the entire bread wheat genome revealed 1221 pairs of genes that are highly likely to be fragments of the same gene. Our approach demonstrates the power of fine-grained evolutionary inferences across multiple species to improve genome assemblies and annotations.

Availability: An open source software tool is available at https://github.com/DessimozLab/esprit2.

Supplementary information: Supplementary data are available at Bioinformatics online.
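For readers unfamiliar with the second test named in the abstract, the following is a minimal sketch of a generic likelihood ratio test in Python. It is not the authors' implementation: the function name `likelihood_ratio_test`, the choice of one degree of freedom, and the chi-square approximation of the null distribution are all assumptions made for illustration, and the hypotheses compared in the paper may be set up differently.

```python
import math


def likelihood_ratio_test(lnl_null, lnl_alt):
    """Generic likelihood ratio test (illustrative, hypothetical helper).

    lnl_null: maximised log-likelihood under the simpler (null) model.
    lnl_alt:  maximised log-likelihood under the more general model.

    Assumes the models are nested and differ by one free parameter, so the
    statistic is compared against a chi-square distribution with 1 degree
    of freedom. This is an assumption, not something stated in the abstract.
    """
    # Test statistic: twice the gain in log-likelihood, floored at zero
    # to guard against small numerical noise.
    statistic = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # Survival function of chi-square with 1 df, via the complementary
    # error function: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


if __name__ == "__main__":
    # Made-up log-likelihoods, for demonstration only.
    stat, p = likelihood_ratio_test(lnl_null=-1005.0, lnl_alt=-1000.0)
    print(f"statistic = {stat:.3f}, p-value = {p:.4g}")
```

A small p-value means the simpler model is rejected in favour of the more general one. How that outcome maps onto "same gene" versus "distinct genes" depends on how the two hypotheses are defined, which the abstract does not specify.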

Type: Article
Title: Phylogenetic approaches to identifying fragments of the same gene, with application to the wheat genome
Location: England
Open access status: An open access version is available from UCL Discovery
DOI: 10.1093/bioinformatics/bty772
Publisher version: https://doi.org/10.1093/bioinformatics/bty772
Language: English
Additional information: This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
UCL classification: UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Div of Biosciences > Genetics, Evolution and Environment
URI: https://discovery-pp.ucl.ac.uk/id/eprint/10057105