Veeravalli, Rini Suchitra Sundari;
(2024)
Data-Driven Exploration of Phenotypes in UK Electronic Health Records for Symptom Identification and Diagnosis of Rare or Monogenic Disease.
Doctoral thesis (Ph.D), UCL (University College London).
Abstract
This research explores the use of electronic health record data to improve the identification and diagnosis of rare or single-gene diseases in UK clinical practice. Phenotype data were analysed from two national databases covering primary and secondary care: the IQVIA Medical Research Database (IMRD) and the Genomics England 100,000 Genomes Project, respectively. The first study evaluated the performance of a novel Phenotype Risk Score (PheRS) approach to identifying patients with a range of rare diseases within UK secondary care. Overall, the low prevalence of rare diseases resulted in a low positive predictive value for the PheRS, and hence a higher proportion of false positives. To better understand earlier clinical diagnosis, the second study estimated the prevalence of several monogenic diseases within UK primary care. Prevalence was comparable to, but slightly lower than, European and worldwide estimates. Estimated age at first diagnosis indicated scope for earlier diagnosis. Capturing diagnoses in research is affected by data quality and by reliance on symptoms extracted from existing knowledge, which can be sparse for rare, monogenic diseases. Therefore, the third study performed a data-driven exploration of the most important features leading up to the presence or absence of a diagnosis in the IMRD for one example disease: Gilbert’s Syndrome (GS). Supervised machine learning feature selection identified jaundice symptoms as most important, consistent with clinical criteria for diagnosis. To investigate the direction of association of several important features, the fourth study estimated new symptom occurrence and time from symptom onset to diagnosis. Abdominal pain and fatigue appeared to be recorded symptoms of GS. Although they are not specific to GS, this suggests that more common features may help prompt earlier diagnosis when they appear alongside more GS-specific but uncommon features. Differences in diagnostic delay were observed by sex.
Further work to develop and validate findings may empower earlier identification of monogenic diseases in the UK using data-driven phenotype-based methods.
Type: | Thesis (Doctoral) |
---|---|
Qualification: | Ph.D |
Title: | Data-Driven Exploration of Phenotypes in UK Electronic Health Records for Symptom Identification and Diagnosis of Rare or Monogenic Disease |
Open access status: | An open access version is available from UCL Discovery |
Language: | English |
Additional information: | Copyright © The Author 2024. Original content in this thesis is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) Licence (https://creativecommons.org/licenses/by-nc/4.0/). Any third-party copyright material present remains the property of its respective owner(s) and is licensed under its existing terms. Access may initially be restricted at the author’s request. |
UCL classification: | UCL; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > Institute of Epidemiology and Health |
URI: | https://discovery-pp.ucl.ac.uk/id/eprint/10189024 |