Blake, Nathan;
(2023)
Machine Learning Applied to Raman Spectroscopy to Classify Cancers.
Doctoral thesis (Ph.D), UCL (University College London).
Text: NB_Thesis_Final.pdf - Accepted Version (35MB)
Abstract
Cancer diagnosis is notoriously difficult, as is evident in the inter-rater variability between histopathologists classifying cancerous sub-types. Although there are many cancer pathologies, they have in common that earlier diagnosis would maximise treatment potential. To reduce this variability and expedite diagnosis, there has been a drive to arm histopathologists with additional tools. One such tool is Raman spectroscopy, which has demonstrated potential in distinguishing between various cancer types. However, Raman data has high dimensionality and often contains artefacts; together with the challenges inherent to medical data, this can frustrate classification attempts. Deep learning has recently emerged with the promise of unlocking many complex datasets, but it is not clear how this modelling paradigm can best exploit Raman data for cancer diagnosis. Three Raman oncology datasets (from ovarian, colonic and oesophageal tissue) were used to examine various methodological challenges to machine learning applied to Raman data, in conjunction with a thorough review of the recent literature. Performance on each dataset is assessed with two traditional models and one deep learning model. A technique is then applied to the deep learning model to aid interpretability and relate biochemical antecedents to disease classes. In addition, a clinical problem for each dataset was addressed, including the transferability of models developed using multi-centre Raman data taken on different spectrometers of the same make. Many subtleties of data processing were found to be important to the realistic assessment of machine learning models. In particular, appropriate cross-validation during hyperparameter selection, splitting data into training and test sets according to the inherent structure of biomedical data and addressing the number of samples per disease class are all found to be important factors.
Additionally, it was found that instrument correction was not needed to ensure system transferability if Raman data is collected with a common protocol on spectrometers of the same make.
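
As an illustration of what splitting "according to the inherent structure of biomedical data" might look like in practice, the following is a minimal sketch, not the thesis's own method. It assumes that the relevant structure is the patient each spectrum came from, so that no patient contributes spectra to both the training and the test set. The function name `group_train_test_split`, its parameters and the patient labels are all hypothetical.

```python
import random
from collections import defaultdict


def group_train_test_split(group_ids, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx) such that no group spans both sets.

    group_ids holds one label per sample (assumed here to be a patient ID).
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")

    # Collect the sample indices that belong to each group.
    members = defaultdict(list)
    for idx, gid in enumerate(group_ids):
        members[gid].append(idx)

    if len(members) < 2:
        raise ValueError("need at least two groups to split")

    # Shuffle whole groups rather than individual spectra.
    groups = sorted(members)
    random.Random(seed).shuffle(groups)

    # Hold out at least one group and keep at least one for training.
    n_test = min(max(1, round(len(groups) * test_fraction)), len(groups) - 1)

    test_idx = sorted(i for g in groups[:n_test] for i in members[g])
    train_idx = sorted(i for g in groups[n_test:] for i in members[g])
    return train_idx, test_idx


# Hypothetical example: eight spectra from four patients.
patients = ["p1", "p1", "p2", "p2", "p2", "p3", "p4", "p4"]
train_idx, test_idx = group_train_test_split(patients, test_fraction=0.25)
```

Splitting individual spectra at random would let spectra from one patient appear on both sides of the split, which tends to give optimistic test scores. Libraries such as scikit-learn provide group-aware splitters (for example `GroupShuffleSplit` and `GroupKFold`) that serve the same purpose.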
Type: | Thesis (Doctoral) |
---|---|
Qualification: | Ph.D |
Title: | Machine Learning Applied to Raman Spectroscopy to Classify Cancers |
Open access status: | An open access version is available from UCL Discovery |
Language: | English |
Additional information: | Copyright © The Author 2023. Original content in this thesis is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) Licence (https://creativecommons.org/licenses/by-nc/4.0/). Any third-party copyright material present remains the property of its respective owner(s) and is licensed under its existing terms. Access may initially be restricted at the author’s request. |
Keywords: | Raman Spectroscopy, Machine Learning, Cancer Diagnostics |
UCL classification: | UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Div of Biosciences |
URI: | https://discovery-pp.ucl.ac.uk/id/eprint/10174644 |



