Heyl, Johannes;
Hardy, Flavien;
Tucker, Katie;
Hopper, Adrian;
Marcha, Maria J;
Liew, Ashley;
Reep, Judith;
... Gray, William K;
(2023)
Data quality and autism: Issues and potential impacts.
International Journal of Medical Informatics, 170, Article 104938. 10.1016/j.ijmedinf.2022.104938.
Text: 1-s2.0-S1386505622002520-main.pdf - Published Version (952kB)
Abstract
INTRODUCTION: Large healthcare datasets can provide insight that has the potential to improve outcomes for patients. However, it is important to understand the strengths and limitations of such datasets so that the insights they provide are accurate and useful. The aim of this study was to identify data inconsistencies within the Hospital Episode Statistics (HES) dataset for autistic patients and to assess the potential biases introduced through these inconsistencies and their impact on patient outcomes. The study can only identify inconsistencies in the recording of an autism diagnosis, not whether the inclusion or the exclusion of the autism diagnosis is the error.
METHODS: Data were extracted from the HES database for the period 1st April 2013 to 31st March 2021 for patients with a diagnosis of autism. First spells in hospital during the study period were identified for each patient and these were linked to any subsequent spell in hospital for the same patient. Data inconsistencies were recorded where autism was not recorded as a diagnosis in a subsequent spell. Features associated with data inconsistencies were identified using a random forest classifier and regression modelling.
RESULTS: Data were available for 172,324 unique patients who had been recorded as having an autism diagnosis on first admission. In total, 43.7 % of subsequent spells were found to have inconsistencies. The features most strongly associated with inconsistencies included greater age, greater deprivation, longer time since the first spell, change in provider, shorter length of stay, being female and a change in the main specialty description. The random forest algorithm had an area under the receiver operating characteristic curve of 0.864 (95 % CI [0.862 – 0.866]) in predicting a data inconsistency. For patients who died in hospital, inconsistencies in their final spell were significantly associated with being 80 years and over, being female, greater deprivation and use of a palliative care code in the death spell.
CONCLUSIONS: Data inconsistencies in the HES database were relatively common in autistic patients and were associated with a number of patient and hospital admission characteristics. Such inconsistencies have the potential to distort our understanding of service use in key demographic groups.
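The spell-linking step described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the record layout, the patient identifiers, the toy data and the function name `flag_inconsistencies` are all assumptions made for illustration, and the sketch covers only the flagging of inconsistencies, not the random forest or regression modelling.

```python
from collections import defaultdict
from datetime import date

# Hypothetical spell records: (patient_id, admission_date, autism_coded).
# The real HES extract has a different, much richer schema.
spells = [
    ("p1", date(2014, 5, 1), True),
    ("p1", date(2015, 2, 10), False),
    ("p1", date(2016, 7, 3), True),
    ("p2", date(2013, 6, 20), True),
    ("p2", date(2019, 1, 15), False),
    ("p3", date(2018, 3, 9), False),  # first spell lacks the autism code
    ("p3", date(2018, 9, 1), True),
]

# Study period as stated in the abstract.
STUDY_START = date(2013, 4, 1)
STUDY_END = date(2021, 3, 31)


def flag_inconsistencies(records):
    """Return (patient_id, admission_date, is_inconsistent) for each
    subsequent spell of patients whose first spell had autism coded."""
    by_patient = defaultdict(list)
    for pid, admitted, coded in records:
        if STUDY_START <= admitted <= STUDY_END:
            by_patient[pid].append((admitted, coded))

    flagged = []
    for pid, rows in by_patient.items():
        rows.sort()  # chronological order within each patient
        _, first_coded = rows[0]
        if not first_coded:
            # Assumption: the cohort is patients with autism recorded
            # on their first spell in the study period.
            continue
        for admitted, coded in rows[1:]:
            # Inconsistent when autism is missing in a later spell.
            flagged.append((pid, admitted, not coded))
    return flagged


flags = flag_inconsistencies(spells)
rate = sum(inconsistent for _, _, inconsistent in flags) / len(flags)
```

Here `rate` is the share of subsequent spells flagged as inconsistent in the toy data, the analogue of the 43.7 % figure reported in the abstract. As the abstract notes, a flag only marks a disagreement between spells; it does not say which record is wrong.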
Type: | Article |
---|---|
Title: | Data quality and autism: Issues and potential impacts |
Location: | Ireland |
Open access status: | An open access version is available from UCL Discovery |
DOI: | 10.1016/j.ijmedinf.2022.104938 |
Publisher version: | https://doi.org/10.1016/j.ijmedinf.2022.104938 |
Language: | English |
Additional information: | © 2022 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). |
UCL classification: | UCL; UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Maths and Physical Sciences; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Maths and Physical Sciences > Dept of Physics and Astronomy |
URI: | https://discovery-pp.ucl.ac.uk/id/eprint/10162355 |