Experimental Evaluation and Development of a Silver-Standard for the MIMIC-III Clinical Coding Dataset

Searle, T; Ibrahim, Z; Dobson, RJB; (2020) Experimental Evaluation and Development of a Silver-Standard for the MIMIC-III Clinical Coding Dataset. In: Proceedings of the 19th SIGBioMed Workshop on Biomedical Language Processing. (pp. 76-85). Association for Computational Linguistics (ACL): Online. Green open access

Experimental_Evaluation_and_Development_of_a_Silver_Standard_for_the_MIMIC_III_Clinical_Coding_Dataset.pdf - Accepted Version


Abstract

Clinical coding is currently a labour-intensive, error-prone, but critical administrative process whereby qualified staff manually assign codes to hospital patient episodes from large, standardised taxonomic hierarchies. Automating clinical coding has a long history in NLP research, and recent developments have set new state-of-the-art results. A popular dataset for this task is MIMIC-III, a large intensive care database that includes clinical free-text notes and associated codes. We argue for reconsidering the validity of MIMIC-III’s assigned codes, which are often treated as gold-standard even though MIMIC-III has not undergone secondary validation. This work presents an open-source, reproducible experimental methodology for assessing the validity of codes derived from EHR discharge summaries. We exemplify the methodology with MIMIC-III discharge summaries and show that the most frequently assigned codes in MIMIC-III are under-coded by up to 35%.

Type: Proceedings paper
Title: Experimental Evaluation and Development of a Silver-Standard for the MIMIC-III Clinical Coding Dataset
Event: 19th SIGBioMed Workshop on Biomedical Language Processing (BioNLP)
Location: Online
Dates: 09 July 2020
Open access status: An open access version is available from UCL Discovery
DOI: 10.18653/v1/2020.bionlp-1.8
Publisher version: https://www.aclweb.org/anthology/2020.bionlp-1.8/
Language: English
Additional information: ACL materials are Copyright © 1963–2020 ACL; other materials are copyrighted by their respective copyright holders. Materials prior to 2016 here are licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 International License. Permission is granted to make copies for the purposes of teaching and research. Materials published in or after 2016 are licensed on a Creative Commons Attribution 4.0 International License.
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > Institute of Health Informatics
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > Institute of Health Informatics > Clinical Epidemiology
URI: https://discovery-pp.ucl.ac.uk/id/eprint/10111316
