UCL Discovery Stage

Identifying reports of randomized controlled trials (RCTs) via a hybrid machine learning and crowdsourcing approach

Wallace, BC; Noel-Storr, A; Marshall, IJ; Cohen, AM; Smalheiser, NR; Thomas, J; (2017) Identifying reports of randomized controlled trials (RCTs) via a hybrid machine learning and crowdsourcing approach. Journal of the American Medical Informatics Association, 24 (6) pp. 1165-1168. 10.1093/jamia/ocx053. Green open access

Text: ocx053.pdf - Published Version (183kB)

Abstract

OBJECTIVES: Identifying all published reports of randomized controlled trials (RCTs) is an important aim, but it requires extensive manual effort to separate RCTs from non-RCTs, even using current machine learning (ML) approaches. We aimed to make this process more efficient via a hybrid approach using both crowdsourcing and ML. METHODS: We trained a classifier to discriminate between citations that describe RCTs and those that do not. We then adopted a simple strategy of automatically excluding citations deemed very unlikely to be RCTs by the classifier and deferring to crowdworkers otherwise. RESULTS: Combining ML and crowdsourcing provides a highly sensitive RCT identification strategy (our estimates suggest 95%-99% recall) with substantially less effort (we observed a reduction of around 60%-80%) than relying on manual screening alone. CONCLUSIONS: Hybrid crowd-ML strategies warrant further exploration for biomedical curation/annotation tasks.
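
The triage rule described under METHODS can be illustrated with a minimal sketch. This is not the authors' implementation: the `Citation` type, the `rct_probability` and `crowd_label` callables, and the `exclude_below` threshold of 0.1 are all hypothetical names and values chosen for illustration. The abstract does not state the cutoff that was actually used.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class Citation:
    """Hypothetical minimal citation record (identifier plus title/abstract text)."""
    pmid: str
    text: str


def hybrid_screen(
    citations: Iterable[Citation],
    rct_probability: Callable[[Citation], float],  # assumed classifier score in [0, 1]
    crowd_label: Callable[[Citation], bool],       # assumed crowd decision: True = RCT
    exclude_below: float = 0.1,                    # illustrative threshold, not from the paper
) -> Tuple[List[Citation], int]:
    """Return the citations judged to be RCTs and the number sent to the crowd."""
    included: List[Citation] = []
    sent_to_crowd = 0
    for citation in citations:
        # Citations the classifier deems very unlikely to be RCTs are
        # excluded automatically, with no human effort spent on them.
        if rct_probability(citation) < exclude_below:
            continue
        # Everything else is deferred to crowdworkers for a decision.
        sent_to_crowd += 1
        if crowd_label(citation):
            included.append(citation)
    return included, sent_to_crowd
```

Under these assumptions, the reduction in manual effort is one minus the fraction of citations sent to the crowd, and recall depends on how few true RCTs score below `exclude_below`. Lowering the threshold trades effort savings for sensitivity.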

Type: Article
Title: Identifying reports of randomized controlled trials (RCTs) via a hybrid machine learning and crowdsourcing approach
Open access status: An open access version is available from UCL Discovery
DOI: 10.1093/jamia/ocx053
Publisher version: http://doi.org/10.1093/jamia/ocx053
Language: English
Additional information: © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
Keywords: Machine learning, evidence-based medicine, crowdsourcing, human computation, natural language processing
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Education
UCL > Provost and Vice Provost Offices > School of Education > UCL Institute of Education
UCL > Provost and Vice Provost Offices > School of Education > UCL Institute of Education > IOE - Social Research Institute
URI: https://discovery-pp.ucl.ac.uk/id/eprint/10039588
Downloads since deposit: 6,916
