Wallace, BC; Noel-Storr, A; Marshall, IJ; Cohen, AM; Smalheiser, NR; Thomas, J; (2017) Identifying reports of randomized controlled trials (RCTs) via a hybrid machine learning and crowdsourcing approach. Journal of the American Medical Informatics Association, 24 (6) pp. 1165-1168. 10.1093/jamia/ocx053.
Abstract
OBJECTIVES: Identifying all published reports of randomized controlled trials (RCTs) is an important aim, but it requires extensive manual effort to separate RCTs from non-RCTs, even using current machine learning (ML) approaches. We aimed to make this process more efficient via a hybrid approach using both crowdsourcing and ML. METHODS: We trained a classifier to discriminate between citations that describe RCTs and those that do not. We then adopted a simple strategy of automatically excluding citations deemed very unlikely to be RCTs by the classifier and deferring to crowdworkers otherwise. RESULTS: Combining ML and crowdsourcing provides a highly sensitive RCT identification strategy (our estimates suggest 95%-99% recall) with substantially less effort (we observed a reduction of around 60%-80%) than relying on manual screening alone. CONCLUSIONS: Hybrid crowd-ML strategies warrant further exploration for biomedical curation/annotation tasks.
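The triage rule described under METHODS can be illustrated with a minimal sketch. It is not the authors' implementation: the `Citation` type, the `triage` and `workload_reduction` functions, and the `exclude_below` threshold of 0.1 are all hypothetical, and the abstract does not state what cutoff or classifier score the study actually used.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Citation:
    citation_id: str        # hypothetical identifier, e.g. a PubMed ID
    rct_probability: float  # assumed classifier score in [0, 1]


def triage(
    citations: Iterable[Citation], exclude_below: float = 0.1
) -> Tuple[List[Citation], List[Citation]]:
    """Split citations into (auto-excluded, deferred to crowdworkers).

    A citation is excluded automatically only when the classifier rates it
    as very unlikely to be an RCT; everything else goes to the crowd.
    The default threshold is an illustrative assumption.
    """
    auto_excluded: List[Citation] = []
    for_crowd: List[Citation] = []
    for citation in citations:
        if citation.rct_probability < exclude_below:
            auto_excluded.append(citation)
        else:
            for_crowd.append(citation)
    return auto_excluded, for_crowd


def workload_reduction(
    auto_excluded: List[Citation], for_crowd: List[Citation]
) -> float:
    """Fraction of citations that no longer need manual screening."""
    total = len(auto_excluded) + len(for_crowd)
    return len(auto_excluded) / total if total else 0.0
```

In a setup like this, lowering `exclude_below` would send more citations to crowdworkers, trading effort for recall; raising it would do the opposite. The recall and effort figures reported in RESULTS come from the study itself and cannot be reproduced from this sketch.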
Type: | Article |
---|---|
Title: | Identifying reports of randomized controlled trials (RCTs) via a hybrid machine learning and crowdsourcing approach |
Open access status: | An open access version is available from UCL Discovery |
DOI: | 10.1093/jamia/ocx053 |
Publisher version: | http://doi.org/10.1093/jamia/ocx053 |
Language: | English |
Additional information: | © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
Keywords: | Machine learning, evidence-based medicine, crowdsourcing, human computation, natural language processing |
UCL classification: | UCL > Provost and Vice Provost Offices > School of Education > UCL Institute of Education > IOE - Social Research Institute |
URI: | https://discovery-pp.ucl.ac.uk/id/eprint/10039588 |