Citation screening using crowdsourcing and machine learning produced accurate results: Evaluation of Cochrane's modified Screen4Me service

Anna Noel-Storr; Gordon Dooley; Lisa Affengruber; Gerald Gartlehner

Noel-Storr, A., Dooley, G., Affengruber, L., & Gartlehner, G. (2021). Citation screening using crowdsourcing and machine learning produced accurate results: Evaluation of Cochrane's modified Screen4Me service. Journal of Clinical Epidemiology, 130, 23-31. https://doi.org/10.1016/j.jclinepi.2020.09.024

Abstract

Objectives: To assess the feasibility of a modified workflow that uses machine learning and crowdsourcing to identify studies for potential inclusion in a systematic review.

Study Design and Setting: This was a substudy of a larger randomized study; the main study sought to assess the performance of single screening of search results versus dual screening. This substudy assessed the performance of a modified version of Cochrane's Screen4Me workflow, which uses crowdsourcing and machine learning, in identifying relevant randomized controlled trials (RCTs) for a published Cochrane review. We included participants who had signed up for the main study but who were not eligible to be randomized to the two main arms of that study. The records were put through the modified workflow, in which a machine learning classifier divided the data set into "Not RCTs" and "Possible RCTs." The records deemed "Possible RCTs" were then loaded into a task created on the Cochrane Crowd platform, and participants classified those records as either "Potentially relevant" or "Not relevant" to the review. Using a prespecified agreement algorithm, we calculated the performance of the crowd in correctly identifying the studies that were included in the review (sensitivity) and correctly rejecting those that were not included (specificity).

Results: The RCT machine learning classifier did not reject any of the included studies. In terms of the crowd, 112 participants were included in this substudy. Of these, 81 completed the training module and went on to screen records in the live task. Applying the Cochrane Crowd agreement algorithm, the crowd achieved 100% sensitivity and 80.71% specificity.

Conclusions: Using a crowd to screen search results for systematic reviews can be an accurate method as long as the agreement algorithm in place is robust.

Trial registration: Open Science Framework: https://osf.io/3jyqt.

© 2020 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
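The two performance measures named in the abstract can be illustrated with a minimal sketch. It assumes each record carries a final crowd decision (kept as "Potentially relevant" or not) and a reference label (included in the published review or not). The function name and the toy records are hypothetical and are not the study's data; the Cochrane Crowd agreement algorithm that produces the final crowd decision is not described in the abstract and is not modeled here.

```python
from typing import Iterable, Tuple


def sensitivity_specificity(
    decisions: Iterable[Tuple[bool, bool]],
) -> Tuple[float, float]:
    """Compute (sensitivity, specificity) from per-record decisions.

    Each item is (crowd_kept, included_in_review):
      crowd_kept          True if the crowd's final decision kept the record
      included_in_review  True if the record was included in the review
    """
    tp = fn = tn = fp = 0
    for crowd_kept, included in decisions:
        if included and crowd_kept:
            tp += 1  # included study correctly kept
        elif included:
            fn += 1  # included study wrongly rejected
        elif crowd_kept:
            fp += 1  # irrelevant record wrongly kept
        else:
            tn += 1  # irrelevant record correctly rejected

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one included and one non-included record")

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical toy records (NOT the study's data):
# 2 included studies, both kept; 8 non-included records, 6 rejected.
toy = [(True, True)] * 2 + [(False, False)] * 6 + [(True, False)] * 2
sens, spec = sensitivity_specificity(toy)
print(f"sensitivity={sens:.2%}, specificity={spec:.2%}")
```

In this framing, 100% sensitivity means no included study was rejected by the crowd, while specificity below 100% means some non-included records were still passed on for further assessment.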
