Data extraction for evidence synthesis using a large language model: A proof-of-concept study

Gerald Gartlehner; Leila Kahwati; Rainer Hilscher; Ian Thomas; Shannon Kugley; Karen Crotty; Meera Viswanathan; Barbara Nussbaumer-Streit; Graham Booth; Nathaniel Erskine; Amanda Konet; Robert Chew

Gartlehner, G., Kahwati, L., Hilscher, R., Thomas, I., Kugley, S., Crotty, K., Viswanathan, M., Nussbaumer-Streit, B., Booth, G., Erskine, N., Konet, A., & Chew, R. (2024). Data extraction for evidence synthesis using a large language model: A proof-of-concept study. Research Synthesis Methods, 15(4), 576-589. https://doi.org/10.1002/jrsm.1710

Abstract

Data extraction is a crucial, yet labor-intensive and error-prone part of evidence synthesis. To date, efforts to harness machine learning for enhancing efficiency of the data extraction process have fallen short of achieving sufficient accuracy and usability. With the release of large language models (LLMs), new possibilities have emerged to increase efficiency and accuracy of data extraction for evidence synthesis. The objective of this proof-of-concept study was to assess the performance of an LLM (Claude 2) in extracting data elements from published studies, compared with human data extraction as employed in systematic reviews. Our analysis utilized a convenience sample of 10 English-language, open-access publications of randomized controlled trials included in a single systematic review. We selected 16 distinct types of data, posing varying degrees of difficulty (160 data elements across 10 studies). We used the browser version of Claude 2 to upload the portable document format of each publication and then prompted the model for each data element. Across 160 data elements, Claude 2 demonstrated an overall accuracy of 96.3% with a high test-retest reliability (replication 1: 96.9%; replication 2: 95.0% accuracy). Overall, Claude 2 made 6 errors on 160 data items. The most common errors (n = 4) were missed data items. Importantly, Claude 2's ease of use was high; it required no technical expertise or labeled training data for effective operation (i.e., zero-shot learning). Based on findings of our proof-of-concept study, leveraging LLMs has the potential to substantially enhance the efficiency and accuracy of data extraction for evidence syntheses.
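
The headline figures in the abstract can be checked with simple arithmetic. The sketch below assumes that accuracy means the share of data elements extracted correctly (total items minus errors, divided by total items) and that percentages are rounded half-up to one decimal place. The abstract states neither definition explicitly, and the function name `accuracy_percent` is hypothetical, introduced only for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP


def accuracy_percent(n_errors: int, n_items: int) -> Decimal:
    """Percent of items extracted correctly, rounded half-up to one decimal.

    Assumption (not stated in the abstract): accuracy = (items - errors) / items.
    Decimal is used so that 96.25 rounds up to 96.3 rather than to 96.2,
    which is what Python's built-in round() would return.
    """
    if n_items <= 0 or not 0 <= n_errors <= n_items:
        raise ValueError("need n_items > 0 and 0 <= n_errors <= n_items")
    pct = Decimal(n_items - n_errors) / Decimal(n_items) * 100
    return pct.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


# Figures taken from the abstract: 16 data types across 10 studies, 6 errors.
n_items = 16 * 10
overall = accuracy_percent(n_errors=6, n_items=n_items)
print(f"{n_items} data elements, overall accuracy {overall}%")
```

Under these assumptions, 6 errors across 160 data elements give 96.25%, which matches the reported 96.3% after rounding.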
