Content analysis of social determinants of health accelerator plans using artificial intelligence
A use case for public health practitioners
DePriest, K., Feher, J., Gore, K., Glasgow, L., Grant, C., Holtgrave, P., Hacker, K., & Chew, R. (2025). Content analysis of social determinants of health accelerator plans using artificial intelligence: A use case for public health practitioners. Journal of Public Health Management and Practice. Advance online publication. https://doi.org/10.1097/PHH.0000000000002148
CONTEXT: Public health practice involves the development of reports and plans, including funding progress reports, strategic plans, and community needs assessments. These documents are valuable data sources for program monitoring and evaluation. However, practitioners rarely have the bandwidth to thoroughly and rapidly review large amounts of primarily qualitative data to support real-time and continuous program improvement. Systematically examining and categorizing qualitative data through content analysis is labor-intensive. Large language models (LLMs), a type of generative artificial intelligence (AI) focused on language-based tasks, hold promise for expediting content analysis of public health documents, which, in turn, could facilitate continuous program improvement.
OBJECTIVES: To explore the feasibility and potential of using LLMs to expedite content analysis of real-world public health documents. The focus was on comparing semiautomated outputs from GPT-4o with human outputs for abstracting and synthesizing information from health improvement plans.
DESIGN: Our study team conducted a content analysis of 4 publicly available community health improvement plans and compared the results with GPT-4o's performance on 20 data elements. We also assessed the resources required for both methods, including time spent on prompt engineering and error correction.
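The semiautomated workflow described above hinges on prompt engineering: instructing the model which data elements to abstract and how to handle missing information. As a rough illustration, a prompt-building step might look like the following sketch. The function name, instructions, and data elements are hypothetical, not the study's actual prompts.

```python
# Hypothetical sketch of one semiautomated abstraction step: construct a
# chat-style prompt asking an LLM (e.g., GPT-4o) to abstract named data
# elements from a community health improvement plan. Illustrative only.

def build_abstraction_prompt(plan_text: str, data_elements: list[str]) -> list[dict]:
    """Return a chat-style message list for an LLM abstraction request."""
    element_list = "\n".join(f"- {e}" for e in data_elements)
    system = (
        "You are a public health analyst. Abstract only information that is "
        "directly supported by the plan text. If an element is not "
        "addressed, answer 'Not reported'."
    )
    user = (
        f"Plan text:\n{plan_text}\n\n"
        f"Abstract the following data elements:\n{element_list}"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]

messages = build_abstraction_prompt(
    "Priority area: reduce food insecurity by 10% by 2027...",
    ["Priority health areas", "Measurable objectives", "Community partners"],
)
```

Grounding the model with an explicit "Not reported" fallback is one common mitigation for the kind of fabricated data the study observed, though it does not eliminate it.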
MAIN OUTCOME MEASURES: Accuracy of data abstraction and time required.
RESULTS: GPT-4o demonstrated abstraction accuracy of 79% (17 errors, including 8 instances of falsified data), compared with 94% accuracy by the study team for individual plans. Out of the 18 synthesis data elements, GPT-4o made 9 errors, demonstrating an accuracy of 50%. On average, GPT-4o abstraction required fewer hours than study team abstraction, but the resource savings diminished when accounting for time spent developing prompts and identifying and correcting errors.
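The accuracy figures follow from a simple error count over abstracted elements. A minimal sketch of that arithmetic, using an exact-match rule as a stand-in for the study's human adjudication and made-up illustrative values rather than the study's data:

```python
def abstraction_accuracy(llm_values: list[str], gold_values: list[str]) -> tuple[float, int]:
    """Compare LLM-abstracted values against a human gold standard.

    Returns (accuracy, error_count). Exact string matching stands in for
    the human error adjudication used in the study.
    """
    if len(llm_values) != len(gold_values):
        raise ValueError("Mismatched element counts")
    errors = sum(
        1
        for llm, gold in zip(llm_values, gold_values)
        if llm.strip().lower() != gold.strip().lower()
    )
    accuracy = 1 - errors / len(gold_values)
    return accuracy, errors

# Illustrative check: 9 errors out of 18 elements yields 50% accuracy,
# the same arithmetic reported for the synthesis data elements.
acc, errs = abstraction_accuracy(["a"] * 9 + ["x"] * 9, ["a"] * 18)
```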
CONCLUSIONS: Public health professionals who explore the use of generative AI tools should approach the method with cautious curiosity and consider the potential tradeoffs between resource savings and data accuracy.