Correcting for selection bias in HIV prevalence estimates: An application of sample selection models using data from population-based HIV surveys in seven sub-Saharan African countries

Anton M. Palma; Giampiero Marra; Rachel Bray; Suzue Saito; Anna Colletar Awor

Palma, A. M., Marra, G., Bray, R., Saito, S., & Awor, A. C. (2022). Correcting for selection bias in HIV prevalence estimates: An application of sample selection models using data from population-based HIV surveys in seven sub-Saharan African countries. Journal of the International AIDS Society, 25(8), Article e25954. https://doi.org/10.1002/jia2.25954

Abstract

Introduction: Population-based biomarker surveys are the gold standard for estimating HIV prevalence but are susceptible to substantial non-participation (up to 30%). Analytical missing data methods, including inverse-probability weighting (IPW) and multiple imputation (MI), are biased when data are missing-not-at-random, for example when people living with HIV more frequently decline participation. Heckman-type selection models can, under certain assumptions, recover unbiased prevalence estimates in such scenarios.

Methods: We pooled data from 142,706 participants aged 15-49 years from nationally representative cross-sectional Population-based HIV Impact Assessments in seven countries in sub-Saharan Africa, conducted between 2015 and 2018 in Tanzania, Uganda, Malawi, Zambia, Zimbabwe, Lesotho and Eswatini. We compared sex-stratified HIV prevalence estimates from unadjusted, IPW, MI and selection models, controlling for household and individual-level predictors of non-participation, and assessed the sensitivity of selection models to the copula function specifying the correlation between study participation and HIV status.

Results: In total, 84.1% of participants provided a blood sample to determine HIV serostatus (range: 76% in Malawi to 95% in Uganda). HIV prevalence estimates from selection models diverged from IPW and MI models by up to 5% in Lesotho, without substantial precision loss. In Tanzania, the IPW model yielded lower HIV prevalence estimates among males than the best-fitting copula selection model (3.8% vs. 7.9%).

Conclusions: We demonstrate how HIV prevalence estimates from selection models can differ from those obtained under missing-at-random assumptions. Further benefits include exploration of plausible relationships between participation and outcome. While selection models require additional assumptions and careful specification, they are an important tool for triangulating prevalence estimates in surveys with substantial missing data due to non-participation.
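The abstract's central claim, that IPW is biased when participation depends on HIV status itself, can be illustrated with a small simulation. The sketch below is not the authors' analysis and does not fit a selection model. It uses only synthetic data, and every parameter (the binary covariate `x`, the prevalence levels, the participation probabilities) is a hypothetical value chosen for illustration. It compares the true prevalence with the unadjusted estimate and with an IPW estimate whose weights condition on `x` alone.

```python
import random


def simulate(n=100_000, seed=0):
    """Generate (x, y, r) tuples: covariate, HIV status, participation.

    All probabilities are hypothetical illustration values, not survey data.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.random() < 0.5                      # hypothetical binary covariate
        y = rng.random() < (0.20 if x else 0.10)    # HIV status depends on x
        # Participation depends on x AND on y itself: missing-not-at-random.
        p = 0.90 - (0.10 if x else 0.0) - (0.25 if y else 0.0)
        r = rng.random() < p
        rows.append((x, y, r))
    return rows


def prevalence_estimates(rows):
    """Return (true, unadjusted, IPW) prevalence for the simulated rows."""
    true_prev = sum(y for _, y, _ in rows) / len(rows)

    participants = [(x, y) for x, y, r in rows if r]
    unadjusted = sum(y for _, y in participants) / len(participants)

    # IPW with weight = 1 / P(participate | x), estimated within strata of x.
    # The weights cannot use y, because y is unobserved for non-participants.
    num = den = 0.0
    for level in (False, True):
        n_all = sum(1 for x, _, _ in rows if x == level)
        n_part = sum(1 for x, _ in participants if x == level)
        weight = n_all / n_part
        num += weight * sum(y for x, y in participants if x == level)
        den += weight * n_part
    return true_prev, unadjusted, num / den


if __name__ == "__main__":
    true_prev, unadjusted, ipw = prevalence_estimates(simulate())
    print(f"true={true_prev:.3f} unadjusted={unadjusted:.3f} ipw={ipw:.3f}")
```

Under these assumed parameters the true prevalence is 15%, while both the unadjusted and the IPW estimates come out near 11%. Reweighting on `x` corrects only the part of non-participation that `x` explains, which is the gap that selection models attempt to close by modelling the dependence between participation and outcome directly.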
