RTI uses cookies to offer you the best experience online. By clicking “accept” on this website, you opt in and you agree to the use of cookies. If you would like to know more about how RTI uses cookies and how to manage them please view our Privacy Policy here. You can “opt out” or change your mind by visiting: http://optout.aboutads.info/. Click “accept” to agree.
Uncertainty estimation for pseudo-Bayesian inference under complex sampling
Williams, M. R., & Savitsky, T. D. (2021). Uncertainty estimation for pseudo-Bayesian inference under complex sampling. International Statistical Review, 89(1), 72-107. https://doi.org/10.1111/insr.12376
Social and economic studies are often implemented as complex survey designs. For example, multistage, unequal probability sampling designs utilised by federal statistical agencies are typically constructed to maximise the efficiency of the target domain level estimator (e.g. indexed by geographic area) within cost constraints for survey administration. Such designs may induce dependence between the sampled units; for example, with employment of a sampling step that selects geographically indexed clusters of units. A sampling-weighted pseudo-posterior distribution may be used to estimate the population model on the observed sample. The dependence induced between coclustered units inflates the scale of the resulting pseudo-posterior covariance matrix that has been shown to induce under coverage of the credibility sets. By bridging results across Bayesian model misspecification and survey sampling, we demonstrate that the scale and shape of the asymptotic distributions are different between each of the pseudo-maximum likelihood estimate (MLE), the pseudo-posterior and the MLE under simple random sampling. Through insights from survey-sampling variance estimation and recent advances in computational methods, we devise a correction applied as a simple and fast postprocessing step to Markov chain Monte Carlo draws of the pseudo-posterior distribution. This adjustment projects the pseudo-posterior covariance matrix such that the nominal coverage is approximately achieved. We make an application to the National Survey on Drug Use and Health as a motivating example and we demonstrate the efficacy of our scale and shape projection procedure on synthetic data on several common archetypes of survey designs.