Analysis of Large Health Surveys: Accounting for the Sampling Design
Korn, EL., & Graubard, BI. (1995). Analysis of Large Health Surveys: Accounting for the Sampling Design. Journal of the Royal Statistical Society. Series A (Statistics in Society), 158(2), 263-295.
Large-scale health surveys offer an opportunity to study associations between risk factors and outcomes in a population-based setting. Their complicated multistage sampling designs, with differential probabilities of sampling individuals, can make their analysis far from straightforward. Classical 'design-based' methods that yield approximately unbiased estimators of associations and standard errors can be highly inefficient. Model-based methods require assumptions which, if wrong, can lead to biased estimators of associations and standard errors. This paper examines the implications of utilizing the sample clustering and sample weights in the analysis of survey data. The approach is to estimate the inefficiency of using these aspects of the sampling design in a design-based analysis when it was actually unnecessary to do so. If the inefficiency is small, then that aspect of the design is used in a design-based fashion. Otherwise, additional modelling assumptions are incorporated into the analysis. By focusing attention on risk factor-outcome associations in large health surveys, specific recommendations for practitioners are given. The issues are demonstrated with real survey data, including two controversial analyses previously published in medical references.
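To make the idea of the "inefficiency" of using sample weights concrete, here is a minimal sketch. It is not taken from the paper. It assumes the simplest case: estimating a mean from independent observations with equal variance, where the weights are in fact unnecessary. Under those assumptions the variance of the unweighted mean divided by the variance of the weighted mean equals Kish's effective sample size divided by the actual sample size, and the inefficiency is one minus that ratio. The function name `weighting_inefficiency` and the example weights are hypothetical.

```python
import numpy as np


def weighting_inefficiency(weights):
    """Approximate inefficiency of a weighted mean relative to an unweighted mean.

    Assumption (not from the paper): observations are independent with equal
    variance, so the weights are unnecessary. Then
        Var(unweighted) / Var(weighted) = (sum w)^2 / (n * sum w^2),
    and the inefficiency is 1 minus that ratio.
    """
    w = np.asarray(weights, dtype=float)
    n = w.size
    # Kish's effective sample size for a weighted mean
    n_eff = w.sum() ** 2 / np.sum(w ** 2)
    return 1.0 - n_eff / n


# Hypothetical weights: equal weights lose nothing; one large weight loses a lot.
equal = weighting_inefficiency([1, 1, 1, 1])
unequal = weighting_inefficiency([1, 1, 1, 5])
print(f"equal weights:   {equal:.3f}")
print(f"unequal weights: {unequal:.3f}")
```

In the spirit of the abstract, a small value would suggest that keeping the weights in a design-based analysis costs little, while a large value would motivate bringing in additional modelling assumptions. An analogous calculation for clustering would need cluster identifiers and is not sketched here.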