Sparse-data bias accompanying overly fine stratification in an analysis of beryllium exposure and lung cancer risk
Rothman, K., & Mosquin, P. (2013). Sparse-data bias accompanying overly fine stratification in an analysis of beryllium exposure and lung cancer risk. Annals of Epidemiology, 23(2), 43-48. https://doi.org/10.1016/j.annepidem.2012.11.005
Purpose Beryllium's classification as a carcinogen is based on limited human data that show inconsistent associations with lung cancer. Therefore, a thorough examination of those data is warranted. We reanalyzed data from the largest study of occupational beryllium exposure, conducted by the National Institute for Occupational Safety and Health (NIOSH).
Methods Data had been analyzed using stratification and standardization. We reviewed the strata in the original analysis, and reanalyzed using fewer strata. We also fit a Poisson regression, and analyzed simulated datasets that generated lung cancer cases randomly without regard to exposure.
Results The strongest association reported in the NIOSH study, a standardized rate ratio for death from lung cancer of 3.68 for the highest versus lowest category of time since first employment, is affected by sparse-data bias, stemming from stratifying 545 lung cancer cases and their associated person-time into 1792 categories. For time since first employment, the measure of beryllium exposure with the strongest reported association with lung cancer, every stratum had a zero in at least one of the two contrasting exposure categories. Reanalysis using fewer strata or with regression models gave substantially smaller effect estimates. Simulations confirmed that the original stratified analysis was upwardly biased. Other metrics used in the NIOSH study found weaker associations and were less affected by sparse-data bias.
Conclusions The strongest association reported in the NIOSH study seems to be biased as a result of non-overlap of data across the numerous strata. Simulation results indicate that most of the effect reported in the NIOSH paper for time since first employment is attributable to sparse-data bias.
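The kind of null simulation described above can be sketched in a few lines of Python. This is an illustration only, not the authors' code and not the NIOSH data: the person-time layout, the total of 1,000,000 person-years, the choice of stratum weights (total person-time per stratum), and the rule of skipping strata that lack person-time in either group are all assumptions made for the example, and the function names are hypothetical. Only the figures of 545 cases and 1792 strata come from the abstract. Cases are assigned to cells in proportion to person-time, so the true rate ratio is 1 by construction; comparing the spread of estimates from a coarse and a fine stratification gives a feel for how sparse cells can distort a standardized rate ratio.

```python
import random
import statistics


def standardized_rate_ratio(cases1, pt1, cases0, pt0, weights):
    """Directly standardized rate ratio, index group (1) vs. reference group (0).

    Assumption for this sketch: strata with no person-time in either
    group are skipped, because a stratum-specific rate is undefined there.
    """
    num = 0.0
    den = 0.0
    for a1, t1, a0, t0, w in zip(cases1, pt1, cases0, pt0, weights):
        if t1 <= 0 or t0 <= 0:
            continue
        num += w * a1 / t1  # weighted stratum rate, index group
        den += w * a0 / t0  # weighted stratum rate, reference group
    if den == 0:
        return float("inf") if num > 0 else float("nan")
    return num / den


def simulate_null_srr(n_strata, n_cases=545, total_pt=1_000_000.0, rng=None):
    """One simulated dataset in which cases ignore exposure (true ratio = 1)."""
    rng = rng or random.Random()
    stratum_pt = total_pt / n_strata
    pt1, pt0 = [], []
    for _ in range(n_strata):
        frac = rng.random()  # hypothetical uneven split between the two groups
        pt1.append(stratum_pt * frac)
        pt0.append(stratum_pt * (1.0 - frac))

    # Allocate cases to cells in proportion to person-time only.
    cells = pt1 + pt0
    hits = rng.choices(range(2 * n_strata), weights=cells, k=n_cases)
    cases1 = [0] * n_strata
    cases0 = [0] * n_strata
    for idx in hits:
        if idx < n_strata:
            cases1[idx] += 1
        else:
            cases0[idx - n_strata] += 1

    weights = [a + b for a, b in zip(pt1, pt0)]  # assumed standard: total person-time
    return standardized_rate_ratio(cases1, pt1, cases0, pt0, weights)


def compare(n_reps=200, seed=2013):
    """Mean and median null estimates for a coarse and a fine stratification."""
    rng = random.Random(seed)
    summary = {}
    for n_strata in (8, 1792):
        estimates = [simulate_null_srr(n_strata, rng=rng) for _ in range(n_reps)]
        finite = [e for e in estimates if e == e and e != float("inf")]
        summary[n_strata] = (statistics.mean(finite), statistics.median(finite))
    return summary


if __name__ == "__main__":
    for n_strata, (mean_srr, median_srr) in compare().items():
        print(f"{n_strata:5d} strata: mean SRR {mean_srr:.2f}, median SRR {median_srr:.2f}")
```

Because every simulated dataset has a true rate ratio of 1, any systematic departure from 1 in the fine stratification reflects the estimator and the sparse layout rather than an exposure effect. The size and direction of that departure depend on the assumed person-time distribution and on how empty cells are handled, so the sketch should not be read as reproducing the magnitude reported in the paper.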