Random survival forests using linked data to measure illness burden among individuals before or after a cancer diagnosis
Development and internal validation of the SEER-CAHPS illness burden index
Lines, L. M., Cohen, J., Kirschner, J., Halpern, M. T., Kent, E. E., Mollica, M. A., & Smith, A. W. (2021). Random survival forests using linked data to measure illness burden among individuals before or after a cancer diagnosis: Development and internal validation of the SEER-CAHPS illness burden index. International Journal of Medical Informatics, 145, Article 104305. https://doi.org/10.1016/j.ijmedinf.2020.104305
PURPOSE: To develop and internally validate an illness burden index among Medicare beneficiaries before or after a cancer diagnosis.
METHODS: Data source: SEER-CAHPS, which links Surveillance, Epidemiology, and End Results (SEER) cancer registry data, Medicare enrollment and claims, and Medicare Consumer Assessment of Healthcare Providers and Systems (Medicare CAHPS) survey data providing self-reported sociodemographic, health, and functional status information. To generate a score for everyone in the dataset, we tabulated 4 groups within each annual subsample (2007-2013): 1) Medicare Advantage (MA) beneficiaries or 2) Medicare fee-for-service (FFS) beneficiaries surveyed before cancer diagnosis; 3) MA beneficiaries or 4) Medicare FFS beneficiaries surveyed after diagnosis. Random survival forests (RSFs) predicted 12-month all-cause mortality and drew predictor variables (mean per subsample = 44) from 8 domains: sociodemographic, cancer-specific, health status, chronic conditions, healthcare utilization, activity limitations, proxy, and location-based factors. Roughly two-thirds of the sample was used for algorithm training. Error rates based on the validation ("out-of-bag," OOB) samples reflected the percentage incorrectly classified. Illness burden scores represented predicted cumulative mortality hazard.
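As an illustration of the error rate described in the methods, the sketch below computes a prediction error as 1 minus Harrell's concordance index, which is how random survival forest software commonly reports OOB error. The abstract does not state which convention or software was used, so this is an assumption; the function names and the five-person toy dataset are hypothetical and are not SEER-CAHPS data.

```python
from itertools import combinations


def harrell_c(times, events, scores):
    """Harrell's concordance index for right-censored data.

    times  : follow-up time for each person
    events : 1 if death was observed, 0 if censored
    scores : predicted risk (higher should mean earlier death)

    Simplification (assumption): pairs with tied times are skipped.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied follow-up times are ignored in this sketch
        # a is the person with the shorter follow-up time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if scores[a] > scores[b]:
            concordant += 1.0  # higher risk score died first
        elif scores[a] == scores[b]:
            concordant += 0.5  # tied scores count as half
    return concordant / comparable


def error_rate(times, events, scores):
    """Prediction error as 1 - C (assumed convention, see lead-in)."""
    return 1.0 - harrell_c(times, events, scores)


# Hypothetical toy data: months of follow-up, death indicator, risk score
toy_times = [2, 5, 7, 12, 12]
toy_events = [1, 1, 0, 1, 0]
toy_scores = [3.1, 0.4, 1.9, 0.2, 0.1]

print(round(error_rate(toy_times, toy_events, toy_scores), 3))
```

Under this convention, an error rate of 0.5 corresponds to random ranking and 0 to perfect ranking of who dies first.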
RESULTS: The sample included 116,735 Medicare beneficiaries with cancer, of whom 73% were surveyed after their cancer diagnosis; overall mean mortality rate in the 12 months after survey response was 6%. SEER-CAHPS Illness Burden Index (SCIBI) scores were positively skewed (median range: 0.29 [MA, pre-diagnosis] to 2.85 [FFS, post-diagnosis]; mean range: 2.08 [MA, pre-diagnosis] to 4.88 [MA, post-diagnosis]). The highest decile of the distribution had a 51% mortality rate (range: 29-71%); the bottom decile had a 1% mortality rate (range: 0-2%). The error rate was 20% overall (range: 9% [among FFS enrollees surveyed after diagnosis] to 36% [MA enrollees surveyed before diagnosis]).
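The decile comparison reported in the results can be tabulated along the following lines. This is a minimal sketch, assuming equal-sized groups formed by sorting on the score; the function name and the 20-person synthetic dataset are hypothetical and do not reproduce the study's figures.

```python
def mortality_by_decile(scores, died):
    """Return the observed mortality rate in each score decile.

    scores : illness burden score per person
    died   : 1 if the person died within the follow-up window, else 0
    Requires at least 10 people so that every decile is non-empty.
    """
    # Indices of people ordered from lowest to highest score
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n = len(order)
    rates = []
    for d in range(10):
        group = order[d * n // 10:(d + 1) * n // 10]
        rates.append(sum(died[i] for i in group) / len(group))
    return rates


# Hypothetical synthetic data: deaths concentrated among the highest scores
toy_scores = list(range(20))
toy_died = [0] * 16 + [1, 0, 1, 1]

decile_rates = mortality_by_decile(toy_scores, toy_died)
print(decile_rates[0], decile_rates[9])
```

A well-discriminating score shows mortality rising from the bottom decile to the top decile, as in the pattern reported above.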
CONCLUSIONS: This new morbidity measure for Medicare beneficiaries with cancer may be useful to future SEER-CAHPS users who wish to adjust for comorbidity.