Big data, big results: Knowledge discovery in output from large-scale analytics
Special Issue
McCormick, TH., Ferrell, R., Karr, A., & Ryan, PB. (2014). Big data, big results: Knowledge discovery in output from large-scale analytics: Special Issue. Statistical Analysis and Data Mining, 7(5), 404-412. https://doi.org/10.1002/sam.11237
Observational healthcare data, such as electronic health records and administrative claims databases, provide longitudinal clinical information at the individual level. These data cover tens of millions of patients and present unprecedented opportunities to address issues such as the post-market safety of medical products. Analyzing patient-level databases yields population-level inferences, or 'results', such as the strength of association between medical product exposure and subsequent outcomes, often for thousands of drugs and outcomes. In this article, by contrast, we study 'big results', which are the product of applying thousands of alternative analysis strategies to five large patient databases. These results were produced by the Observational Medical Outcomes Partnership. Altogether, there are more than 6 million results, comprising risk assessments for 399 medical product-outcome pairs analyzed across five observational databases using seven statistical methods, each of which has between a few dozen and a few hundred variants representing parameters or 'tuning variables'. We focus on the value of knowledge discovery methods and the challenges in extracting clinically relevant knowledge from big results. We believe our analyses are both scientifically and methodologically valuable because they reveal how methods/algorithms perform under various circumstances and provide a basis for comparing these methods.