Identifying datasets for cross-study analysis in dbGaP using PhenX

Michael J. Phillips; Ying Qin; Vesselina D. Bakalov; Huaqin Pan; Carol Marie Hamilton; Bruce Wayne Huggins; Lisa Ann Cox; Stephen Walter Erickson; Yuelong John Guo; Stephen Youngjae Hwang; Michelle Christine Krzyzanowski; David Nelson Williams; Michelle Leigh Ann Engle; Helen Pan; Michelle Engle; Michael Feolo; Masato Kimura; Josh Levy; Erin M. Ramos

Pan, H., Bakalov, V., Cox, L., Engle, M. L., Erickson, S. W., Feolo, M., Guo, Y., Huggins, W., Hwang, S., Kimura, M., Krzyzanowski, M., Levy, J., Phillips, M., Qin, Y., Williams, D., Ramos, E. M., & Hamilton, C. M. (2022). Identifying datasets for cross-study analysis in dbGaP using PhenX. Scientific Data, 9(1), Article 532. https://doi.org/10.1038/s41597-022-01660-4

Abstract

Identifying relevant studies and harmonizing datasets are major hurdles for data reuse. Common Data Elements (CDEs) can help identify comparable study datasets and reduce the burden of retrospective data harmonization, but historically they have not been required. The collaborative team at PhenX and dbGaP developed an approach that uses PhenX variables as a set of CDEs to link phenotypic data and identify comparable studies in dbGaP. Variables were identified as either comparable or related, based on the data collection mode used to harmonize data across mapped datasets. We further added a CDE data field to the dbGaP data submission packet to indicate use of PhenX and to annotate linkages in the future. Some 13,653 dbGaP variables from 521 studies were linked through PhenX variable mapping. These variable linkages have been made accessible for browsing and searching in the repository through the dbGaP CDE-faceted search filter and the PhenX variable search tool. New features in dbGaP and PhenX enable investigators to identify variable linkages among dbGaP studies and reveal opportunities for cross-study analysis.
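
As a rough illustration of the linkage idea described in the abstract, the sketch below groups study variables by the PhenX variable they map to and reports which pairs of studies share a mapping. It is not the authors' implementation: the record layout, the function name `linked_studies`, and every identifier in `MAPPINGS` are hypothetical placeholders, not real dbGaP or PhenX accessions.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical mapping records: (study, study variable, PhenX variable, link type).
# All identifiers are made-up placeholders, not real dbGaP or PhenX accessions.
MAPPINGS = [
    ("study_A", "var_A1", "phenx_height", "comparable"),
    ("study_B", "var_B7", "phenx_height", "comparable"),
    ("study_C", "var_C3", "phenx_height", "related"),
    ("study_A", "var_A2", "phenx_smoking", "related"),
]


def linked_studies(mappings, link_type=None):
    """Map each pair of studies to the PhenX variables they both link to.

    If link_type is given ("comparable" or "related"), only mappings of
    that type are considered.
    """
    # Step 1: collect the set of studies mapped to each PhenX variable.
    studies_by_phenx = defaultdict(set)
    for study, _variable, phenx_var, kind in mappings:
        if link_type is None or kind == link_type:
            studies_by_phenx[phenx_var].add(study)

    # Step 2: every pair of studies under the same PhenX variable is linked.
    shared = defaultdict(list)
    for phenx_var, studies in sorted(studies_by_phenx.items()):
        for pair in combinations(sorted(studies), 2):
            shared[frozenset(pair)].append(phenx_var)
    return dict(shared)
```

Calling `linked_studies(MAPPINGS, "comparable")` restricts the result to comparable mappings only, which in this toy data leaves a single linked pair of studies.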
