Webinar: Record Linkage of CODIS Profiles with SNP Genotypes
Forensic-genetic work in the United States relies largely on the CODIS markers, a set of 20 (until recently, 13) microsatellite loci in heavy use since the 1990s. One premise that has influenced forensic practice, figuring in discussions both of the backward compatibility of SNP-based systems with the CODIS database and of genetic privacy, is that the information provided by the CODIS loci is completely distinct from the information provided by larger sets of single-nucleotide polymorphisms (SNPs). Although associations between CODIS markers and specific genetic variants known to influence phenotypes are weak, CODIS records and SNP information may still be connected if pairs of CODIS and SNP genotypes can be identified as coming from the same person, that is, if CODIS and SNP records can be linked.
A recently reported genetic record-linkage method assesses whether a particular set of genotypes at the CODIS markers is likely to have been drawn from the same person (or an identical twin) as a set of genome-wide SNP genotypes (Edge, Algee-Hewitt, Pemberton, Li, & Rosenberg, 2017, PNAS). The method identifies matches with high accuracy even in the presence of hundreds of false distractor matches, with implications both for the plausibility of backward compatibility of an SNP-based forensic database and for genetic privacy.
In this webinar, we will discuss the population-genetic basis of our method, linkage disequilibrium (LD), and describe how LD can be leveraged to measure the closeness of a "match" between two sets of genotypes, even when the two sets share no markers. We will also describe empirical demonstrations of the method using data from a worldwide sample of humans.
Finally, we will discuss the potential and limits of the method and outline some problems for future work.
Learning Objectives:
- Understand population-genetic sources of LD, or associations between markers at distinct loci
- Describe a way in which LD patterns can be used to identify pairs of genetic records that closely match, or do not match
- Predict the limits and potential of LD-based record matching as the number of CODIS loci increases