I make estimates for the predictive value of genetic screening for certain heart conditions, and how results can differ across ethnic groups. I also build tools to help others make similar analyses.
Overview and Motivation
A central goal of clinical genetics is to personalize medicine; that is, to look at differences in someone’s genome (i.e., genetic variants) and predict whether that person is more predisposed to certain diseases or more responsive to certain medications. While some genetic variants are very well understood (e.g., CAG repeats in Huntington’s disease), others remain highly inconclusive (e.g., certain point mutations in BRCA2). Because there is so much uncertainty, most genetic test reports do not contain risks scores (e.g., 95% chance of a later diagnosis). Instead, they typically report risk on a categorical scale–from benign (1) to pathogenic (5)–with questionable clinical utility. Right now, researchers are trying to aggregate these diverse claims into standard databases while culling out false claims. Among other goals, they are also working to develop standards for which studies to trust and which variants to report. Currently, no study has yet evaluated the clinical impact of applying these recently released guidelines to ethnically diverse populations.
Project 1: ACMG-59
There remains limited consensus on the clinical significance of secondary findings. Secondary findings are not related to any diagnostic indication (e.g. when screening healthy individuals/newborns) and are associated with an increased risk of false positives. To balance this risk, the American College on Medical Genetics and Genomics (ACMG) recommends that sequencing laboratories report secondary findings in a minimum list of 59 genes (ACMG-59).
1.2 Methods and Results
We want to investigate differential clinical impact across ethnic groups. To do this, we applied the ACMG-59 recommended reporting guidelines to 140,352 individuals from the gnomAD sequencing dataset. Specifically, we examined the frequency of pathogenic findings relative to empirical disease prevalences and inferred maximum penetrance bounds. Our findings indicate that the 95% upper penetrance bound for 4/6 cardiac phenotypes falls under 50%, with 2/6 under 10%. Moreover, most diseases have upper penetrance values that differ by more than 3-fold between ancestral groups.
Our model of penetrance provides a simple, reproducible, and quantitative framework for evaluating clinical utility, which implicates certain genetic tests as only weakly predictive.
We have encapsulated our analysis in a web tool that allows users to substitute their own estimates of population parameters and reproduce key figures.
1.3 Web Tool
Embedded from https://jamesdiao.shinyapps.io/penetrance_app/
It seems that further investigations on the uncertainty around secondary findings are needed to fully justify the important clinical decisions based on them.
Project 2: clinvaR
clinvaR is an R package for (1) generating an annotated list of variants for a given gene list, and (2) visualizing variant differences across ancestry groups and over time.
There are 3 main steps:
- Select a list of genes that you are interested in. clinvaR makes it easy to download and import relevant variants from 1KG.
- Select a version of ClinVar to assign pathogenicity labels with. All previous ClinVar VCFs are stored as binary files and can be retrieved with a “closest date” function.
- Select an analysis method.
2.2 Example Frequency Plots
2.3 Example Time-Series Plots
Check out the full vignette here