Masoome Rezaei, M.Sc.
Icahn School of Medicine at Mount Sinai
"Background:
The increasing availability of genome sequencing data has led to a growing number of individuals identified as carriers of rare disease–associated variants. However, many carriers lack a confirmed clinical diagnosis, creating a gap between genetic findings and timely clinical action.
Objective:
To assess whether RarePT (Rare-Phenotype Prediction Transformer) can generate clinically meaningful risk scores for rare disease phenotypes among individuals carrying ClinVar variants linked to OMIM diseases, and to evaluate how these scores relate to observed case–control status in available phenotype data.
Methods:
We developed a reproducible workflow to identify carriers of ClinVar variants mapped to OMIM diseases corresponding to phenotypes of interest. Starting from curated HPO–OMIM mappings (92 phenotype labels across 91 OMIM diseases), we analyzed 9,603 ClinVar GRCh38 variants and identified carriers in multiple cohort-scale sequencing datasets, including 260,519 carriers in UK Biobank, 414,830 in All of Us, and 58,990 in the Mount Sinai Million cohort. Variant carriers were linked to disease-relevant phecodes, and RarePT risk scores were generated for the corresponding phenotypes to support downstream case–control comparisons.
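The carrier-to-phenotype linkage described above can be illustrated with a minimal sketch. It is not the authors' pipeline: every function name, identifier, and toy value below is hypothetical, and the risk scores are assumed to have been computed already rather than obtained from any real RarePT interface. The sketch labels each carrier as a case or control according to whether a disease-relevant phecode is recorded, then summarizes the scores by status.

```python
from collections import defaultdict
from statistics import mean


def label_carriers(carrier_variants, variant_to_omim, omim_to_phecodes, diagnoses):
    """Return {(person, omim): is_case} for every carrier-disease pair.

    All inputs are hypothetical lookup structures, not real cohort formats.
    """
    labels = {}
    for person, variant in carrier_variants:
        omim = variant_to_omim.get(variant)
        if omim is None:
            continue  # variant not mapped to a disease of interest
        relevant = omim_to_phecodes.get(omim, set())
        has_dx = bool(relevant & diagnoses.get(person, set()))
        key = (person, omim)
        # A person is a case if any of their variants for this disease
        # coincides with a recorded disease-relevant phecode.
        labels[key] = labels.get(key, False) or has_dx
    return labels


def mean_score_by_status(labels, scores):
    """Average precomputed risk scores separately for cases and controls."""
    groups = defaultdict(list)
    for key, is_case in labels.items():
        if key in scores:
            groups["case" if is_case else "control"].append(scores[key])
    return {status: mean(values) for status, values in groups.items()}


# Toy data only; none of these values come from the study.
carrier_variants = [("p1", "v1"), ("p2", "v1"), ("p3", "v2"), ("p4", "v9")]
variant_to_omim = {"v1": "OMIM:A", "v2": "OMIM:B"}
omim_to_phecodes = {"OMIM:A": {"100.1"}, "OMIM:B": {"200.2"}}
diagnoses = {"p1": {"100.1"}, "p3": {"999"}}
scores = {("p1", "OMIM:A"): 0.9, ("p2", "OMIM:A"): 0.25, ("p3", "OMIM:B"): 0.75}

labels = label_carriers(carrier_variants, variant_to_omim, omim_to_phecodes, diagnoses)
summary = mean_score_by_status(labels, scores)
```

In this toy example, p1 is the only case, p2 and p3 are carriers without a documented diagnosis, and p4 is dropped because its variant maps to no disease of interest. A real analysis would replace the group means with a formal case–control comparison.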
Results:
Clinically diagnosed cases generally showed higher RarePT risk scores than carriers without documented disease. Notably, a subset of carriers lacking recorded diagnoses exhibited elevated predicted risk, consistent with known u"