Abstract
Polygenic scores (PGSs) have shown promise in advancing precision medicine by capturing the additive effects of common genetic variants to assess inherited disease risk. However, their predictive accuracy remains limited in non-European populations. We enhanced our previously developed Bayesian polygenic model, "select and shrink with summary statistics" (S4), by introducing a multi-ancestry extension (S4-Multi) to improve prediction accuracy across African, American, East Asian, European, and South Asian ancestries. Using simulated data and biobank cohorts from UK Biobank, FinnGen, Biobank Japan, the All of Us Research Program, and the Global Biobank Meta-Analysis Initiative, we benchmarked S4-Multi against leading methods for predicting type 2 diabetes, breast cancer, colorectal cancer, asthma, and stroke. In simulation tests, S4-Multi outperformed its single-ancestry version, achieving over 1.6 times greater accuracy in non-European populations, and matched or exceeded top-performing methods across all tested ancestry groups. In biobank tests, S4-Multi matched the performance of the best methods, which varied by ancestry and phenotype. We found that S4-Multi achieves comparable performance using 9% to 77% fewer genetic variants than competing models, highlighting its potential for robust performance in clinical settings with limited available genomic data.