| Title: | Independent External Validation and Head-to-Head Comparison of Guideline-Recommended CVD Risk Prediction Models |
| Journal: | American Journal of Preventive Cardiology |
| Published: | 1 Apr 2026 |
| DOI: | https://doi.org/10.1016/j.ajpc.2026.101625 |
Background: Cardiovascular disease (CVD) prediction models recommended by guidelines are developed using different populations, predictors, and outcome definitions. The implications of this heterogeneity for risk estimation are unclear, and direct comparisons remain limited.
Objectives: To compare head-to-head the performance and transportability of three guideline-endorsed CVD risk prediction models, focusing on their sex-specific performance.
Methods: We evaluated models recommended by the American Heart Association (PREVENT), the European Society of Cardiology (SCORE2), and the National Institute for Health and Care Excellence (QRISK3). Risk of bias was assessed using the PROBAST tool. External validation was performed using the UK Biobank (UKBB) in a primary analysis including all participants with complete data for all models, enabling direct comparison, and in a secondary analysis applying each model to participants meeting its original eligibility criteria. Model performance was assessed using Brier scores, the area under the receiver operating characteristic curve (AUC), and calibration across original and alternative outcome definitions, stratified by sex.
Results: The PREVENT, SCORE2, and QRISK3 models varied substantially in terms of predictors, populations, and outcome definitions. We used data from 502,157 UKBB participants for the external validation in the primary analysis. Overall predictive performance (discrimination and calibration), as measured by Brier scores, was generally better in females. The AUC (95% CI) ranged from 0.7092 (0.7090-0.7094) to 0.7468 (0.7465-0.7471) for female and from 0.6813 (0.6812-0.6814) to 0.6946 (0.6945-0.6946) for male populations. Calibration was suboptimal, particularly for older individuals, with systematic overestimation of risk. The models showed consistent performance when applied to different outcomes. All models were at high risk of bias.
Conclusion: Despite heterogeneity in populations, predictors, and outcome definitions, PREVENT, SCORE2, and QRISK3 showed similar performance in the UKBB. Future studies should focus on prospective and standardized definitions and assessment of candidate predictors and outcomes.
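The abstract names three performance measures (Brier score, AUC, and calibration) reported separately by sex. The following is a minimal sketch of how such measures can be computed, not the authors' code. It assumes binary outcomes and predicted risks on a 0 to 1 scale, uses the observed-to-expected ratio as a simple stand-in for calibration, and ignores censoring and confidence intervals. The function names, the `records` layout, and the toy values are all hypothetical and are not UK Biobank data.

```python
from typing import Dict, Iterable, List, Sequence, Tuple


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the share of (event, non-event) pairs in which the event
    received the higher predicted risk; ties count as half a pair."""
    events = [p for y, p in zip(y_true, y_prob) if y == 1]
    non_events = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for pe in events:
        for pn in non_events:
            if pe > pn:
                concordant += 1.0
            elif pe == pn:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


def observed_to_expected(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Observed events divided by the sum of predicted risks.
    A ratio below 1 indicates that risk is overestimated on average."""
    expected = sum(y_prob)
    if expected <= 0:
        raise ValueError("sum of predicted risks must be positive")
    return sum(y_true) / expected


def metrics_by_sex(
    records: Iterable[Tuple[str, int, float]]
) -> Dict[str, Dict[str, float]]:
    """Group (sex, outcome, predicted_risk) records and score each group."""
    groups: Dict[str, Tuple[List[int], List[float]]] = {}
    for sex, outcome, risk in records:
        outcomes, risks = groups.setdefault(sex, ([], []))
        outcomes.append(outcome)
        risks.append(risk)
    return {
        sex: {
            "brier": brier_score(ys, ps),
            "auc": auc(ys, ps),
            "oe_ratio": observed_to_expected(ys, ps),
        }
        for sex, (ys, ps) in groups.items()
    }


# Hypothetical toy records, invented purely to exercise the functions.
toy_records = [
    ("female", 0, 0.05), ("female", 0, 0.10), ("female", 1, 0.30),
    ("female", 0, 0.20), ("female", 1, 0.15),
    ("male", 0, 0.10), ("male", 1, 0.40), ("male", 0, 0.40), ("male", 1, 0.20),
]

if __name__ == "__main__":
    for sex, scores in metrics_by_sex(toy_records).items():
        print(sex, {name: round(value, 4) for name, value in scores.items()})
```

The pairwise AUC loop is quadratic in sample size, which is fine for illustration but would be replaced by a rank-based computation on a cohort of several hundred thousand participants.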
| Application ID | Title |
|---|---|
| 129200 | Towards a FemTech Digital Twin for Improving Women's Health |