Identification of 5 HF subtypes using a machine learning approach
Literature - Banerjee A, Dashtban A, Chen S, et al. - Lancet Digit Health. 2023 Jun;5(6):e370-e379. doi: 10.1016/S2589-7500(23)00065-1

Introduction and methods


Current HF subtype classifications have not resulted in precision medicine, personalized care, or targeted therapies [1-6]. Moreover, incomplete knowledge of HF subtypes across the wide spectrum of causal factors and populations has limited primary prevention and screening guidelines for this disease [7,8].

Aim of the study

In a large population of patients with incident HF, the authors used machine learning to (1) identify subtypes that are clinically relevant throughout the HF disease course, with a low risk of bias in patient selection and algorithms; (2) demonstrate internal, external, prognostic, and genetic validity; and (3) develop potential clinical pathways to improve impact.


Methods

In this external, prognostic, and genetic validation study, the authors applied their 2021 framework for practical machine learning implementation, which consists of 6 stages: clinical relevance, patients, algorithm, internal validation (within dataset and across methods), external validation (across datasets), and clinical utility and effectiveness [9]. Data from 1998 to 2018 on patients aged ≥30 years with incident HF were extracted from 2 population-based electronic health record databases in the UK: Clinical Practice Research Datalink (CPRD; n=188,800) and The Health Improvement Network (THIN; n=124,262).

The CPRD and THIN datasets yielded 645 factors before and after HF diagnosis, including demographic information, comorbidities, and medication use and persistence. For the algorithm, 87 of these 645 factors were selected. To reduce the risk of algorithmic bias, the following 4 unsupervised machine learning methods were compared: K-means, hierarchical, K-medoids, and mixture modeling.
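Comparing methods matters because a grouping that only one algorithm finds may be an artefact of that algorithm. As a minimal sketch of the idea, the code below runs simplified versions of two of the four named methods (K-means and an alternating K-medoids variant) on synthetic two-feature data and measures how often they agree on whether two points belong together (the Rand index). The data, the seeding rule, and every function name are assumptions made for illustration; this is not the authors' pipeline, which used 87 factors and real patient records.

```python
import math
import random
from itertools import combinations

def farthest_first(points, k):
    """Deterministic seeding: start at the first point, then keep adding
    the point farthest from all centres chosen so far."""
    centres = [points[0]]
    while len(centres) < k:
        centres.append(max(points, key=lambda p: min(math.dist(p, c) for c in centres)))
    return centres

def assign(points, centres):
    """Label each point with the index of its nearest centre."""
    return [min(range(len(centres)), key=lambda c: math.dist(p, centres[c]))
            for p in points]

def cluster(points, k, update, max_iter=100):
    """Alternate assignment and centre updates until the labels stop changing."""
    centres = farthest_first(points, k)
    labels = assign(points, centres)
    for _ in range(max_iter):
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centres[c] = update(members)
        new_labels = assign(points, centres)
        if new_labels == labels:
            break
        labels = new_labels
    return labels

def mean_point(members):
    """K-means update: the coordinate-wise mean of the members."""
    return tuple(sum(col) / len(members) for col in zip(*members))

def medoid(members):
    """K-medoids update: the member with the smallest total distance to the others."""
    return min(members, key=lambda m: sum(math.dist(m, q) for q in members))

def rand_index(a, b):
    """Share of point pairs on which two labelings agree (same vs different cluster)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)

# Synthetic data: three well-separated groups of 20 points each (illustrative only).
rng = random.Random(0)
blob_centres = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1))
          for cx, cy in blob_centres for _ in range(20)]

kmeans_labels = cluster(points, 3, mean_point)
kmedoids_labels = cluster(points, 3, medoid)
agreement = rand_index(kmeans_labels, kmedoids_labels)
print(f"Rand index between K-means and K-medoids: {agreement:.2f}")
```

On clean, well-separated data like this the two methods agree fully; on real health record data agreement is typically lower, which is why cross-method comparison is informative.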

Subtypes were identified and evaluated for: (1) external validity; (2) prognostic validity (predictive accuracy for 1-year all-cause mortality); and (3) genetic validity (associations with single nucleotide polymorphisms [SNPs] and polygenic risk scores [PRSs] for HF-related traits, using UK Biobank data; n=9573).

To assess clinical utility, 5 HF clinicians were asked about the clinical relevance, justification, and interpretability of the results. Based on their input, a model predicting cluster membership and survival was developed, as well as an HF cluster app for routine clinical use.
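The summary does not describe how the cluster-prediction model works. One simple way to assign a new patient to an existing cluster is a nearest-centroid rule, sketched below under that assumption. The cluster names, the three standardised features, and all centroid values are invented placeholders, not figures from the study or the app.

```python
import math

# Hypothetical centroids: standardised (z-score) values for three made-up features.
CENTROIDS = {
    "cluster_a": (-1.2, 0.1, -0.5),
    "cluster_b": (0.9, -0.3, 0.2),
    "cluster_c": (0.1, 1.4, 1.1),
}

def predict_cluster(patient, centroids=CENTROIDS):
    """Return the name of the nearest centroid and the distance to every centroid."""
    if any(len(patient) != len(c) for c in centroids.values()):
        raise ValueError("patient and centroids must have the same number of features")
    distances = {name: math.dist(patient, c) for name, c in centroids.items()}
    best = min(distances, key=distances.get)
    return best, distances

# A hypothetical patient, already standardised with the same scaling as the centroids.
label, distances = predict_cluster((1.0, 0.0, 0.0))
print(label, {name: round(d, 2) for name, d in distances.items()})
```

Returning all distances alongside the label lets a user see how close the runner-up cluster was, which is useful when a patient sits between two profiles.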

Main results

Internal and external validations and subtype identification

  • In the internal validation, the optimal number of clusters was 5. Based on demographics, CVD risk factor burden, AF, CVD, medications, and laboratory factors, these 5 clusters were labelled as the following HF subtypes: (1) early onset, (2) late onset, (3) AF-related, (4) metabolic, and (5) cardiometabolic.
  • In the external validation, subtypes were similar across datasets: the c-statistic for the THIN model in CPRD ranged from 0.79 (subtype 3) to 0.94 (subtype 1), and for the CPRD model in THIN from 0.79 (subtype 1) to 0.92 (subtypes 2 and 5).
  • Distribution of the 5 subtypes was similar across the CPRD and THIN datasets, with late onset (~33%) and cardiometabolic (~29%) being the most common subtypes and AF-related (~9%) being the least common subtype.
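For readers less familiar with the c-statistic reported above: for one subtype, it is the probability that a randomly chosen patient in that subtype receives a higher predicted score than a randomly chosen patient outside it, with ties counted as half. A value of 0.5 means no discrimination and 1.0 means perfect discrimination. The sketch below computes it by direct pairwise comparison; the scores and labels are invented for illustration.

```python
def c_statistic(scores, labels):
    """Concordance between predicted scores and binary membership labels (1 or 0)."""
    members = [s for s, y in zip(scores, labels) if y == 1]
    non_members = [s for s, y in zip(scores, labels) if y == 0]
    if not members or not non_members:
        raise ValueError("need at least one member and one non-member")
    concordant = 0.0
    for m in members:
        for n in non_members:
            if m > n:
                concordant += 1.0   # member ranked above non-member
            elif m == n:
                concordant += 0.5   # ties count as half
    return concordant / (len(members) * len(non_members))

# Hypothetical predicted probabilities of belonging to one subtype.
scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
print(round(c_statistic(scores, labels), 3))
```

With 5 subtypes, this calculation would be repeated once per subtype in a one-versus-rest fashion, which is consistent with the per-subtype ranges quoted above.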

Prognostic validation

  • In the prognostic validation in CPRD using the THIN model, 1-year all-cause mortality after HF diagnosis was 0.20 (95% CI: 0.14–0.25) for subtype 1, 0.46 (95% CI: 0.43–0.49) for subtype 2, 0.61 (95% CI: 0.57–0.64) for subtype 3, 0.11 (95% CI: 0.07–0.16) for subtype 4, and 0.37 (95% CI: 0.32–0.41) for subtype 5. Between THIN and CPRD, mortality differed for subtypes 1 and 5 but not for the other subtypes.
  • The risks of nonfatal CVD and all-cause hospitalization also varied by HF subtype.
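The summary does not state how the confidence intervals around the mortality estimates were derived. As a rough illustration of what such an interval expresses, the sketch below computes a Wilson score interval for a simple proportion. The counts are invented, and treating 1-year mortality as a plain binomial proportion (ignoring censoring) is an assumption of this example rather than the study's method.

```python
import math

def wilson_interval(events, n, z=1.96):
    """Point estimate and Wilson score interval for a binomial proportion.

    z=1.96 corresponds to an approximate 95% confidence level.
    """
    if n <= 0 or not 0 <= events <= n:
        raise ValueError("need n > 0 and 0 <= events <= n")
    p = events / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return p, centre - half_width, centre + half_width

# Hypothetical counts: 20 deaths within 1 year among 100 patients.
estimate, lower, upper = wilson_interval(20, 100)
print(f"{estimate:.2f} (95% CI: {lower:.2f}-{upper:.2f})")
```

Intervals narrow as the number of patients grows, so a wide interval for a subtype may simply reflect a smaller sample in that group.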

Genetic validation

  • In the genetic validation, PRSs for atrial arrhythmias, DM, hypertension, MI, obesity, stable angina, and unstable angina were all associated with ≥1 HF subtype after correction for multiple testing (P<0.0009). The late onset and cardiometabolic subtypes were broadly associated with similar PRSs.
  • Eight SNPs were nominally associated with predicted HF subtypes (P=0.049), of which 4 SNPs were limited to the AF-related subtype.
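The summary does not say which multiple-testing correction produced the P<0.0009 threshold. A Bonferroni correction is one common choice: the overall significance level is divided by the number of tests. The sketch below assumes that approach. The test count of 55 is an illustrative value picked only because 0.05/55 is roughly 0.0009; it is not reported in the summary, and the trait names and P values are invented.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests

def significant_after_correction(p_values, alpha=0.05):
    """Flag each test whose P value falls below the corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return {name: p < threshold for name, p in p_values.items()}

# Illustrative only: 55 tests would give a threshold close to 0.0009.
print(round(bonferroni_threshold(0.05, 55), 4))

# Hypothetical P values for three made-up associations.
flags = significant_after_correction({"trait_1": 0.0002, "trait_2": 0.004, "trait_3": 0.03})
print(flags)
```

A "nominal" association, by contrast, is one that passes the uncorrected 0.05 level but not necessarily the corrected threshold.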

Clinical utility and effectiveness

  • The 5 HF clinicians reported that the included factors and identified clusters had clinical relevance as per the authors’ 2021 framework.
  • The clinicians also felt the developed app reflected the identified HF subtypes and could enable testing of effectiveness and cost-effectiveness in appropriately designed, prospective studies.


Conclusion

Using their 6-stage framework for machine learning implementation, the authors identified 5 HF subtypes (early onset, late onset, AF-related, metabolic, and cardiometabolic) and validated these subtypes using population-representative data. The 5 subtypes showed good predictive accuracy for 1-year all-cause mortality. To assess the effectiveness of their approach, the authors also developed an open-access HF cluster app that clinicians can use to identify the cluster that best fits an individual patient and to view that patient's predicted survival.


References

1. Ponikowski P, Voors AA, Anker SD, et al. 2016 ESC Guidelines for the diagnosis and treatment of acute and chronic heart failure: The Task Force for the diagnosis and treatment of acute and chronic heart failure of the European Society of Cardiology (ESC). Developed with the special contribution of the Heart Failure Association (HFA) of the ESC. Eur Heart J 2016; 37: 2129–200.

2. Mordi IR, Pearson ER, Palmer CNA, Doney ASF, Lang CC. Differential association of genetic risk of coronary artery disease with development of heart failure with reduced versus preserved ejection fraction. Circulation 2019; 139: 986–88.

3. Solomon SD, Pfeffer MA. The future of clinical trials in cardiovascular medicine. Circulation 2016; 133: 2662–70.

4. Seidelmann SB, Feofanova E, Yu B, et al. Genetic variants in SGLT1, glucose tolerance, and cardiometabolic risk. J Am Coll Cardiol 2018; 72: 1763–73.

5. Yancy CW, Jessup M, Bozkurt B, et al. 2017 ACC/AHA/HFSA focused update of the 2013 ACCF/AHA guideline for the management of heart failure: a report of the American College of Cardiology/American Heart Association Task Force on Clinical Practice Guidelines and the Heart Failure Society of America. J Am Coll Cardiol 2017; 70: 776–803.

6. Chawla LS, Herzog CA, Costanzo MR, et al. Proposal for a functional classification system of heart failure in patients with end-stage renal disease: proceedings of the acute dialysis quality initiative (ADQI) XI workgroup. J Am Coll Cardiol 2014; 63: 1246–52.

7. Arnett DK, Blumenthal RS, Albert MA, et al. ACC/AHA guideline on the primary prevention of cardiovascular disease. Circulation 2019; 2019: CIR0000000000000678.

8. Banerjee A, Pasea L, Chung SC, et al. A population-based study of 92 clinically recognized risk factors for heart failure: co-occurrence, prognosis and preventive potential. Eur J Heart Fail 2022; 24: 466–80.

9. Banerjee A, Chen S, Fatemifar G, et al. Machine learning for subtype definition and risk prediction in heart failure, acute coronary syndromes and atrial fibrillation: systematic review of validity and clinical utility. BMC Med 2021; 19: 85.

Find this article online at Lancet Digit Health.

Find the heart failure cluster app here.

