2021-4850 - The YODA Project

                    array(45) {
  ["project_title"]=>
  string(138) "Cluster Analysis of Cardiovascular Phenotypes and SGLT2 Inhibition in Patients With Type 2 Diabetes and Established Cardiovascular Disease"
  ["project_narrative_summary"]=>
  string(668) "Diabetes is the 9th leading cause of death worldwide. Although it has been studied for several decades, it remains a strong risk factor for illnesses affecting the heart and kidneys, even when treated. Evidence suggests that this may be because several unknown types (or "compositions") of the disease exist. As a result, our study proposes using machine learning to uncover hidden compositions of the illness, which may have different prognoses and reactions to treatment.  These findings will help doctors provide personalized care by identifying patients who may need increased antidiabetic care, and likewise, patients who may benefit most from certain treatments."
  ["project_learn_source"]=>
  string(9) "colleague"
  ["project_learn_source_exp"]=>
  string(0) ""
  ["project_key_personnel"]=>
  array(5) {
    [0]=>
    array(6) {
      ["p_pers_f_name"]=>
      string(5) "Jiayi"
      ["p_pers_l_name"]=>
      string(2) "Ni"
      ["p_pers_degree"]=>
      string(13) "MA / MS / MSc"
      ["p_pers_pr_affil"]=>
      string(53) "Research Institute of McGill University Health Center"
      ["p_pers_scop_id"]=>
      string(0) ""
      ["requires_data_access"]=>
      string(2) "no"
    }
    [1]=>
    array(6) {
      ["p_pers_f_name"]=>
      string(6) "Thomas"
      ["p_pers_l_name"]=>
      string(10) "Mavrakanas"
      ["p_pers_degree"]=>
      string(20) "MD, MSc, FRCPC, FASN"
      ["p_pers_pr_affil"]=>
      string(17) "McGill University"
      ["p_pers_scop_id"]=>
      string(0) ""
      ["requires_data_access"]=>
      string(2) "no"
    }
    [2]=>
    array(6) {
      ["p_pers_f_name"]=>
      string(8) "Frederic"
      ["p_pers_l_name"]=>
      string(5) "Baroz"
      ["p_pers_degree"]=>
      string(2) "MD"
      ["p_pers_pr_affil"]=>
      string(11) "McGill MUHC"
      ["p_pers_scop_id"]=>
      string(0) ""
      ["requires_data_access"]=>
      string(2) "no"
    }
    [3]=>
    array(6) {
      ["p_pers_f_name"]=>
      string(8) "Philippe"
      ["p_pers_l_name"]=>
      string(7) "Boileau"
      ["p_pers_degree"]=>
      string(3) "PhD"
      ["p_pers_pr_affil"]=>
      string(17) "McGill University"
      ["p_pers_scop_id"]=>
      string(0) ""
      ["requires_data_access"]=>
      string(3) "yes"
    }
    [4]=>
    array(6) {
      ["p_pers_f_name"]=>
      string(6) "Wenxin"
      ["p_pers_l_name"]=>
      string(3) "Guo"
      ["p_pers_degree"]=>
      string(2) "MS"
      ["p_pers_pr_affil"]=>
      string(17) "McGill University"
      ["p_pers_scop_id"]=>
      string(0) ""
      ["requires_data_access"]=>
      string(3) "yes"
    }
  }
  ["project_ext_grants"]=>
  array(2) {
    ["value"]=>
    string(2) "no"
    ["label"]=>
    string(68) "No external grants or funds are being used to support this research."
  }
  ["project_funding_source"]=>
  string(0) ""
  ["project_assoc_trials"]=>
  array(3) {
    [0]=>
    object(WP_Post)#5549 (24) {
      ["ID"]=>
      int(1806)
      ["post_author"]=>
      string(4) "1363"
      ["post_date"]=>
      string(19) "2023-08-05 04:45:19"
      ["post_date_gmt"]=>
      string(19) "2023-08-05 04:45:19"
      ["post_content"]=>
      string(0) ""
      ["post_title"]=>
      string(195) "NCT01032629 - A Randomized, Multicenter, Double-Blind, Parallel, Placebo-Controlled Study of the Effects of JNJ-28431754 on Cardiovascular Outcomes in Adult Subjects With Type 2 Diabetes Mellitus"
      ["post_excerpt"]=>
      string(0) ""
      ["post_status"]=>
      string(7) "publish"
      ["comment_status"]=>
      string(6) "closed"
      ["ping_status"]=>
      string(6) "closed"
      ["post_password"]=>
      string(0) ""
      ["post_name"]=>
      string(189) "nct01032629-a-randomized-multicenter-double-blind-parallel-placebo-controlled-study-of-the-effects-of-jnj-28431754-on-cardiovascular-outcomes-in-adult-subjects-with-type-2-diabetes-mellitus"
      ["to_ping"]=>
      string(0) ""
      ["pinged"]=>
      string(0) ""
      ["post_modified"]=>
      string(19) "2025-05-13 14:18:55"
      ["post_modified_gmt"]=>
      string(19) "2025-05-13 18:18:55"
      ["post_content_filtered"]=>
      string(0) ""
      ["post_parent"]=>
      int(0)
      ["guid"]=>
      string(238) "https://dev-yoda.pantheonsite.io/clinical-trial/nct01032629-a-randomized-multicenter-double-blind-parallel-placebo-controlled-study-of-the-effects-of-jnj-28431754-on-cardiovascular-outcomes-in-adult-subjects-with-type-2-diabetes-mellitus/"
      ["menu_order"]=>
      int(0)
      ["post_type"]=>
      string(14) "clinical_trial"
      ["post_mime_type"]=>
      string(0) ""
      ["comment_count"]=>
      string(1) "0"
      ["filter"]=>
      string(3) "raw"
    }
    [1]=>
    object(WP_Post)#5551 (24) {
      ["ID"]=>
      int(1808)
      ["post_author"]=>
      string(4) "1363"
      ["post_date"]=>
      string(19) "2019-08-12 15:10:00"
      ["post_date_gmt"]=>
      string(19) "2019-08-12 15:10:00"
      ["post_content"]=>
      string(0) ""
      ["post_title"]=>
      string(188) "NCT01989754 - A Randomized, Multicenter, Double-Blind, Parallel, Placebo-Controlled Study of the Effects of Canagliflozin on Renal Endpoints in Adult Subjects With Type 2 Diabetes Mellitus"
      ["post_excerpt"]=>
      string(0) ""
      ["post_status"]=>
      string(7) "publish"
      ["comment_status"]=>
      string(6) "closed"
      ["ping_status"]=>
      string(6) "closed"
      ["post_password"]=>
      string(0) ""
      ["post_name"]=>
      string(182) "nct01989754-a-randomized-multicenter-double-blind-parallel-placebo-controlled-study-of-the-effects-of-canagliflozin-on-renal-endpoints-in-adult-subjects-with-type-2-diabetes-mellitus"
      ["to_ping"]=>
      string(0) ""
      ["pinged"]=>
      string(0) ""
      ["post_modified"]=>
      string(19) "2025-10-02 10:04:00"
      ["post_modified_gmt"]=>
      string(19) "2025-10-02 14:04:00"
      ["post_content_filtered"]=>
      string(0) ""
      ["post_parent"]=>
      int(0)
      ["guid"]=>
      string(231) "https://dev-yoda.pantheonsite.io/clinical-trial/nct01989754-a-randomized-multicenter-double-blind-parallel-placebo-controlled-study-of-the-effects-of-canagliflozin-on-renal-endpoints-in-adult-subjects-with-type-2-diabetes-mellitus/"
      ["menu_order"]=>
      int(0)
      ["post_type"]=>
      string(14) "clinical_trial"
      ["post_mime_type"]=>
      string(0) ""
      ["comment_count"]=>
      string(1) "0"
      ["filter"]=>
      string(3) "raw"
    }
    [2]=>
    object(WP_Post)#5550 (24) {
      ["ID"]=>
      int(1902)
      ["post_author"]=>
      string(4) "1363"
      ["post_date"]=>
      string(19) "2023-08-05 04:45:19"
      ["post_date_gmt"]=>
      string(19) "2023-08-05 04:45:19"
      ["post_content"]=>
      string(0) ""
      ["post_title"]=>
      string(229) "NCT02065791 - A Randomized, Double-blind, Event-driven, Placebo-controlled, Multicenter Study of the Effects of Canagliflozin on Renal and Cardiovascular Outcomes in Subjects With Type 2 Diabetes Mellitus and Diabetic Nephropathy"
      ["post_excerpt"]=>
      string(0) ""
      ["post_status"]=>
      string(7) "publish"
      ["comment_status"]=>
      string(6) "closed"
      ["ping_status"]=>
      string(6) "closed"
      ["post_password"]=>
      string(0) ""
      ["post_name"]=>
      string(194) "nct02065791-a-randomized-double-blind-event-driven-placebo-controlled-multicenter-study-of-the-effects-of-canagliflozin-on-renal-and-cardiovascular-outcomes-in-subjects-with-type-2-diabetes-mell"
      ["to_ping"]=>
      string(0) ""
      ["pinged"]=>
      string(0) ""
      ["post_modified"]=>
      string(19) "2025-10-28 13:13:55"
      ["post_modified_gmt"]=>
      string(19) "2025-10-28 17:13:55"
      ["post_content_filtered"]=>
      string(0) ""
      ["post_parent"]=>
      int(0)
      ["guid"]=>
      string(243) "https://dev-yoda.pantheonsite.io/clinical-trial/nct02065791-a-randomized-double-blind-event-driven-placebo-controlled-multicenter-study-of-the-effects-of-canagliflozin-on-renal-and-cardiovascular-outcomes-in-subjects-with-type-2-diabetes-mell/"
      ["menu_order"]=>
      int(0)
      ["post_type"]=>
      string(14) "clinical_trial"
      ["post_mime_type"]=>
      string(0) ""
      ["comment_count"]=>
      string(1) "0"
      ["filter"]=>
      string(3) "raw"
    }
  }
  ["project_date_type"]=>
  string(18) "full_crs_supp_docs"
  ["property_scientific_abstract"]=>
  string(1786) "Background In the CANVAS and CREDENCE trials,8,9 canagliflozin was associated with a 33% and 39% reduction in the incidence of heart failure hospitalization, respectively. However, whether differential treatment effects exist with canagliflozin remains unclear.

Objective As a result, this study aims to utilize race and latent class analysis to identify distinct clinical phenotypes in subjects with T2D and cardiovascular disease to elucidate potential differences in treatment effects across race and clinical phenotypes. 

Study Design Latent class analysis will be utilized to identify unobserved (or "latent") subclasses of individuals with T2D and cardiovascular disease. Cox proportional hazard regression models with interaction terms will be utilized to assess whether cluster membership is associated with a differential response to canagliflozin. Subgroup analyses according to race will also be done.

Participants The population of interest for the proposed analysis encompasses the entire patient populations enrolled in the CANVAS and CREDENCE trials. In analyses where differential survival according to cluster membership is evaluated, patients will be subdivided into their respective trial arms (i.e., canagliflozin, placebo). 

Main Outcomes The co-primary endpoints for the proposed analysis are time to heart failure hospitalization, and the composite of heart failure hospitalization or cardiovascular death. Secondary endpoints include cardiovascular death, nonfatal myocardial infarction, nonfatal stroke, cause specific mortality, all-cause mortality, and death from renal causes. 

Statistical Analysis We will assess proportional hazards assumptions, and survival models will be adjusted for baseline clinical characteristics."
  ["project_brief_bg"]=>
  string(2396) "Despite significant advances in the design of therapies for people with type 2 diabetes mellitus (T2DM), the disease continues to portend high rates of morbidity and mortality, even when traditional cardiovascular risk factors are well controlled.2 Like disease states such as heart failure and atherosclerosis, there is significant evidence that this may be because numerous pathophysiological phenotypes of the disease exist.3 Although diabetes has historically been diagnosed into two classes (i.e., type I and type II), these data suggest that our binary approach to treatment may not be sufficient for risk reduction. As a result, unsupervised learning (e.g., machine learning) algorithms such as latent class analyses have proliferated in the clinical literature to improve personalized care. 

As an alternative to traditional subgroup analysis, latent class analyses are a data-agnostic method for elucidating distinct clinical phenotypes and their associated response to treatment.4 This method has been utilized successfully in a variety of medical disciplines, including heart failure with reduced or preserved ejection fraction.5,6 More recently, cluster analysis was utilized retrospectively in the Empagliflozin, Cardiovascular Outcome, and Mortality in T2DM (EMPA-REG outcomes) trial, identifying a nonsignificant trend of greater benefit in one of three phenotypes as characterized by lower rates of heart failure hospitalization in young people with low comorbidity burden.7 Despite these findings, some uncertainty persists as relatively few patients experienced these endpoints. 

Thus, we propose conducting a pooled latent class analysis in patients from the CANVAS (Canagliflozin Cardiovascular Assessment Study)8 and CREDENCE (Canagliflozin and Renal Events in Diabetes with Established Nephropathy Clinical Evaluation)9 trials to assess the differential effects of SGLT2 inhibition on cardiovascular and renal outcomes in patients with T2DM and established cardiovascular disease. Such an analysis could aid in the identification of phenotypes (or "clusters") of people with T2DM who may respond most beneficially to SGLT2 inhibition. This could help researchers identify novel patient groups for investigating new glycemic and cardiovascular treatments, potentially saving valuable resources, and allowing health care providers to offer greater access to care."
  ["project_specific_aims"]=>
  string(1191) "Primary Aims:

1.	To evaluate if distinct phenotypes of patients with type 2 diabetes can be identified in the CANVAS and CREDENCE trials, through hierarchical latent class analysis.

2.	To evaluate if the race of patients with type 2 diabetes delineates a unique clinical phenotype with distinct differential outcomes and treatment responses.

Secondary Aims:

1.	To evaluate if these clusters have differential outcomes with regards to CANVAS and CREDENCE's endpoints (i.e., composite of cardiovascular death, nonfatal myocardial infarction, or nonfatal stroke; heart failure hospitalization; cardiovascular death or heart failure hospitalization; cardiovascular death; all-cause mortality; and death from renal causes).

2.	To evaluate if these clusters have differential treatment responses to canagliflozin with regards to CANVAS and CREDENCE's endpoints (i.e., composite of cardiovascular death, nonfatal myocardial infarction, or nonfatal stroke; heart failure hospitalization; cardiovascular death or heart failure hospitalization; cardiovascular death; all-cause mortality; and death from renal causes).

Tertiary Aims:

1. To evaluate"
  ["project_study_design"]=>
  string(0) ""
  ["project_study_design_exp"]=>
  string(0) ""
  ["project_purposes"]=>
  array(0) {
  }
  ["project_purposes_exp"]=>
  string(0) ""
  ["project_software_used"]=>
  array(0) {
  }
  ["project_software_used_exp"]=>
  string(0) ""
  ["project_research_methods"]=>
  string(674) "For primary aim #1, which assesses whether distinct phenotypes exist, all patients enrolled in the CANVAS and CREDENCE trials will be eligible, and clusters will be stratified according to randomization. Race stratification will also be done for primary aim #2. 

For secondary aim #1, which assesses whether clusters have differential outcomes, all patients enrolled in the CANVAS and CREDENCE trials will be eligible, and clusters will be stratified according to randomization.

For secondary aim #2, which assesses whether clusters have differential treatment responses, only the patients enrolled in CANVAS and CREDENCE's canagliflozin arm will be evaluated."
  ["project_main_outcome_measure"]=>
  string(431) "The co-primary endpoint of interest for our analysis will be time to heart failure hospitalization, and the composite of cardiovascular death or heart failure hospitalization as defined, respectively, in CANVAS and CREDENCE. 

The secondary endpoints of interest include nonfatal myocardial infarction, or nonfatal stroke; heart failure hospitalization; cardiovascular death; all-cause mortality; and death from renal causes."
  ["project_main_predictor_indep"]=>
  string(483) "The main predictor/independent variable for all analyses will be the clusters identified with the latent class analysis algorithm. As defined in the "Statistical Analysis Plan", the number of clusters will be selected based on clinical significance (i.e., whether the clusters are clinically distinct in terms of baseline characteristics), the Bayesian information criterion, and the size of the smallest cluster. These criteria have been empirically shown to yield the best results.15"
  ["project_other_variables_interest"]=>
  string(1197) "Common variables/risk factors available across the CANVAS and CREDENCE trials will be utilized for the latent class analysis as well as any regression models evaluating differential prognosis and treatment response. For the latent class analysis, all baseline variables/risk factors will be put through an oblique principal component analysis (PROC VARCLUS procedure in SAS)12 to normalize and reduce the dimensionality of the data to meet the presuppositions of latent class analysis.13 That is, the SAS procedure will aggregate all available variables into several non-overlapping clusters, which are defined by a summary score characterized by a linear combination of the variables from each patient. (The coefficients of the variable cluster summary score are identified by the first principal component of the variable cluster, which is to be done separately for continuous and categorical variables.) For Cox proportional hazards regression models, the variables of interest (i.e., baseline clinical, vital, and laboratory characteristics) will be unaltered and defined per the criteria in CANVAS and CREDENCE. Where non-overlapping definitions are observed, clinical experts will be consulted."
  ["project_stat_analysis_plan"]=>
  string(3888) "Clinical Variable Selection and Data Cleaning

Prior to beginning the analysis, baseline clinical variables will be jointly examined by two study investigators (AS, AR) to assess overlap between CANVAS and CREDENCE. In addition, we will evaluate and remove variables that are highly collinear and/or have limited clinical availability or relevance. We will also remove any variables that are coded as positive in less than 10% of cases as these have been shown to negatively affect patient clustering.13 Following initial variable assessment, multiple imputation (n=5) with the Markov chain Monte Carlo method will be performed if variables have a moderate proportion of missing data (e.g., 40% of data are missing) or if the "missing not at random" assumption is plausible, only complete cases will be used.16   
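
As an illustration only, the 10% prevalence rule described above could be sketched as follows in Python. The function name, the variable names, and the toy records are hypothetical and are not part of the proposal; the sketch assumes each candidate variable is coded 0/1 and that missing values are stored as None.

```python
from typing import Dict, List, Optional


def keep_common_binary(rows: List[Dict[str, Optional[int]]],
                       min_prevalence: float = 0.10) -> List[str]:
    """Return the 0/1 variables coded positive in at least `min_prevalence`
    of the non-missing records; rarer variables are dropped."""
    names = sorted({name for row in rows for name in row})
    kept = []
    for name in names:
        values = [row[name] for row in rows if row.get(name) is not None]
        if not values:
            continue  # nothing observed, so the variable cannot be assessed
        if sum(values) / len(values) >= min_prevalence:
            kept.append(name)
    return kept


# Hypothetical toy data: 20 patients, two binary baseline variables.
rows = (
    [{"prior_mi": 1, "amputation": 0} for _ in range(3)]
    + [{"prior_mi": 0, "amputation": 0} for _ in range(16)]
    + [{"prior_mi": 0, "amputation": 1}]
)
print(keep_common_binary(rows))  # prior_mi is 15% positive, amputation 5%
```

Here the prevalence is computed among non-missing records, which is one possible reading of "cases"; the proposal does not specify the denominator.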

Data Preparation for Latent Class Analysis

Following clinical variable selection and data cleaning, dimension reduction will be performed on the covariate list. This will be done through independent oblique principal component analysis with the SAS PROC VARCLUS procedure.12,17 In brief, the PROC VARCLUS procedure is an iterative variable clustering process that continuously divides a set of variables into disjoint clusters. That is, "clusters that are as correlated as possible among themselves and as uncorrelated as possible with variables in other clusters".12,17 This process will be applied to categorical and continuous variables independently. To determine the appropriate number of variable clusters, we will iteratively evaluate the second eigenvalue of each covariate group. A stopping rule (eigenvalue threshold of 0.7) will be used as suggested by Jackson and colleagues.18 Each patient's covariate cluster will ultimately be defined by a normalized principal component summary score (i.e., a linear combination of variables) which will be used in the latent class analysis. Analogously, each patient will be defined by a matrix of covariate summary scores allowing for their clustering. 

Patient Clustering and Latent Class Analysis

Following data preparation, we will subsequently identify the latent clusters of individuals. This will be done utilizing the "poLCA" package in R, which utilizes a latent class analysis algorithm.14 In brief, latent class analysis is a statistical model that classifies individuals into mutually exclusive (and exhaustive) clusters based on their observed set of measured characteristics. These clusters will be derived using a maximum likelihood estimation. To avoid finding a local maximum of the log-likelihood function, the model will be estimated 10 times to automate the search for the global maximum. To derive the optimal number of clusters or subgroups, we will evaluate the first minimum of the Bayesian information criterion, the size of the smallest class, and the clinical relevance of defined groups.15 We will use an a priori criterion of at least 200 patients per cluster to promote stability of effect estimates.
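
The selection rule above can be sketched in Python under stated assumptions. The names `Fit` and `select_n_classes` and all numbers in the toy example are hypothetical; the log-likelihood of each candidate model is assumed to be the best value found across its repeated estimations, and the clinical-relevance criterion is left to expert judgment rather than coded.

```python
import math
from typing import List, NamedTuple, Optional


class Fit(NamedTuple):
    n_classes: int
    log_likelihood: float  # best value across the repeated random starts
    n_params: int          # number of free parameters in the model
    smallest_class: int    # patients in the smallest latent class


def bic(fit: Fit, n_obs: int) -> float:
    """Bayesian information criterion: -2 log L + k ln(n)."""
    return -2.0 * fit.log_likelihood + fit.n_params * math.log(n_obs)


def select_n_classes(fits: List[Fit], n_obs: int,
                     min_class_size: int = 200) -> Optional[int]:
    """Pick the first BIC minimum among models whose smallest class
    meets the a priori size criterion; None if no model qualifies."""
    eligible = sorted(
        (f for f in fits if f.smallest_class >= min_class_size),
        key=lambda f: f.n_classes,
    )
    if not eligible:
        return None
    for current, following in zip(eligible, eligible[1:]):
        if bic(following, n_obs) > bic(current, n_obs):
            return current.n_classes  # BIC starts rising: first minimum
    return eligible[-1].n_classes     # BIC never rose in the range tried


# Hypothetical fits for 2 to 5 classes on 10,000 patients.
fits = [
    Fit(2, -52000.0, 21, 3100),
    Fit(3, -51500.0, 32, 1400),
    Fit(4, -51480.0, 43, 600),
    Fit(5, -51300.0, 54, 150),  # excluded: smallest class under 200
]
print(select_n_classes(fits, n_obs=10000))
```

In this toy example the five-class model is excluded by the size criterion, and the BIC rises from three to four classes, so three classes would be carried forward for clinical review.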

Inferential Statistics

We will first assess the association between cluster membership and clinical outcomes using Cox proportional hazards regression. Proportional hazards assumptions will be assessed by evaluating scaled Schoenfeld residuals. Kaplan-Meier estimated mortality and cause-specific event rates will be plotted according to treatment allocation. Using interaction terms in a Cox regression model, we will also evaluate whether cluster membership is associated with a differential response to randomized therapy. Results will be reported for clinical outcomes as hazard ratios (HR) and 95% confidence intervals (CI) for each cluster in comparison to a reference cluster, and an overall p-value will be provided to assess the relationship between cluster and outcome. In models with interactions, an interaction p-value will be provided. P"
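
For reference, one common way to write a Cox model with a treatment-by-cluster interaction is given below. The notation is an assumption for illustration and does not appear in the proposal: T_i is the randomized arm (1 = canagliflozin, 0 = placebo), C_{ik} indicates membership of patient i in cluster k with cluster 1 as the reference, and x_i is the vector of baseline adjustment covariates.

```latex
h_i(t) = h_0(t)\,\exp\!\Big(
    \beta_T T_i
  + \sum_{k=2}^{K} \beta_k C_{ik}
  + \sum_{k=2}^{K} \gamma_k \, T_i C_{ik}
  + \boldsymbol{\delta}^{\top}\mathbf{x}_i
\Big)

% Treatment hazard ratio within each cluster:
\mathrm{HR}_{1} = \exp(\beta_T), \qquad
\mathrm{HR}_{k} = \exp(\beta_T + \gamma_k), \quad k = 2,\dots,K

% Interaction test (differential response to randomized therapy):
H_0 : \gamma_2 = \dots = \gamma_K = 0
```

Under this parameterization, the interaction p-value would correspond to a joint test of the gamma coefficients, for example by a likelihood ratio test against the model without interaction terms.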
  ["project_timeline"]=>
  string(1214) "We anticipate completing all study milestones within approximately 8 months, as shown in detail in the Gantt chart (supplemental material). If provided the data, we would start the project on January 15th, 2022. All analyses, including consultations with cardiologists and nephrologists, would be completed in approximately the first 12 weeks. Then, we anticipate spending 4 weeks to write the initial draft of the manuscript and spending another 4 weeks inviting co-authors to help edit the text. The estimated time of completion of the finalized manuscript would thus be June 15th, 2022, at the latest. We would then develop a journal submission plan, which would include general and specialty medical journals including, but not limited to, the Journal of the American Medical Association (JAMA), the European Heart Journal (EHJ), Circulation, and the American Heart Journal. During this time, we would contemporaneously prepare abstracts for major conferences like the American College of Cardiology (ACC) and the Global Cardiovascular Clinical Trialists Forum (CVCT). Once the manuscript is accepted for publication, we would report our results back to the YODA Project by August 15th, 2022, at the latest."
  ["project_dissemination_plan"]=>
  string(897) "Our knowledge translation and dissemination plan is a critical aspect of our proposal as it will ensure that results impact health service delivery, clinical care, and future development of novel therapies. As described above, we plan to write manuscript(s) for peer-reviewed medical journals, which we will widely distribute through academic social media and national or international conferences. We would target general and specialty medical journals including, but not limited to, the Journal of the American Medical Association (JAMA), the European Heart Journal (EHJ), Circulation, the Canadian Journal of Cardiology, and the American Heart Journal. During this time, we would contemporaneously prepare abstracts for major scientific conferences such as the American College of Cardiology (ACC) Annual Scientific Session & Expo, and the Global Cardiovascular Clinical Trialists Forum (CVCT)."
  ["project_bibliography"]=>
  string(3533) "1. World Health Organization. The top 10 causes of death. Accessed June 23, 2021. https://www.who.int/news-room/fact-sheets/detail/the-top-10-causes-of-death

2. Rawshani A, Rawshani A, Franzen S, et al. Risk Factors, Mortality, and Cardiovascular Outcomes in Patients with Type 2 Diabetes. New England Journal of Medicine. 2018;379(7):633-644. doi:10.1056/NEJMoa1800256

3. Gouda P, Zheng S, Peters T, et al. Clinical Phenotypes in Patients With Type 2 Diabetes Mellitus: Characteristics, Cardiovascular Outcomes and Treatment Strategies. Curr Heart Fail Rep. 2021;18(5):253-263. doi:10.1007/s11897-021-00527-w

4. Rindskopf D, Rindskopf W. The value of latent class analysis in medical diagnosis. Statistics in medicine. 1986;5(1):21-27.

5. Ahmad T, Pencina MJ, Schulte PJ, et al. Clinical implications of chronic heart failure phenotypes defined by cluster analysis. J Am Coll Cardiol. 2014;64(17):1765-1774. doi:10.1016/j.jacc.2014.07.979

6. Calfee CS, Delucchi K, Parsons PE, Thompson BT, Ware LB, Matthay MA. Latent Class Analysis of ARDS Subphenotypes: Analysis of Data From Two Randomized Controlled Trials. Lancet Respir Med. 2014;2(8):611-620. doi:10.1016/S2213-2600(14)70097-9

7. Sharma A, Ofstad AP, Ahmad T, et al. Patient Phenotypes and SGLT-2 Inhibition in Type 2 Diabetes. JACC: Heart Failure. 2021;9(8):568-577. doi:10.1016/j.jchf.2021.03.003

8. Neal B, Perkovic V, Mahaffey KW, et al. Canagliflozin and cardiovascular and renal events in type 2 diabetes. New England Journal of Medicine. 2017;377(7):644-657.

9. Perkovic V, Jardine MJ, Neal B, et al. Canagliflozin and Renal Outcomes in Type 2 Diabetes and Nephropathy. New England Journal of Medicine. 2019;380(24):2295-2306. doi:10.1056/NEJMoa1811744

10. Hagenaars JA, McCutcheon AL. Applied Latent Class Analysis. Cambridge University Press; 2002. Accessed October 8, 2021. http://ebookcentral.proquest.com/lib/mcgill/detail.action?docID=217833

11. Linda M. Collins, Stephanie T. Lanza. Latent Class and Latent Transition Analysis. 1st ed. John Wiley & Sons, Ltd; 2009. doi:10.1002/9780470567333

12. SAS Help Center: Overview: VARCLUS Procedure. Accessed October 12, 2021. https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.3/statug/statug_var…

13. Sinha P, Calfee CS, Delucchi KL. Practitioner's Guide to Latent Class Analysis: Methodological Considerations and Common Pitfalls. Crit Care Med. 2021;49(1):e63-e79. doi:10.1097/CCM.0000000000004710

14. Linzer DA, Lewis JB. poLCA: An R package for polytomous variable latent class analysis. Journal of statistical software. 2011;42(1):1-29.

15. Nylund KL, Asparouhov T, Muthén BO. Deciding on the Number of Classes in Latent Class Analysis and Growth Mixture Modeling: A Monte Carlo Simulation Study. Structural Equation Modeling: A Multidisciplinary Journal. 2007;14(4):535-569. doi:10.1080/10705510701575396

16. Jakobsen JC, Gluud C, Wetterslev J, Winkel P. When and how should multiple imputation be used for handling missing data in randomised clinical trials - a practical guide with flowcharts. BMC Med Res Methodol. 2017;17(1):1-10. doi:10.1186/s12874-017-0442-1

17. Nelson BD. Variable Reduction for Modeling using PROC VARCLUS. Data Analysis.:3.

18. Jackson JE, Edward A. User's guide to principal components. John Wiley & Sons, Inc, New York. Published online 1991:40.

19. Clark SL, Muthén B. Relating Latent Class Analysis Results to Variables Not Included in the Analysis.; 2009.
"
  ["project_suppl_material"]=>
  bool(false)
  ["project_coi"]=>
  array(6) {
    [0]=>
    array(1) {
      ["file_coi"]=>
      array(21) {
        ["ID"]=>
        int(18634)
        ["id"]=>
        int(18634)
        ["title"]=>
        string(16) "yoda_coi_form_as"
        ["filename"]=>
        string(20) "yoda_coi_form_as.pdf"
        ["filesize"]=>
        int(157819)
        ["url"]=>
        string(69) "https://yoda.yale.edu/wp-content/uploads/2021/12/yoda_coi_form_as.pdf"
        ["link"]=>
        string(64) "https://yoda.yale.edu/data-request/2021-4850/yoda_coi_form_as-2/"
        ["alt"]=>
        string(0) ""
        ["author"]=>
        string(4) "1885"
        ["description"]=>
        string(0) ""
        ["caption"]=>
        string(0) ""
        ["name"]=>
        string(18) "yoda_coi_form_as-2"
        ["status"]=>
        string(7) "inherit"
        ["uploaded_to"]=>
        int(5677)
        ["date"]=>
        string(19) "2026-01-21 21:33:16"
        ["modified"]=>
        string(19) "2026-01-21 21:33:16"
        ["menu_order"]=>
        int(0)
        ["mime_type"]=>
        string(15) "application/pdf"
        ["type"]=>
        string(11) "application"
        ["subtype"]=>
        string(3) "pdf"
        ["icon"]=>
        string(62) "https://yoda.yale.edu/wp/wp-includes/images/media/document.png"
      }
    }
    [1]=>
    array(1) {
      ["file_coi"]=>
      array(21) {
        ["ID"]=>
        int(8838)
        ["id"]=>
        int(8838)
        ["title"]=>
        string(11) "coi_form_jn"
        ["filename"]=>
        string(15) "coi_form_jn.pdf"
        ["filesize"]=>
        int(19986)
        ["url"]=>
        string(64) "https://yoda.yale.edu/wp-content/uploads/2020/04/coi_form_jn.pdf"
        ["link"]=>
        string(57) "https://yoda.yale.edu/data-request/2021-4850/coi_form_jn/"
        ["alt"]=>
        string(0) ""
        ["author"]=>
        string(4) "1363"
        ["description"]=>
        string(0) ""
        ["caption"]=>
        string(0) ""
        ["name"]=>
        string(11) "coi_form_jn"
        ["status"]=>
        string(7) "inherit"
        ["uploaded_to"]=>
        int(5677)
        ["date"]=>
        string(19) "2023-07-31 15:12:27"
        ["modified"]=>
        string(19) "2023-08-01 00:59:10"
        ["menu_order"]=>
        int(0)
        ["mime_type"]=>
        string(15) "application/pdf"
        ["type"]=>
        string(11) "application"
        ["subtype"]=>
        string(3) "pdf"
        ["icon"]=>
        string(62) "https://yoda.yale.edu/wp/wp-includes/images/media/document.png"
      }
    }
    [2]=>
    array(1) {
      ["file_coi"]=>
      array(21) {
        ["ID"]=>
        int(13246)
        ["id"]=>
        int(13246)
        ["title"]=>
        string(11) "coi form FB"
        ["filename"]=>
        string(15) "coi-form-FB.pdf"
        ["filesize"]=>
        int(20193)
        ["url"]=>
        string(64) "https://yoda.yale.edu/wp-content/uploads/2019/12/coi-form-FB.pdf"
        ["link"]=>
        string(57) "https://yoda.yale.edu/data-request/2021-4850/coi-form-fb/"
        ["alt"]=>
        string(0) ""
        ["author"]=>
        string(2) "20"
        ["description"]=>
        string(0) ""
        ["caption"]=>
        string(0) ""
        ["name"]=>
        string(11) "coi-form-fb"
        ["status"]=>
        string(7) "inherit"
        ["uploaded_to"]=>
        int(5677)
        ["date"]=>
        string(19) "2023-08-22 21:04:35"
        ["modified"]=>
        string(19) "2023-08-22 21:05:04"
        ["menu_order"]=>
        int(0)
        ["mime_type"]=>
        string(15) "application/pdf"
        ["type"]=>
        string(11) "application"
        ["subtype"]=>
        string(3) "pdf"
        ["icon"]=>
        string(62) "https://yoda.yale.edu/wp/wp-includes/images/media/document.png"
      }
    }
    [3]=>
    array(1) {
      ["file_coi"]=>
      array(21) {
        ["ID"]=>
        int(17397)
        ["id"]=>
        int(17397)
        ["title"]=>
        string(11) "coi form PB"
        ["filename"]=>
        string(15) "coi-form-PB.pdf"
        ["filesize"]=>
        int(20142)
        ["url"]=>
        string(64) "https://yoda.yale.edu/wp-content/uploads/2021/12/coi-form-PB.pdf"
        ["link"]=>
        string(57) "https://yoda.yale.edu/data-request/2021-4850/coi-form-pb/"
        ["alt"]=>
        string(0) ""
        ["author"]=>
        string(4) "1885"
        ["description"]=>
        string(0) ""
        ["caption"]=>
        string(0) ""
        ["name"]=>
        string(11) "coi-form-pb"
        ["status"]=>
        string(7) "inherit"
        ["uploaded_to"]=>
        int(5677)
        ["date"]=>
        string(19) "2025-05-28 19:37:16"
        ["modified"]=>
        string(19) "2025-05-28 19:37:16"
        ["menu_order"]=>
        int(0)
        ["mime_type"]=>
        string(15) "application/pdf"
        ["type"]=>
        string(11) "application"
        ["subtype"]=>
        string(3) "pdf"
        ["icon"]=>
        string(62) "https://yoda.yale.edu/wp/wp-includes/images/media/document.png"
      }
    }
    [4]=>
    array(1) {
      ["file_coi"]=>
      array(21) {
        ["ID"]=>
        int(18635)
        ["id"]=>
        int(18635)
        ["title"]=>
        string(11) "COI FORM TM"
        ["filename"]=>
        string(15) "COI-FORM-TM.pdf"
        ["filesize"]=>
        int(36965)
        ["url"]=>
        string(64) "https://yoda.yale.edu/wp-content/uploads/2021/12/COI-FORM-TM.pdf"
        ["link"]=>
        string(57) "https://yoda.yale.edu/data-request/2021-4850/coi-form-tm/"
        ["alt"]=>
        string(0) ""
        ["author"]=>
        string(4) "1885"
        ["description"]=>
        string(0) ""
        ["caption"]=>
        string(0) ""
        ["name"]=>
        string(11) "coi-form-tm"
        ["status"]=>
        string(7) "inherit"
        ["uploaded_to"]=>
        int(5677)
        ["date"]=>
        string(19) "2026-01-21 21:33:49"
        ["modified"]=>
        string(19) "2026-01-21 21:33:49"
        ["menu_order"]=>
        int(0)
        ["mime_type"]=>
        string(15) "application/pdf"
        ["type"]=>
        string(11) "application"
        ["subtype"]=>
        string(3) "pdf"
        ["icon"]=>
        string(62) "https://yoda.yale.edu/wp/wp-includes/images/media/document.png"
      }
    }
    [5]=>
    array(1) {
      ["file_coi"]=>
      array(21) {
        ["ID"]=>
        int(18636)
        ["id"]=>
        int(18636)
        ["title"]=>
        string(11) "COI FORM WG"
        ["filename"]=>
        string(15) "COI-FORM-WG.pdf"
        ["filesize"]=>
        int(18409)
        ["url"]=>
        string(64) "https://yoda.yale.edu/wp-content/uploads/2021/12/COI-FORM-WG.pdf"
        ["link"]=>
        string(57) "https://yoda.yale.edu/data-request/2021-4850/coi-form-wg/"
        ["alt"]=>
        string(0) ""
        ["author"]=>
        string(4) "1885"
        ["description"]=>
        string(0) ""
        ["caption"]=>
        string(0) ""
        ["name"]=>
        string(11) "coi-form-wg"
        ["status"]=>
        string(7) "inherit"
        ["uploaded_to"]=>
        int(5677)
        ["date"]=>
        string(19) "2026-01-21 21:34:02"
        ["modified"]=>
        string(19) "2026-01-21 21:34:02"
        ["menu_order"]=>
        int(0)
        ["mime_type"]=>
        string(15) "application/pdf"
        ["type"]=>
        string(11) "application"
        ["subtype"]=>
        string(3) "pdf"
        ["icon"]=>
        string(62) "https://yoda.yale.edu/wp/wp-includes/images/media/document.png"
      }
    }
  }
  ["data_use_agreement_training"]=>
  bool(true)
  ["certification"]=>
  bool(true)
  ["project_send_email_updates"]=>
  bool(true)
  ["project_status"]=>
  string(9) "published"
  ["project_publ_available"]=>
  bool(true)
  ["project_year_access"]=>
  string(4) "2022"
  ["project_rep_publ"]=>
  array(1) {
    [0]=>
    array(2) {
      ["publication_link"]=>
      array(3) {
        ["title"]=>
        string(24) "Diabetes Obes Metab 2024"
        ["url"]=>
        string(41) "https://pubmed.ncbi.nlm.nih.gov/39301712/"
        ["target"]=>
        string(6) "_blank"
      }
      ["publication_doi"]=>
      string(17) "10.1111/dom.15768"
    }
  }
  ["project_assoc_data"]=>
  array(1) {
    [0]=>
    string(8) "data_res"
  }
  ["project_due_dil_assessment"]=>
  array(21) {
    ["ID"]=>
    int(13343)
    ["id"]=>
    int(13343)
    ["title"]=>
    string(51) "YODA Project Due Diligence Assessment 2021-4850 (2)"
    ["filename"]=>
    string(53) "YODA-Project-Due-Diligence-Assessment-2021-4850-2.pdf"
    ["filesize"]=>
    int(93189)
    ["url"]=>
    string(102) "https://yoda.yale.edu/wp-content/uploads/2023/08/YODA-Project-Due-Diligence-Assessment-2021-4850-2.pdf"
    ["link"]=>
    string(95) "https://yoda.yale.edu/data-request/2021-4850/yoda-project-due-diligence-assessment-2021-4850-2/"
    ["alt"]=>
    string(0) ""
    ["author"]=>
    string(2) "20"
    ["description"]=>
    string(0) ""
    ["caption"]=>
    string(0) ""
    ["name"]=>
    string(49) "yoda-project-due-diligence-assessment-2021-4850-2"
    ["status"]=>
    string(7) "inherit"
    ["uploaded_to"]=>
    int(5677)
    ["date"]=>
    string(19) "2023-08-30 17:10:09"
    ["modified"]=>
    string(19) "2023-08-30 17:10:19"
    ["menu_order"]=>
    int(0)
    ["mime_type"]=>
    string(15) "application/pdf"
    ["type"]=>
    string(11) "application"
    ["subtype"]=>
    string(3) "pdf"
    ["icon"]=>
    string(62) "https://yoda.yale.edu/wp/wp-includes/images/media/document.png"
  }
  ["project_title_link"]=>
  array(21) {
    ["ID"]=>
    int(18637)
    ["id"]=>
    int(18637)
    ["title"]=>
    string(42) "YODA Project Protocol 2021-4850 - 26-01-21"
    ["filename"]=>
    string(44) "YODA-Project-Protocol-2021-4850-26-01-21.pdf"
    ["filesize"]=>
    int(134803)
    ["url"]=>
    string(93) "https://yoda.yale.edu/wp-content/uploads/2021/12/YODA-Project-Protocol-2021-4850-26-01-21.pdf"
    ["link"]=>
    string(86) "https://yoda.yale.edu/data-request/2021-4850/yoda-project-protocol-2021-4850-26-01-21/"
    ["alt"]=>
    string(0) ""
    ["author"]=>
    string(4) "1885"
    ["description"]=>
    string(0) ""
    ["caption"]=>
    string(0) ""
    ["name"]=>
    string(40) "yoda-project-protocol-2021-4850-26-01-21"
    ["status"]=>
    string(7) "inherit"
    ["uploaded_to"]=>
    int(5677)
    ["date"]=>
    string(19) "2026-01-21 21:36:07"
    ["modified"]=>
    string(19) "2026-01-21 21:36:07"
    ["menu_order"]=>
    int(0)
    ["mime_type"]=>
    string(15) "application/pdf"
    ["type"]=>
    string(11) "application"
    ["subtype"]=>
    string(3) "pdf"
    ["icon"]=>
    string(62) "https://yoda.yale.edu/wp/wp-includes/images/media/document.png"
  }
  ["project_review_link"]=>
  array(21) {
    ["ID"]=>
    int(10802)
    ["id"]=>
    int(10802)
    ["title"]=>
    string(35) "yoda_project_review_-2021-4850_site"
    ["filename"]=>
    string(39) "yoda_project_review_-2021-4850_site.pdf"
    ["filesize"]=>
    int(1318275)
    ["url"]=>
    string(88) "https://yoda.yale.edu/wp-content/uploads/2023/08/yoda_project_review_-2021-4850_site.pdf"
    ["link"]=>
    string(81) "https://yoda.yale.edu/data-request/2021-4850/yoda_project_review_-2021-4850_site/"
    ["alt"]=>
    string(0) ""
    ["author"]=>
    string(4) "1363"
    ["description"]=>
    string(0) ""
    ["caption"]=>
    string(0) ""
    ["name"]=>
    string(35) "yoda_project_review_-2021-4850_site"
    ["status"]=>
    string(7) "inherit"
    ["uploaded_to"]=>
    int(5677)
    ["date"]=>
    string(19) "2023-08-09 17:12:38"
    ["modified"]=>
    string(19) "2023-08-09 19:13:30"
    ["menu_order"]=>
    int(0)
    ["mime_type"]=>
    string(15) "application/pdf"
    ["type"]=>
    string(11) "application"
    ["subtype"]=>
    string(3) "pdf"
    ["icon"]=>
    string(62) "https://yoda.yale.edu/wp/wp-includes/images/media/document.png"
  }
  ["project_highlight_button"]=>
  array(3) {
    ["title"]=>
    string(22) "Publication Available!"
    ["url"]=>
    string(41) "https://pubmed.ncbi.nlm.nih.gov/39301712/"
    ["target"]=>
    string(6) "_blank"
  }
  ["request_data_partner"]=>
  string(15) "johnson-johnson"
  ["search_order"]=>
  string(5) "-7990"
  ["principal_investigator"]=>
  array(7) {
    ["first_name"]=>
    string(7) "Abhinav"
    ["last_name"]=>
    string(6) "Sharma"
    ["degree"]=>
    string(7) "MD, PhD"
    ["primary_affiliation"]=>
    string(17) "McGill University"
    ["email"]=>
    string(29) "frederic.baroz@mail.mcgill.ca"
    ["state_or_province"]=>
    string(6) "Quebec"
    ["country"]=>
    string(6) "Canada"
  }
  ["human_research_protection_training"]=>
  bool(false)
  ["request_overridden_res"]=>
  string(1) "1"
}
General Information

How did you learn about the YODA Project?: Colleague

Conflict of Interest

Request Clinical Trials

Associated Trial(s):

What type of data are you looking for?: Individual Participant-Level Data, which includes Full CSR and all supporting documentation

Data Request Status

Status: Published

Research Proposal

Project Title: Cluster Analysis of Cardiovascular Phenotypes and SGLT2 Inhibition in Patients With Type 2 Diabetes and Established Cardiovascular Disease

Scientific Abstract: Background: In the CANVAS and CREDENCE trials,8,9 canagliflozin was associated with a 33% and 39% reduction in the incidence of heart failure hospitalization, respectively. However, whether differential treatment effects exist with canagliflozin remains unclear.
Objective: This study aims to use race and latent class analysis to identify distinct clinical phenotypes in subjects with T2D and cardiovascular disease, and to elucidate potential differences in treatment effects across race and clinical phenotypes.
Study Design: Latent class analysis will be used to identify unobserved (or "latent") subclasses of individuals with T2D and cardiovascular disease. Cox proportional hazards regression models with interaction terms will be used to assess whether cluster membership is associated with a differential response to canagliflozin. Subgroup analyses according to race will also be done.
Participants: The population of interest for the proposed analysis encompasses the entire patient populations enrolled in the CANVAS and CREDENCE trials. In analyses where differential survival according to cluster membership is evaluated, patients will be subdivided into their respective trial arms (i.e., canagliflozin, placebo).
Main Outcomes: The co-primary endpoints for the proposed analysis are time to heart failure hospitalization, and the composite of heart failure hospitalization or cardiovascular death. Secondary endpoints include cardiovascular death, nonfatal myocardial infarction, nonfatal stroke, cause-specific mortality, all-cause mortality, and death from renal causes.
Statistical Analysis: We will assess proportional hazards assumptions, and survival models will be adjusted for baseline clinical characteristics.

Brief Project Background and Statement of Project Significance: Despite significant advances in the design of therapies for people with type 2 diabetes mellitus (T2DM), the disease continues to portend high rates of morbidity and mortality, even when traditional cardiovascular risk factors are well controlled.2 As with disease states such as heart failure and atherosclerosis, there is significant evidence that this may be because numerous pathophysiological phenotypes of the disease exist.3 Although diabetes has historically been classified into two types (i.e., type I and type II), these data suggest that this binary approach to treatment may not be sufficient for risk reduction. As a result, unsupervised machine learning methods such as latent class analysis have proliferated in the clinical literature to improve personalized care.
As an alternative to traditional subgroup analysis, latent class analysis is a data-agnostic method for elucidating distinct clinical phenotypes and their associated response to treatment.4 This method has been used successfully in a variety of medical disciplines, including heart failure with reduced or preserved ejection fraction.5,6 More recently, cluster analysis was applied retrospectively in the Empagliflozin, Cardiovascular Outcome, and Mortality in T2DM (EMPA-REG outcomes) trial, identifying a nonsignificant trend of greater benefit in one of three phenotypes, characterized by lower rates of heart failure hospitalization in young people with low comorbidity burden.7 Despite these findings, some uncertainty persists as relatively few patients experienced these endpoints.
Thus, we propose conducting a pooled latent class analysis in patients from the CANVAS (Canagliflozin Cardiovascular Assessment Study)8 and CREDENCE (Canagliflozin and Renal Events in Diabetes with Established Nephropathy Clinical Evaluation)9 trials to assess the differential effects of SGLT2 inhibition on cardiovascular and renal outcomes in patients with T2DM and established cardiovascular disease. Such an analysis could aid in the identification of phenotypes (or "clusters") of people with T2DM who may respond most beneficially to SGLT2 inhibition. This could help researchers identify novel patient groups for investigating new glycemic and cardiovascular treatments, potentially saving valuable resources, and allowing health care providers to offer greater access to care.

Specific Aims of the Project: Primary Aim:
1. To evaluate if distinct phenotypes of patients with type 2 diabetes can be identified in the CANVAS and CREDENCE trials, through hierarchical latent class analysis.
2. To evaluate if the race of patients with type 2 diabetes delineates a unique clinical phenotype with distinct differential outcomes and treatment responses.
Secondary Aims:
1. To evaluate if these clusters have differential outcomes with regard to the endpoints of CANVAS and CREDENCE (i.e., composite of cardiovascular death, nonfatal myocardial infarction, or nonfatal stroke; heart failure hospitalization; cardiovascular death or heart failure hospitalization; cardiovascular death; all-cause mortality; and death from renal causes).
2. To evaluate if these clusters have differential treatment responses to canagliflozin with regard to the endpoints of CANVAS and CREDENCE (i.e., composite of cardiovascular death, nonfatal myocardial infarction, or nonfatal stroke; heart failure hospitalization; cardiovascular death or heart failure hospitalization; cardiovascular death; all-cause mortality; and death from renal causes).
Tertiary Aims:
1. To evaluate

Study Design:

What is the purpose of the analysis being proposed? Please select all that apply.:

Software Used:

Data Source and Inclusion/Exclusion Criteria to be used to define the patient sample for your study: For primary aim #1, which assesses whether distinct phenotypes exist, all patients enrolled in the CANVAS and CREDENCE trials will be eligible, and clusters will be stratified according to randomization. Race stratification will also be done for primary aim #2.
For secondary aim #1, which assesses whether clusters have differential outcomes, all patients enrolled in the CANVAS and CREDENCE trials will be eligible, and clusters will be stratified according to randomization.
For secondary aim #2, which assesses whether clusters have differential treatment responses, only the patients enrolled in the canagliflozin arms of CANVAS and CREDENCE will be evaluated.

Primary and Secondary Outcome Measure(s) and how they will be categorized/defined for your study: The co-primary endpoints of interest for our analysis will be time to heart failure hospitalization and the composite of cardiovascular death or heart failure hospitalization, as defined in CANVAS and CREDENCE, respectively.
The secondary endpoints of interest include nonfatal myocardial infarction, or nonfatal stroke; heart failure hospitalization; cardiovascular death; all-cause mortality; and death from renal causes.

Main Predictor/Independent Variable and how it will be categorized/defined for your study: The main predictor/independent variable for all analyses will be the clusters identified with the latent class analysis algorithm. As defined in the "Statistical Analysis Plan", the number of clusters will be selected based on clinical significance (i.e., whether the clusters are clinically distinct in terms of baseline characteristics), the Bayesian information criterion, and the size of the smallest cluster. These criteria have been empirically shown to yield the best results.15

Other Variables of Interest that will be used in your analysis and how they will be categorized/defined for your study: Common variables/risk factors available across the CANVAS and CREDENCE trials will be used for the latent class analysis as well as any regression models evaluating differential prognosis and treatment response. For the latent class analysis, all baseline variables/risk factors will be put through an oblique principal component analysis (PROC VARCLUS procedure in SAS)12 to normalize and reduce the dimensionality of the data to meet the presuppositions of latent class analysis.13 That is, the SAS procedure will aggregate all available variables into several non-overlapping clusters, each of which is defined by a summary score: a linear combination of that cluster's variables, computed for each patient. (The coefficients of the variable cluster summary score are identified by the first principal component of the variable cluster, which is to be done separately for continuous and categorical variables.) For Cox proportional hazards regression models, the variables of interest (i.e., baseline clinical, vital, and laboratory characteristics) will be unaltered and defined per the criteria in CANVAS and CREDENCE. Where non-overlapping definitions are observed, clinical experts will be consulted.

Statistical Analysis Plan: Clinical Variable Selection and Data Cleaning
Prior to beginning the analysis, baseline clinical variables will be jointly examined by two study investigators (AS, AR) to assess overlap between CANVAS and CREDENCE. In addition, we will evaluate and remove variables that are highly collinear and/or have limited clinical availability or relevance. We will also remove any variables that are coded as positive in less than 10% of cases, as these have been shown to negatively affect patient clustering.13 Following initial variable assessment, multiple imputation (n=5) with the Markov chain Monte Carlo method will be performed if variables have a moderate proportion of missing data (e.g., 40% of data are missing). If the "missing not at random" assumption is plausible, only complete cases will be used.16
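As a rough illustration of the 10% prevalence rule described above, the following Python sketch drops binary variables that are coded as positive in fewer than 10% of observed cases. It is only a sketch under stated assumptions: the function name, the variable names, and the toy data are hypothetical, and using observed (non-missing) cases as the denominator is an assumption, not something the proposal specifies.

```python
def keep_common_binary_variables(rows, min_prevalence=0.10):
    """Return the names of binary variables positive in >= min_prevalence of cases.

    rows: list of dicts mapping variable name -> 0/1, with None for missing.
    Assumption: prevalence is computed over non-missing values only.
    """
    names = sorted({name for row in rows for name in row})
    kept = []
    for name in names:
        observed = [row[name] for row in rows if row.get(name) is not None]
        if not observed:
            continue  # nothing observed, so the variable cannot be assessed
        prevalence = sum(observed) / len(observed)
        if prevalence >= min_prevalence:
            kept.append(name)
    return kept


# Toy data for 20 hypothetical patients (values are made up for illustration).
rows = [
    {
        "hypertension": 1 if i < 15 else 0,     # 75% positive
        "prior_amputation": 1 if i < 1 else 0,  # 5% positive, below threshold
        "current_smoker": 1 if i < 3 else 0,    # 15% positive
    }
    for i in range(20)
]

kept = keep_common_binary_variables(rows)
print(kept)
```

In this toy example only the variable with 5% prevalence falls below the threshold and is removed before clustering.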
Data Preparation for Latent Class Analysis
Following clinical variable selection and data cleaning, dimension reduction will be performed on the covariate list. This will be done through independent oblique principal component analysis with the SAS PROC VARCLUS procedure.12,17 In brief, the PROC VARCLUS procedure is an iterative variable clustering process that continuously divides a set of variables into disjoint clusters. That is, "clusters that are as correlated as possible among themselves and as uncorrelated as possible with variables in other clusters".12,17 This process will be applied to categorical and continuous variables independently. To determine the appropriate number of variable clusters, we will iteratively evaluate the second eigenvalue of each covariate group. A stopping rule (eigenvalue threshold of 0.7) will be used as suggested by Jackson and colleagues.18 Each patient's covariate cluster will ultimately be defined by a normalized principal component summary score (i.e., a linear combination of variables) which will be used in the latent class analysis. Analogously, each patient will be defined by a matrix of covariate summary scores allowing for their clustering.
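The stopping rule above can be made concrete with a small sketch. The Python code below is not PROC VARCLUS; it only illustrates the decision the paragraph describes, under the assumption that a variable cluster is split further when the second eigenvalue of its correlation matrix exceeds the 0.7 threshold. The eigenvalues are approximated with power iteration and deflation so the example needs no external libraries; the function names and the two toy correlation matrices are hypothetical.

```python
import math
import random


def _matvec(a, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def _top_eigenpair(a, seed, iters=1000):
    """Approximate the largest eigenvalue and eigenvector of a symmetric
    positive semi-definite matrix by power iteration."""
    rng = random.Random(seed)
    v = [rng.uniform(0.5, 1.5) for _ in a]
    for _ in range(iters):
        w = _matvec(a, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # The Rayleigh quotient gives the eigenvalue estimate for v.
    av = _matvec(a, v)
    vv = sum(x * x for x in v)
    return sum(x * y for x, y in zip(v, av)) / vv, [x / math.sqrt(vv) for x in v]


def second_eigenvalue(corr):
    """Second-largest eigenvalue of a correlation matrix (via deflation)."""
    lam1, v1 = _top_eigenpair(corr, seed=0)
    n = len(corr)
    deflated = [[corr[i][j] - lam1 * v1[i] * v1[j] for j in range(n)]
                for i in range(n)]
    # A different seed avoids reusing a start vector aligned with v1.
    lam2, _ = _top_eigenpair(deflated, seed=1)
    return lam2


def should_split(corr, threshold=0.7):
    """Assumed rule: split the variable cluster if its second eigenvalue
    exceeds the threshold (0.7 in the proposal)."""
    return second_eigenvalue(corr) > threshold


# Toy matrices: three tightly correlated variables vs. two unrelated pairs.
corr_tight = [[1.0, 0.8, 0.8], [0.8, 1.0, 0.8], [0.8, 0.8, 1.0]]
corr_mixed = [[1.0, 0.8, 0.0, 0.0], [0.8, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.4], [0.0, 0.0, 0.4, 1.0]]

print(should_split(corr_tight), should_split(corr_mixed))
```

The tightly correlated group has a small second eigenvalue and would be kept as one cluster, while the group made of two unrelated pairs has a large second eigenvalue and would be divided further.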
Patient Clustering and Latent Class Analysis
Following data preparation, we will identify the latent clusters of individuals. This will be done using the "poLCA" package in R, which implements a latent class analysis algorithm.14 In brief, latent class analysis is a statistical model that classifies individuals into mutually exclusive (and exhaustive) clusters based on their observed set of measured characteristics. These clusters will be derived using maximum likelihood estimation. To avoid finding a local maximum of the log-likelihood function, the model will be estimated 10 times to automate the search for the global maximum. To derive the optimal number of clusters or subgroups, we will evaluate the first minimum of the Bayesian information criterion, the size of the smallest class, and the clinical relevance of defined groups.15 We will use an a priori criterion of at least 200 patients per cluster to promote stability of effect estimates.
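The numeric part of this selection rule can be sketched as follows. This Python example assumes that models are first restricted to those whose smallest class has at least 200 patients and that the first minimum of the Bayesian information criterion is then taken among the remaining models; how the proposal combines these criteria in practice, and the judgment of clinical relevance, are not captured here. The function name and all numbers are hypothetical.

```python
def choose_n_classes(candidates, min_class_size=200):
    """Pick the number of latent classes from fitted candidate models.

    candidates: list of (n_classes, bic, class_sizes), sorted by n_classes.
    Assumed rule: discard models whose smallest class is below min_class_size,
    then return the first model at which BIC stops decreasing.
    Returns None if no model meets the size criterion.
    """
    eligible = [(k, bic) for k, bic, sizes in candidates
                if min(sizes) >= min_class_size]
    for i, (k, bic) in enumerate(eligible):
        is_last = i == len(eligible) - 1
        if is_last or eligible[i + 1][1] > bic:
            return k
    return None


# Made-up fit summaries for models with 2 to 5 classes.
candidates = [
    (2, 10500.0, [4000, 3000]),
    (3, 10200.0, [3000, 2500, 1500]),
    (4, 10250.0, [2500, 2000, 1500, 1000]),
    (5, 10100.0, [2500, 2000, 1500, 850, 150]),  # smallest class too small
]

best = choose_n_classes(candidates)
print(best)
```

In this made-up example the five-class model has the lowest BIC but is excluded because its smallest class has fewer than 200 patients, so the three-class model, where BIC first stops decreasing, is chosen.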
Inferential Statistics
We will first assess the association between cluster membership and clinical outcomes using Cox proportional hazards regression. Proportional hazards assumptions will be assessed by evaluating scaled Schoenfeld residuals. Kaplan-Meier estimated mortality and cause-specific event rates will be plotted according to treatment allocation. Using interaction terms in a Cox regression model, we will also evaluate whether cluster membership is associated with a differential response to randomized therapy. Results will be reported for clinical outcomes as hazard ratios (HR) and 95% confidence intervals (CI) for each cluster in comparison to a reference cluster, and an overall p-value will be provided to assess the relationship between cluster and outcome. In models with interactions, an interaction p-value will be provided. P

Narrative Summary: Diabetes is the 9th leading cause of death worldwide. Although it has been studied for several decades, it remains a strong risk factor for illnesses affecting the heart and kidneys, even when treated. Evidence suggests that this may be because several unknown types (or "compositions") of the disease exist. As a result, our study proposes using machine learning to uncover hidden compositions of the illness, which may have different prognoses and reactions to treatment. These findings will help doctors provide personalized care by identifying patients who may need increased antidiabetic care, and likewise, patients who may benefit most from certain treatments.

Project Timeline: We anticipate completing all study milestones within approximately 8 months, as shown in detail in the Gantt chart (supplemental material). If provided the data, we would start the project on January 15th, 2022. All analyses, including consultations with cardiologists and nephrologists, would be completed in approximately the first 12 weeks. Then, we anticipate spending 4 weeks writing the initial draft of the manuscript and another 4 weeks inviting co-authors to help edit the text. The estimated time of completion of the finalized manuscript would thus be June 15th, 2022, at the latest. We would then develop a journal submission plan, which would include general and specialty medical journals including, but not limited to, the Journal of the American Medical Association (JAMA), the European Heart Journal (EHJ), Circulation, and the American Heart Journal. During this time, we would contemporaneously prepare abstracts for major conferences like the American College of Cardiology (ACC) and the Global Cardiovascular Clinical Trialists Forum (CVCT). Once the manuscript is accepted for publication, we would report our results back to the YODA Project by August 15th, 2022, at the latest.

Dissemination Plan: Our knowledge translation and dissemination plan is a critical aspect of our proposal as it will ensure that results impact health service delivery, clinical care, and future development of novel therapies. As described above, we plan to write manuscript(s) for peer-reviewed medical journals, which we will widely distribute through academic social media and national or international conferences. We would target general and specialty medical journals including, but not limited to, the Journal of the American Medical Association (JAMA), the European Heart Journal (EHJ), Circulation, the Canadian Journal of Cardiology, and the American Heart Journal. During this time, we would contemporaneously prepare abstracts for major scientific conferences such as the American College of Cardiology (ACC) Annual Scientific Session & Expo, and the Global Cardiovascular Clinical Trialists Forum (CVCT).

Bibliography:

1. World Health Organization. The top 10 causes of death. Accessed June 23, 2021. https://www.who.int/news-room/fact-sheets/detail/the-top-10-causes-of-death
2. Rawshani A, Rawshani A, Franzen S, et al. Risk Factors, Mortality, and Cardiovascular Outcomes in Patients with Type 2 Diabetes. New England Journal of Medicine. 2018;379(7):633-644. doi:10.1056/NEJMoa1800256
3. Gouda P, Zheng S, Peters T, et al. Clinical Phenotypes in Patients With Type 2 Diabetes Mellitus: Characteristics, Cardiovascular Outcomes and Treatment Strategies. Curr Heart Fail Rep. 2021;18(5):253-263. doi:10.1007/s11897-021-00527-w
4. Rindskopf D, Rindskopf W. The value of latent class analysis in medical diagnosis. Statistics in medicine. 1986;5(1):21-27.
5. Ahmad T, Pencina MJ, Schulte PJ, et al. Clinical implications of chronic heart failure phenotypes defined by cluster analysis. J Am Coll Cardiol. 2014;64(17):1765-1774. doi:10.1016/j.jacc.2014.07.979
6. Calfee CS, Delucchi K, Parsons PE, Thompson BT, Ware LB, Matthay MA. Latent Class Analysis of ARDS Subphenotypes: Analysis of Data From Two Randomized Controlled Trials. Lancet Respir Med. 2014;2(8):611-620. doi:10.1016/S2213-2600(14)70097-9
7. Sharma A, Ofstad AP, Ahmad T, et al. Patient Phenotypes and SGLT-2 Inhibition in Type 2 Diabetes. JACC: Heart Failure. 2021;9(8):568-577. doi:10.1016/j.jchf.2021.03.003
8. Neal B, Perkovic V, Mahaffey KW, et al. Canagliflozin and cardiovascular and renal events in type 2 diabetes. New England Journal of Medicine. 2017;377(7):644-657.
9. Perkovic V, Jardine MJ, Neal B, et al. Canagliflozin and Renal Outcomes in Type 2 Diabetes and Nephropathy. New England Journal of Medicine. 2019;380(24):2295-2306. doi:10.1056/NEJMoa1811744
10. Hagenaars JA, McCutcheon AL. Applied Latent Class Analysis. Cambridge University Press; 2002. Accessed October 8, 2021. http://ebookcentral.proquest.com/lib/mcgill/detail.action?docID=217833
11. Linda M. Collins, Stephanie T. Lanza. Latent Class and Latent Transition Analysis. 1st ed. John Wiley & Sons, Ltd; 2009. doi:10.1002/9780470567333
12. SAS Help Center: Overview: VARCLUS Procedure. Accessed October 12, 2021. https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.3/statug/statug_var…
13. Sinha P, Calfee CS, Delucchi KL. Practitioner's Guide to Latent Class Analysis: Methodological Considerations and Common Pitfalls. Crit Care Med. 2021;49(1):e63-e79. doi:10.1097/CCM.0000000000004710
14. Linzer DA, Lewis JB. poLCA: An R package for polytomous variable latent class analysis. Journal of statistical software. 2011;42(1):1-29.
15. Nylund KL, Asparouhov T, Muthén BO. Deciding on the Number of Classes in Latent Class Analysis and Growth Mixture Modeling: A Monte Carlo Simulation Study. Structural Equation Modeling: A Multidisciplinary Journal. 2007;14(4):535-569. doi:10.1080/10705510701575396
16. Jakobsen JC, Gluud C, Wetterslev J, Winkel P. When and how should multiple imputation be used for handling missing data in randomised clinical trials - a practical guide with flowcharts. BMC Med Res Methodol. 2017;17(1):1-10. doi:10.1186/s12874-017-0442-1
17. Nelson BD. Variable Reduction for Modeling using PROC VARCLUS. Data Analysis.:3.
18. Jackson JE. A User's Guide to Principal Components. John Wiley & Sons, Inc., New York; 1991:40.
19. Clark SL, Muthén B. Relating Latent Class Analysis Results to Variables Not Included in the Analysis.; 2009.