Research Proposal

Project Title: 
The effect of SGLT2 inhibitors in diabetes subgroups identified by data-driven clustering
Scientific Abstract: 

Background: Recently, researchers identified five novel diabetes subgroups using a data-driven method, K-means clustering, with simple clinical parameters. However, how SGLT2 inhibitors (SGLT2i) perform in these subgroups is unknown.
Objective: To investigate the effectiveness, safety, glucose-lowering durability, cardiovascular outcomes and renal outcomes of SGLT2i in these novel diabetes subgroups.
Study Design: We will use K-means clustering to form four subgroups (MARD, SIDD, SIRD and MOD) in type 2 diabetes and compare outcomes with SGLT2i versus placebo or other active comparators between subgroups.
Participants: 1) For the efficacy analysis, we will include all trial participants who had complete baseline values for HbA1c, age, BMI, fasting blood glucose and fasting insulin (or C-peptide) and who attended the 26-week visit. 2) These participants will also be used to investigate the role of SGLT2i in cardiovascular and renal outcomes.
Main Outcome Measure: 1) The primary outcome of the efficacy analysis is the change in HbA1c from baseline to week 26. 2) The primary outcomes of the outcome analysis are 3-point MACE and progression of albuminuria.
Statistical Analysis:
All subjects in the intent-to-treat (ITT) analysis set will be included in the efficacy analyses. Participants will be grouped into four cluster subgroups, and an ANCOVA model will be used to analyze the primary efficacy endpoint, change in HbA1c (%). For the outcome studies, the hazard ratio (HR) for 3P-MACE and for progression of albuminuria will be estimated using a Cox proportional hazards model. The cumulative event rate over time will be presented using Kaplan-Meier plots by subgroup.

Brief Project Background and Statement of Project Significance: 

Recently, Ahlqvist et al. used six variables, namely glutamic acid decarboxylase antibodies (GADab), age, body mass index (BMI), HbA1c, and the Homeostasis Model Assessment 2 estimates of beta-cell function and insulin resistance (HOMA2-B and HOMA2-IR), to identify five mutually exclusive diabetes subgroups with a data-driven clustering method 1. When GADab information was absent, four cluster subgroups, mild age-related diabetes (MARD), mild obesity-related diabetes (MOD), severe insulin-deficient diabetes (SIDD) and severe insulin-resistant diabetes (SIRD), were replicated in Chinese and US populations 2. These studies showed that the subgroups had distinct clinical characteristics and different trajectories towards diabetes complications. For example, the SIRD subgroup had an increased risk of cardiovascular disease and renal disorders, and the SIDD subgroup was prone to microvascular complications. These novel diabetes subgroups have been regarded as a new strategy towards precision management of diabetes: subgrouping with simple clinical parameters at baseline can predict patients' clinical outcomes 3. Future studies now need to establish whether treatment response to different drug classes differs across these subtypes of diabetes and whether these drugs can change clinical outcomes 4.
Dennis et al. used data from the ADOPT trial to show that sulfonylureas (SUs) were suitable for MARD participants and that thiazolidinediones (TZDs) benefited SIRD patients 5. However, the ADOPT study was completed by 2006, and many new anti-diabetes therapies have become available in the last decade. The role of these therapies in the subgroups is unknown, as is whether these treatments can change clinical outcomes such as cardiovascular events.
Sodium-glucose cotransporter 2 inhibitors (SGLT2i) are among the most attractive new anti-diabetes agents developed recently. SGLT2i effectively reduce HbA1c, with a robust body-weight-lowering effect and a frequency of hypoglycemic episodes similar to that with placebo 6. EMPA-REG was the first randomized controlled trial (RCT) to show that an anti-diabetes drug can improve cardiovascular outcomes 7. Later, canagliflozin was found to provide significant cardiovascular and renal benefits in type 2 diabetes 8,9. However, SGLT2i cost more than conventional medications such as SUs. Identifying the subgroups that benefit most directly from SGLT2i can help reduce excessive healthcare costs. It is therefore important to establish the role of SGLT2i in these novel diabetes subgroups versus placebo and other active drugs, including SUs, dipeptidyl peptidase-IV inhibitors (DPP-IVi) and metformin.

Specific Aims of the Project: 

This study aims to determine (1) how SGLT2i perform in the four cluster subgroups (MARD, SIDD, SIRD and MOD) in terms of glucose-lowering efficacy, safety and glucose-lowering durability, compared with placebo and other oral anti-diabetic drugs (OADs) such as SU, metformin and DPP-IVi; and (2) whether SGLT2i, compared with placebo, can change cardiovascular and renal outcomes in the four cluster subgroups. The corresponding analyses are:
(1) Identifying the subgroup with the best glycemic control and the fewest side effects on SGLT2i and other OADs.
(2) A post-hoc analysis of CANVAS/CANVAS-R data to determine whether cardiovascular and renal outcomes with SGLT2i versus placebo differ between clusters.

What is the purpose of the analysis being proposed? Please select all that apply.: 
Confirm or validate previously conducted research on treatment effectiveness
Research on clinical trial methods
Software Used: 
Data Source and Inclusion/Exclusion Criteria to be used to define the patient sample for your study: 

The selected studies are randomized controlled trials of SGLT2i therapy in adult participants with type 2 diabetes. All selected studies assessed HbA1c change over 26 weeks and compared SGLT2i with placebo or an active drug, e.g. a DPP-4 inhibitor, metformin or a sulfonylurea.

For the efficacy analysis, participants will be pooled at the individual level in the intent-to-treat analysis set. The safety analysis will include all participants who received at least one dose of treatment or placebo. Glucose-lowering durability will be analyzed in those with complete treatment information at 26 weeks and at end of trial (EOT). The primary outcome analysis will use the CANVAS study. For renal outcomes, all individuals with eGFR and ACR information at baseline and at EOT will be included.

The main inclusion criteria are: (1) taking the study medication for at least 26 weeks; (2) complete baseline information, including age, BMI, HOMA2-IR/HOMA2-B (calculated from fasting insulin or C-peptide and fasting plasma glucose) and HbA1c; (3) negative GAD antibodies, if available.
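HOMA2 indices come from a nonlinear computer model and have no closed-form formula, so in practice they are obtained with a dedicated HOMA2 calculator. Purely as an illustration of how fasting glucose and fasting insulin combine into such indices, the sketch below computes the older HOMA1 approximations instead. The function names and the example values are hypothetical, and these formulas are a stand-in for, not a reproduction of, the HOMA2 values planned here.

```python
def homa1_ir(fasting_glucose_mmol_l, fasting_insulin_uu_ml):
    """HOMA1 insulin resistance: glucose (mmol/L) x insulin (uU/mL) / 22.5."""
    return fasting_glucose_mmol_l * fasting_insulin_uu_ml / 22.5


def homa1_b(fasting_glucose_mmol_l, fasting_insulin_uu_ml):
    """HOMA1 beta-cell function (%): 20 x insulin / (glucose - 3.5)."""
    if fasting_glucose_mmol_l <= 3.5:
        raise ValueError("HOMA1-B is undefined for glucose <= 3.5 mmol/L")
    return 20.0 * fasting_insulin_uu_ml / (fasting_glucose_mmol_l - 3.5)


# Hypothetical participant: glucose 8.0 mmol/L, insulin 9.0 uU/mL
insulin_resistance = homa1_ir(8.0, 9.0)
beta_cell_function = homa1_b(8.0, 9.0)
```

Participants with missing or implausible fasting values would be excluded before this step, in line with criterion (2).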

Main Outcome Measure and how it will be categorized/defined for your study: 

1. Efficacy study:
Main outcome: the decline in HbA1c after 26 weeks of treatment, adjusted for baseline HbA1c, in the four subgroups.
Secondary outcomes: (1) Hypoglycemic episodes across the trial period. (2) Glucose-lowering durability: the difference between HbA1c at week 26 of treatment and at the end of treatment (EOT). (3) Change from baseline to EOT in beta-cell function, assessed by HOMA2-B or HOMA-B.
2. Outcome study:
CV outcome: major adverse cardiovascular events (MACE), a composite of cardiovascular (CV) death, non-fatal myocardial infarction (MI) and non-fatal stroke, in the four cluster groups. Renal outcome: progression of albuminuria, either from normoalbuminuria to albuminuria or from microalbuminuria to macroalbuminuria.
Secondary outcomes: (1) Change from baseline in urinary albumin/creatinine ratio (ACR). (2) Change from baseline in estimated glomerular filtration rate (eGFR) at EOT.
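The renal outcome definition above can be sketched as a small classifier. This is a minimal illustration that assumes the conventional ACR cut-offs of 30 and 300 mg/g; the cut-offs, units and function names are assumptions, and the trial protocols may apply additional criteria that are not modelled here.

```python
# Assumed conventional thresholds (mg/g); the trials' own definitions may differ.
MICRO_THRESHOLD = 30.0
MACRO_THRESHOLD = 300.0

CATEGORY_ORDER = {"normo": 0, "micro": 1, "macro": 2}


def albuminuria_category(acr_mg_per_g):
    """Classify a urinary albumin/creatinine ratio into normo/micro/macro."""
    if acr_mg_per_g < MICRO_THRESHOLD:
        return "normo"
    if acr_mg_per_g <= MACRO_THRESHOLD:
        return "micro"
    return "macro"


def has_progressed(baseline_acr, followup_acr):
    """True for normo -> micro/macro or micro -> macro, False otherwise."""
    before = CATEGORY_ORDER[albuminuria_category(baseline_acr)]
    after = CATEGORY_ORDER[albuminuria_category(followup_acr)]
    return after > before
```

A move to a lower category, or a change that stays within one category, is not counted as progression in this sketch.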

Main Predictor/Independent Variable and how it will be categorized/defined for your study: 

The main predictors are the five baseline parameters: age, BMI, HOMA2-IR, HOMA2-B and HbA1c. Because the trials recruited participants with type 2 diabetes and GADab was not available in most of the studies, we assume that all participants were GAD negative. We will apply K-means clustering to these five baseline parameters to form four subgroups: SIDD, SIRD, MOD and MARD. Cluster characteristics will be examined to confirm that they match those reported in previous studies. All comparisons will be made among subgroups.
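The clustering step described above can be sketched as follows. This is a minimal illustration of plain K-means (Lloyd's algorithm) using only the standard library. It assumes that each variable is first scaled to mean 0 and standard deviation 1. The function names, the random seed and the baseline rows are hypothetical, and a real analysis would more likely rely on an established statistics package.

```python
import random
import statistics


def standardize(rows):
    """Scale each column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k=4, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centroid for every participant.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [statistics.fmean(c) for c in zip(*members)]
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


# Hypothetical baseline rows: age, BMI, HbA1c (%), HOMA2-B, HOMA2-IR
baseline = [
    [67, 26.1, 7.2, 75, 1.4], [70, 25.4, 7.0, 80, 1.2],
    [48, 36.5, 7.6, 95, 2.6], [45, 38.0, 7.9, 90, 2.9],
    [55, 27.0, 10.8, 30, 1.5], [58, 26.2, 11.2, 28, 1.3],
    [64, 33.0, 7.1, 150, 5.1], [66, 34.1, 6.9, 160, 5.6],
]
labels, centroids = kmeans(standardize(baseline), k=4)
```

Cluster numbers returned by K-means are arbitrary, so each cluster would still need to be named (MARD, MOD, SIDD or SIRD) by comparing its centroid with the profiles reported in previous studies.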

Other Variables of Interest that will be used in your analysis and how they will be categorized/defined for your study: 

Other variables may include cluster assignment after re-clustering at 3 months of the trial, if the data are available, to account for high blood glucose at the beginning of the trial.

Statistical Analysis Plan: 

(1) We will first pool the trials at the individual level and use K-means clustering to form four cluster subgroups from age, BMI, HbA1c, HOMA2-B and HOMA2-IR at baseline.
(2) We will use an ANCOVA model to analyze the decline in HbA1c in each subgroup with placebo, SU, DPP-IVi and canagliflozin. For the safety analysis, we will compare the proportion of participants with hypoglycemia across the subgroups. An ANCOVA model will also be used to evaluate the difference between HbA1c at week 26 of treatment and at EOT. To account for variance between trials, we may use a mixed-effects model to predict the HbA1c decline (optional).
(3) For the outcome study, we will use only the CANVAS and CANVAS-R studies to compare the cumulative incidence of cardiovascular disease or renal disease by cluster, using Kaplan-Meier plots and Cox proportional hazards models with cluster as a categorical variable. We will estimate R² and the discrimination ability (Harrell's C-index) of the cluster Cox model. Baseline cardiovascular characteristics and HbA1c may be added to the model as adjustment covariates.
(4) Secondary endpoints, including decline in eGFR and increase in ACR, will also be assessed with Cox models in any trials that contain eGFR and ACR at baseline and at EOT (pooled data). Baseline eGFR and ACR may be further adjusted for in these models.
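The ANCOVA in step (2) can be sketched as an ordinary least-squares fit of HbA1c change on baseline HbA1c and a treatment indicator, run within one cluster subgroup. This is a minimal standard-library illustration under that assumed model specification. The function names and data are hypothetical, and the planned analysis may include further covariates (for example, trial) and would normally be run in a statistics package.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ancova_treatment_effect(change, baseline, treated):
    """Least-squares fit of change ~ 1 + baseline + treated.

    Returns (intercept, baseline_slope, adjusted_treatment_effect).
    """
    rows = [[1.0, b, float(t)] for b, t in zip(baseline, treated)]
    p = 3
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, change)) for i in range(p)]
    return tuple(solve_linear_system(xtx, xty))


# Hypothetical participants from a single cluster subgroup
baseline_hba1c = [7.5, 8.0, 9.0, 10.0, 7.8, 8.6, 9.4, 10.2]
on_sglt2i = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = SGLT2i, 0 = comparator
hba1c_change = [-0.1, -0.3, -0.4, -0.6, -0.8, -1.0, -1.1, -1.3]
intercept, slope, effect = ancova_treatment_effect(
    hba1c_change, baseline_hba1c, on_sglt2i
)
```

Here `effect` is the baseline-adjusted difference in HbA1c change between the SGLT2i and comparator arms. Repeating the fit in each cluster, or adding a cluster-by-treatment interaction term, would allow comparison across subgroups.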

Narrative Summary: 

Recently, researchers identified five diabetes subgroups using a data-driven method, K-means clustering, with simple clinical parameters. This method has been replicated in a variety of trials and populations and is regarded as a new strategy for precision management of people with diabetes. SGLT2i are a new generation of diabetes therapy with benefits for cardiovascular and renal outcomes. It is therefore important to identify the subgroup that benefits most from SGLT2i. This is a first step towards precision therapy and could reduce healthcare costs.

Project Timeline: 

Month 1-2: Data retrieval and data organization.
Month 2-4: Data analysis and drawing conclusions.
Month 4-6: Manuscript writing and submission. Results will be reported to the YODA Project when the manuscript is submitted.
Month 6-12: Corrections and possible revisions in response to reviewers' comments.

Dissemination Plan: 

We plan to submit a paper titled 'The efficacy, safety, and durability of SGLT2 inhibitors and whether they can change cardiovascular outcomes in novel diabetes clusters'. We aim to submit it to The Lancet Diabetes & Endocrinology, Diabetes Care or JAMA Internal Medicine.
We will also submit abstracts to the ADA, EASD and IDF and present our results at these conferences.


1. Ahlqvist E, Storm P, Karajamaki A, et al. Novel subgroups of adult-onset diabetes and their association with outcomes: a data-driven cluster analysis of six variables. Lancet Diabetes Endocrinol 2018; 6(5): 361-9.
2. Zou X, Zhou X, Zhu Z, Ji L. Novel subgroups of patients with adult-onset diabetes in Chinese and US populations. Lancet Diabetes Endocrinol 2019; 7(1): 9-11.
3. Ahlqvist E, Tuomi T, Groop L. Clusters provide a better holistic view of type 2 diabetes than simple clinical features. Lancet Diabetes Endocrinol 2019; 7(9): 668-9.
4. Gloyn AL, Drucker DJ. Precision medicine in the management of type 2 diabetes. Lancet Diabetes Endocrinol 2018; 6(11): 891-900.
5. Dennis JM, Shields BM, Henley WE, Jones AG, Hattersley AT. Disease progression and treatment response in data-driven subgroups of type 2 diabetes compared with models based on simple clinical features: an analysis using clinical trial data. Lancet Diabetes Endocrinol 2019.
6. Zelniker TA, Wiviott SD, Raz I, et al. SGLT2 inhibitors for primary and secondary prevention of cardiovascular and renal outcomes in type 2 diabetes: a systematic review and meta-analysis of cardiovascular outcome trials. Lancet 2019; 393(10166): 31-9.
7. Zinman B, Wanner C, Lachin JM, et al. Empagliflozin, cardiovascular outcomes, and mortality in type 2 diabetes. N Engl J Med 2015; 373(22): 2117-28.
8. Neal B, Perkovic V, Mahaffey KW, et al. Canagliflozin and cardiovascular and renal events in type 2 diabetes. N Engl J Med 2017; 377(7): 644-57.
9. Perkovic V, Jardine MJ, Neal B, et al. Canagliflozin and renal outcomes in type 2 diabetes and nephropathy. N Engl J Med 2019; 380(24): 2295-306.

General Information

How did you learn about the YODA Project?: 
Internet Search

Request Clinical Trials

What type of data are you looking for?: 
Individual Participant-Level Data, which includes Full CSR and all supporting documentation
