

Research Proposal

Project Title: 
Policy-aware evaluation of personalized treatment strategies
Scientific Abstract: 

Background: Personalized or precision medicine aims at giving the “right treatment to the right patient”. It is one of the most promising areas of medical research, but its development is hindered by methodological limitations of studies. Determining individualized treatment rules (ITRs) is an active research field in biostatistics that relies on recent statistical methods (machine learning, classification of high-dimensional data) and raises many statistical and computational challenges. For instance, measures have been proposed for the benefit of an ITR as compared to a ‘one size fits all’ treatment strategy, where all patients receive the treatment that performs best on average. Properly testing whether personalization provides an overall benefit after estimating an ITR, however, remains an open problem.
Objective: We aim to develop a statistical test for the benefit of personalization and innovative approaches to identify ITRs, and to apply these methods to real data.
Study design: Retrospective analysis of a randomized controlled trial.
Participants: Diabetes patients included in the trial and receiving at least one dose of study drug.
Main Outcome measure: Change in HbA1c from baseline to week 52.
Statistical analysis: The ITR will be estimated by modeling the outcome as a function of treatment arm and covariates using random forests. The estimation and testing of the benefit of the ITR will account for the uncertainty in the ITR estimation, and will provide proper confidence intervals as well as control of the type I error rate.

Brief Project Background and Statement of Project Significance: 

The objective of personalized or precision medicine is to give “the right patient the right drug at the right moment”. Precision medicine therefore implies determining which treatment is best for a given patient, based on his/her characteristics, instead of favoring the one with the better outcome on average in the whole population (one size fits all). Indeed, the current practice in medicine is to favor the treatment with the highest response rate (response being intended in the general sense of any favorable outcome). This would be a reasonable rule only if the responders to the “inferior” treatment all responded to the “superior” one. However, when the sets of responders do not overlap, an individualized treatment strategy could lead to a much higher response rate in the overall population. For instance, if the usual treatment has only a 20% response rate and 40% of patients respond to the new treatment, the response rate in the overall population under an ideal individualized strategy could range from 40% (if all responders to the usual treatment also respond to the new one) to 60% (if none of them respond to the new treatment).
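The 40% to 60% range follows from elementary bounds on the probability that a patient responds to at least one of the two treatments. The short sketch below is illustrative only: the function name is hypothetical, and it assumes an ideal rule that always gives a patient a treatment he/she responds to whenever one exists.

```python
def ideal_itr_response_bounds(p_usual, p_new):
    """Bounds on the response rate under an ideal individualized rule.

    The ideal rule succeeds for every patient who responds to at least one
    treatment, so its response rate is P(responds to usual OR to new).
    """
    # Responder sets fully nested: the union is the larger of the two sets.
    lower = max(p_usual, p_new)
    # Responder sets disjoint: the union is the sum, capped at 100%.
    upper = min(1.0, p_usual + p_new)
    return lower, upper


low, high = ideal_itr_response_bounds(0.20, 0.40)
print(f"{low:.0%} to {high:.0%}")  # 40% to 60%
```

Where the true value falls within this range depends on how much the two sets of responders overlap, which cannot be read from the marginal response rates alone.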
Developing methods to identify patients more likely to respond to one treatment than to another has recently become a very active research topic in biostatistics. Once a model predicting whether a patient would be more likely to respond to a given treatment or to its comparator has been obtained, for example using data from a randomized controlled trial (RCT), it is straightforward to derive an individualized treatment rule (ITR) where patients would receive the treatment under which their predicted response is higher. It has been shown that such a strategy would maximize the expectation of the outcome over the population.
Measures of the performance of an ITR using a biomarker as compared to the “one size fits all” strategy have also been developed, such as the improvement in population average outcome under the ITR. The specification of the model used to estimate the performance of the ITR is however important, especially when several biomarkers are considered together. The classical approach relies on generalized linear regression with marker-by-treatment interactions, but the effect of model misspecification can be critical, especially around the decision boundary, and would lead to biased predictions of ITR performance. Even with flexible approaches such as machine learning, there remains a non-negligible risk of recommending the incorrect treatment for patients whose predicted responses under each treatment are close. In addition, testing whether personalization provides an overall benefit remains an open problem.

Statement of project significance
Our work is important for two main reasons. First, developing individualized treatment strategies that improve patient outcomes requires that the responders to each of the compared treatments be correctly identified. In that respect, there is a need for cutting-edge statistical methods. Second, the benefit of such individualized strategies must not be overstated, because wrong decisions carry potentially high costs.

Specific Aims of the Project: 

The aims of this project are:
(1) To develop statistical tests for the benefit of personalization.
(2) To develop innovative approaches to determine individualized treatment strategies using data from randomized controlled trials.
(3) To illustrate the potential gain of the approach we propose using real data from randomized controlled trials.
The request for data to the YODA platform primarily serves this third aim. As a corollary, this will ultimately allow us to determine individualized treatment strategies for diabetic patients, with an associated measure of potential population benefit.

What is the purpose of the analysis being proposed? Please select all that apply.: 
New research question to examine treatment effectiveness on secondary endpoints and/or within subgroup populations
Preliminary research to be used as part of a grant proposal
Data Source and Inclusion/Exclusion Criteria to be used to define the patient sample for your study: 

Selection of the trials:
The methods we develop need RCTs with a relatively large sample size, as well as relevant patient characteristics. Based on these considerations, we have selected study NCT00968812 (CANTATA-SU trial) as a good candidate for our methodology, for three reasons: it has a large sample size; the experimental and control treatments have different mechanisms of action, which is likely better suited to finding variables associated with a differential treatment effect; and the effect of the experimental treatment as compared to the comparator is not overwhelming, thus allowing for a more refined strategy.

Selection of the patients:
All patients included in the selected trial and receiving at least one dose of study drug (modified intention-to-treat analysis, as in the primary study reports) will be considered.

Main Outcome Measure and how it will be categorized/defined for your study: 

Our main outcome will be the same as the primary outcome of the study, i.e. change in HbA1c from baseline to week 52.

To illustrate the potential of the method for a binary outcome, which has been more frequent in statistical articles on individualized treatment strategies, we will add a key binary secondary outcome: the proportion of patients achieving HbA1c < 7.0% (53 mmol/mol).

Main Predictor/Independent Variable and how it will be categorized/defined for your study: 

The primary independent variable is the allocated treatment arm. Since the CANTATA-SU trial comprises three treatment arms, our primary comparison will be canagliflozin 300 mg + metformin versus glimepiride + metformin. In a second stage, we will perform a similar analysis for canagliflozin 100 mg + metformin versus glimepiride + metformin.

Other Variables of Interest that will be used in your analysis and how they will be categorized/defined for your study: 

The other variables of interest are the patient characteristics that will be used to construct a model for individual treatment. To best capture the heterogeneity in treatment effect, as many baseline (pre-randomization) variables as possible should be considered in our models. We list here a minimal set of variables that could be used:
Glycated hemoglobin A1c (HbA1c)
Fasting plasma glucose (FPG)
Body-mass index
Duration of type 2 diabetes
Whether patient entered antihyperglycemic drug adjustment period
Smoking history
Other diseases or comorbidities (whenever available)
Systolic blood pressure
Diastolic blood pressure
Pulse rate
LDL cholesterol
HDL cholesterol
Non-HDL cholesterol
Alanine aminotransferase
Aspartate aminotransferase
Alkaline phosphatase
Blood urea nitrogen
Urine albumin/creatinine
Total fat mass
Total lean mass
Subcutaneous adipose tissue
Visceral adipose tissue

Statistical Analysis Plan: 

The analysis will consider a counterfactual outcomes framework, where we posit that for each patient there exist two potential outcomes, Y(1) and Y(0), representing the outcome that the patient would experience should s/he receive the studied treatment (indexed by 1) or its comparator (indexed by 0), respectively. This allows defining a counterfactual individual treatment effect as D = Y(1) – Y(0). In practice, D cannot be observed, except under very specific trial designs such as n-of-1 trials. The question of precision medicine is thus rather to estimate the expected value of D given a set of covariates X. Assuming that higher values of Y represent a more favorable outcome, it has been shown that the optimal individualized treatment rule given X (or optimal treatment regime) corresponds to giving treatment 1 to patients with D(X) = E(D|X) > 0 and treatment 0 to patients with D(X) < 0. For those with D(X) = 0, the decision to favor one of the treatments should be based on other considerations, such as favoring the treatment with the higher outcome on average.
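The decision rule above can be illustrated on simulated data in which both potential outcomes are known, which is never the case in a real trial. The sketch below is a toy under stated assumptions: the effect function 0.3 + x, the sample size and all names are hypothetical, and D(X) is taken as known rather than estimated.

```python
import random


def optimal_rule(d_x, tie_break=1):
    """Treatment recommended for a patient whose conditional effect D(X) is d_x."""
    if d_x > 0:
        return 1
    if d_x < 0:
        return 0
    return tie_break  # D(X) = 0: the choice rests on other considerations


def mean_outcome(policy, y1, y0):
    """Average outcome when patient i receives treatment policy[i]."""
    total = sum(a * b1 + (1 - a) * b0 for a, b1, b0 in zip(policy, y1, y0))
    return total / len(policy)


rng = random.Random(0)
x = [rng.uniform(-1, 1) for _ in range(1000)]
d_x = [0.3 + xi for xi in x]               # hypothetical effect, positive on average
y0 = [rng.gauss(0, 1) for _ in x]          # potential outcome under the comparator
y1 = [b0 + d for b0, d in zip(y0, d_x)]    # potential outcome under the studied treatment

itr = [optimal_rule(d) for d in d_x]
value_itr = mean_outcome(itr, y1, y0)
value_all_1 = mean_outcome([1] * len(x), y1, y0)
value_all_0 = mean_outcome([0] * len(x), y1, y0)
print(value_itr >= max(value_all_1, value_all_0))  # True
```

Because the rule picks the larger potential outcome for every simulated patient, its average outcome cannot fall below that of either "one size fits all" strategy. With an estimated D(X) this guarantee no longer holds.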
Let us assume that the new treatment (here canagliflozin) performs better than its comparator. Deriving an individualized treatment rule (ITR) therefore relies on identifying the set of patients for whom the predicted treatment effect is negative, after modeling the outcome under each treatment. In our methodological work, however, we show the poor statistical properties of such a policy in practice, with a high risk of identifying patients as “non-responders” when in fact they derive benefit from the treatment. We therefore intend to apply to the data an approach that we have developed and termed ‘policy-aware’, which we have shown to outperform the classical approach.
Let us introduce some notation to facilitate the description of our approach. An ITR, or policy, p maps the vector of patient characteristics X to {0,1}, representing the treatment, so that p(X) is either 1 (canagliflozin should be given) or 0 (glimepiride should be given). The population benefit of using the policy p as compared to giving canagliflozin to all is K(p) = E(Y | p is used) – E(Y | all receive canagliflozin) = E[–D(X) 1{p(X)=0}], that is, the mean of –D(X) among patients with p(X)=0, E[–D(X) | p(X)=0], multiplied by the proportion of such patients, where D(X) is E(D|X). In practice, however, D(X) is unknown, and is estimated from the data as d(X) = Ê(Y(1) – Y(0) | X).
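The expression for K(p) can be checked in a few lines from the counterfactual notation, writing the outcome under the policy as p(X)Y(1) + (1 – p(X))Y(0). This is a sketch of the standard argument, not a result taken from the methodological work cited in this proposal. The last line makes explicit that the mean of –D(X) among patients switched to the comparator is weighted by the proportion of patients switched.

```latex
\begin{aligned}
K(p) &= E\bigl[p(X)\,Y(1) + \{1 - p(X)\}\,Y(0)\bigr] - E\bigl[Y(1)\bigr] \\
     &= E\bigl[\{1 - p(X)\}\,\{Y(0) - Y(1)\}\bigr] \\
     &= E\bigl[-D(X)\,\mathbf{1}\{p(X) = 0\}\bigr] \\
     &= E\bigl[-D(X) \mid p(X) = 0\bigr]\,\Pr\{p(X) = 0\}
\end{aligned}
```

The third line uses iterated expectations given X, since p(X) is a function of X and E(Y(1) – Y(0) | X) = D(X).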

The analysis will consist of the following steps:
1. Develop a random forest model for the outcome based on the treatment arm and covariates, in order to allow more flexibility in the model.
2. Derive features from the random forests that are subsequently used as a single covariate in a (generalized) linear model for the outcome with effects for the feature, the treatment and their interaction.
3. Randomly sample 1000 times from the posterior distribution of the model parameters to derive a vector of subject-specific treatment effects d(X) predicted by injecting the sampled parameters into the regression model, and convert this vector into a z-score z(X) by dividing its mean by its standard deviation.
4. Determine the threshold r which maximizes the expected population benefit of personalization K(p) when only those with z(X) > r receive canagliflozin, by searching a grid from -2.05 (the 0.02-quantile of a standard normal distribution) to 0.
5. Provide estimates of the expected population benefit of the policy p obtained with the optimized value of r, which we term the ‘max lower bound policy’, with an associated confidence interval.
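Steps 3 to 5 can be sketched on simulated draws as follows. This is a toy under loudly stated assumptions, not our actual procedure: the draws are simulated rather than sampled from a fitted model, only 200 draws are used to keep the example fast, the grid step of 0.05 is arbitrary, and the criterion (maximizing the 2.5% quantile of K(p) across draws) is one hypothetical reading of the name ‘max lower bound policy’. All function names are illustrative.

```python
import random
import statistics


def z_scores(draws):
    """draws[i] holds the sampled treatment effects d(X_i) for patient i."""
    return [statistics.mean(d) / statistics.stdev(d) for d in draws]


def benefit_draws(draws, z, r):
    """K(p) for each sampled parameter set, where p gives glimepiride if z(X) <= r."""
    n, n_draws = len(draws), len(draws[0])
    switched = [i for i in range(n) if z[i] <= r]
    return [sum(-draws[i][s] for i in switched) / n for s in range(n_draws)]


def lower_bound(values, level=0.025):
    """Empirical lower quantile, used here as a crude lower confidence bound."""
    ordered = sorted(values)
    return ordered[int(level * len(ordered))]


def max_lower_bound_threshold(draws, grid):
    """Threshold on the grid that maximizes the lower bound of K(p) (assumed criterion)."""
    z = z_scores(draws)
    return max(grid, key=lambda r: lower_bound(benefit_draws(draws, z, r)))


# The lower end of the grid is close to the 0.02-quantile of N(0, 1).
q02 = statistics.NormalDist().inv_cdf(0.02)
grid = [round(-2.05 + 0.05 * k, 2) for k in range(42)]  # -2.05, -2.00, ..., 0.00

# Simulated stand-in for step 3: 100 patients, 200 draws each (not trial data).
rng = random.Random(1)
patient_effect = [rng.gauss(0.3, 0.5) for _ in range(100)]
draws = [[rng.gauss(m, 0.4) for _ in range(200)] for m in patient_effect]

r_best = max_lower_bound_threshold(draws, grid)
k_best = benefit_draws(draws, z_scores(draws), r_best)
print(r_best, statistics.mean(k_best), lower_bound(k_best))
```

Restricting the grid to r ≤ 0 means that only patients whose estimated effect is negative with some certainty are switched to the comparator, which is the intent of the policy-aware approach.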

When analyzing changes in HbA1c, a linear model will be used; for the analysis of the proportion of patients achieving HbA1c < 7.0%, we will rely on a logistic model.
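The choice of model changes how the subject-specific treatment effect d(X) is obtained from the regression coefficients. The sketch below assumes the model of step 2, with a single derived feature f, a treatment indicator T and their interaction. The coefficient names and values are hypothetical, and the logistic effect is expressed as a difference in response probabilities, which is one possible scale among others.

```python
import math


def expit(v):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-v))


def linear_effect(f, b):
    """d(X) when the mean outcome is b0 + bf*f + bt*T + bft*f*T."""
    return b["bt"] + b["bft"] * f


def logistic_effect(f, b):
    """d(X) as a difference in response probabilities under a logistic model."""
    p1 = expit(b["b0"] + b["bf"] * f + b["bt"] + b["bft"] * f)  # T = 1
    p0 = expit(b["b0"] + b["bf"] * f)                           # T = 0
    return p1 - p0


coef = {"b0": -0.5, "bf": 0.8, "bt": 0.4, "bft": -0.6}  # hypothetical values
print(round(linear_effect(1.0, coef), 3))  # -0.2
print(logistic_effect(1.0, coef))
```

In the linear case the effect depends on the feature only through the interaction term, whereas in the logistic case it also depends on the baseline risk through b0 and bf.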

In this project, observations with missing outcome or predictor values will simply be excluded from the analyses that only aim at illustrating the potential of the method to a statistical audience. In contrast, for the real clinical application, missing values will be handled through multiple imputation by chained equations.
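After multiple imputation, an estimate computed on each imputed dataset has to be combined into a single result. The sketch below shows the standard combination known as Rubin's rules. It is an assumption that the benefit estimates would be pooled this way; the function name and the numbers are hypothetical, and the imputation itself is not shown.

```python
import statistics


def pool_rubin(estimates, variances):
    """Pool m per-imputation estimates and their squared standard errors."""
    m = len(estimates)
    pooled = statistics.mean(estimates)
    within = statistics.mean(variances)        # average within-imputation variance
    between = statistics.variance(estimates)   # sample variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical estimates from m = 3 imputed datasets.
estimate, total_variance = pool_rubin([1.0, 1.2, 0.8], [0.04, 0.04, 0.04])
print(estimate, total_variance)
```

The between-imputation term inflates the variance to reflect the uncertainty due to the missing values.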

Narrative Summary: 

In therapeutic evaluation, the treatment with the highest response rate (or average outcome) is usually considered superior to the others. This would however be true only if the responders to the “inferior” treatment all responded to the “superior” one. When the sets of responders do not overlap, an optimal treatment strategy could lead to a much higher response rate in the overall population.
In this project, we aim at developing statistical methods to both identify individualized treatment rules targeting the responders to each treatment, and evaluate the population benefit of using such rules.

Project Timeline: 

We are currently working on the methodological developments and performing simulation studies to investigate the properties of our procedure in realistic settings. Analyzing the trial should be straightforward once the data are in an analysis-ready format. Depending on the format of the data provided, however, this could require additional data management tasks. We nevertheless plan to have the analyses ready in 6 to 8 months.

Dissemination Plan: 

Our primary purpose is to illustrate how the methods we develop perform in real settings. To this aim, we plan to draft a first article for a statistical journal such as JASA or Biometrics, where the data would serve as an illustration only.
We then also plan to draft a clinical article for a medical audience (in a journal such as BMJ, PLoS Medicine, BMC Medicine, or a specialty journal such as Diabetes or Diabetes Care), where the results of the study would be presented for non-statisticians, in the expectation that our project will have a clinical impact.


Cai T, Tian L, Wong PH, Wei LJ. Analysis of randomized comparative clinical trial data for personalized treatment selections. Biostatistics 2011; 12:270–282.
Foster JC, Taylor JM, Ruberg SJ. Subgroup identification from randomized clinical trial data. Stat Med 2011; 30:2867–2880.
Huang EJ, Fang EX, Hanley DF, Rosenblum M. Inequality in treatment benefits: can we determine if a new treatment benefits the many or the few? Biostatistics 2017; 18(2):308–324.
Huang Y, Fong Y. Identifying optimal biomarker combinations for treatment selection via a robust kernel method. Biometrics 2014; 70(4):891–901.
Huang Y, Laber EB, Janes H. Characterizing expected benefits of biomarkers in treatment selection. Biostatistics 2015; 16(2):383–399.
Janes H, Brown MD, Huang Y, Pepe MS. An approach to evaluating and comparing biomarkers for patient treatment selection. Int J Biostat 2014; 10(1):99–121.
Janes H, Pepe MS, McShane LM et al. The fundamental difficulty with evaluating the accuracy of biomarkers for guiding treatment. J Natl Cancer Inst 2015; 107(8):djv157.
Kang C, Janes H, Huang Y. Combining biomarkers to optimize patient treatment recommendations. Biometrics 2014; 70(3):695–707.
Li J, Zhao L, Tian L, Cai T, Claggett B, Callegaro A, Dizier B, Spiessens B, Ulloa-Montoya F, Wei LJ. A predictive enrichment procedure to identify potential responders to a new therapy for randomized, comparative controlled clinical studies. Biometrics 2016; 72(3):877–887.
Lipkovich I, Dmitrienko A, Denne J, Enas G. Subgroup identification based on differential effect search - a recursive partitioning method for establishing response to treatment in patient subpopulations. Stat Med 2011; 30:2601–2621.
Porcher R, Jacot J, Biau D. Identifying treatment responders using counterfactual modeling and potential outcomes. Presented at EpiClin 2016, manuscript submitted.
Qian M, Murphy SA. Performance guarantees for individualized treatment rules. Ann Stat 2011; 39(2):1180–1210.
Shalit U, Johansson F, Sontag D. Estimating individual treatment effect: generalization bounds and algorithms. arXiv:1606.03976; 2016.
Shen J, Wang L, Taylor JMG. Estimation of the optimal regime in treatment of prostate cancer recurrence from observational data using flexible weighting models. Biometrics 2017; 73(2):635–645.
Shen J, Wang L, Daignault S, Spratt DE, Morgan TM, Taylor JMG. Estimating the optimal personalized treatment strategy based on selected variables to prolong survival via random survival forest with weighted bootstrap. J Biopharm Stat 2017 (ahead of print).
Su X, Tsai CL, Wang H, Nickerson DM, Li B. Subgroup analysis via recursive partitioning. J Mach Learn Res 2009; 10:141–158.
Zhang B, Tsiatis AA, Laber EB, Davidian M. A robust method for estimating optimal treatment regimes. Biometrics 2012; 68:1010–1018.
Zhao L, Tian L, Cai T, Claggett B, Wei LJ. Effectively selecting a target population for a future comparative study. J Am Stat Assoc 2013; 108:527–539.
Zhao YQ, Zeng D, Laber EB et al. Doubly robust learning for estimating individualized treatment with censored data. Biometrika 2015; 102(1):151–168.
Wager S, Athey S. Estimation and inference of heterogeneous treatment effects using random forests. J Am Stat Assoc 2017 (ahead of print).

General Information

How did you learn about the YODA Project?: 

Request Clinical Trials

What type of data are you looking for?: 
Individual Participant-Level Data, which includes Full CSR and all supporting documentation