
2019-4077

Research Proposal

Project Title: 
Evaluation of debiased machine learning methods to estimate the average treatment effect in observational studies.
Scientific Abstract: 

Background :
Machine learning [ML] methods can fit a wide class of functions and offer the possibility of working with high-dimensional covariates. Recent approaches [Chernozhukov 2017] introduce asymptotically unbiased estimators that rely on ML to model both the treatment and the outcome.
Objective :
Study the robustness of treatment effect [TE] estimators based on ML models and compare them to traditional estimators [Austin 2015, Glynn 2010].
Study Design :
A pool of clinical trials that share a common treatment (Canagliflozin) in one of their arms is used to compare the different estimators.
Participants :
We use the following trials, each of which includes at least one arm treated with Canagliflozin:
NCT00968812 NCT01032629 NCT01081834 NCT01106625 NCT01106651 NCT01106677 NCT01137812 NCT01989754 NCT02025907
Main Outcome Measure:
The study focuses on the TE on the change in HbA1c from baseline to week 26.
Statistical Analysis:
In each of the following settings, we compute the estimate, its standard error, and the confidence interval [CI]:
* Assessing a treatment effect of zero between the Cana. arm of one trial and the Cana. arms of the remaining trials.
* Given a Cana. arm, we add an artificial positive treatment effect $\tau$ to each patient outcome. We then assess a treatment effect of $\tau$ between the artificially modified Cana. arm and the remaining Cana. arms.
* Given a trial, we replace the Cana. arm by the pool of Cana. arms from the remaining trials. The CIs obtained using the different estimators are then compared to the CI obtained using OLS and the two arms of the random

Brief Project Background and Statement of Project Significance: 

There is growing interest in complementing a single-arm trial with historical data. Recently, Amgen obtained breakthrough therapy designation from the FDA and conditional marketing authorization from the EMA for Blincyto (blinatumomab) for the treatment of a rare form of leukemia, using a pooled database of historical controls. The additional analysis requested by the EMA was performed using propensity score matching. Beyond approvals, regulatory agencies have encouraged exploring the use of synthetic control arms. A white paper [DFH18] written by Medidata and FDA scientists was presented at a Friends of Cancer Research meeting in December 2018; it focused on the use of propensity score matching to reproduce the control arm. Although ML-based treatment effect estimators are already used in econometrics [Chernozhukov 2017], this is not yet the case in healthcare. Such estimators could be used to gain insight from observational data or to inform decision making in drug development by augmenting a single-arm phase 2 trial with historical controls.

Specific Aims of the Project: 

The objective of the project is to assess the performance of recently developed machine learning methods for treatment effect estimation [Chernozhukov 2017] and to compare them with classical methods for estimating treatment effects in observational studies: propensity score matching and propensity score adjustment methods [Austin 2015, Robins 1995, Hahn 1998].

What is the purpose of the analysis being proposed? Please select all that apply.: 
Develop or refine statistical methods
Research on clinical trial methods
Research on comparison group
Research on clinical prediction or risk prediction
Software Used: 
R
Data Source and Inclusion/Exclusion Criteria to be used to define the patient sample for your study: 

Every patient participating in the pooled trials is a potential source of information. Because the methods require training ML models, the more patients available, the better. We selected trials with overlapping inclusion/exclusion criteria so that the positivity assumption is plausible.
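
For reference, the positivity (overlap) assumption is commonly written as below. The notation is an assumption introduced here for illustration and does not come from the proposal: $D$ is a binary group indicator (for example, the arm of interest versus the pooled arms) and $X$ is the vector of baseline covariates shared across trials.

$$
0 < P(D = 1 \mid X = x) < 1 \quad \text{for all } x \text{ in the support of } X.
$$

Trials with non-overlapping eligibility criteria would violate this condition for some covariate profiles, which is why overlapping criteria matter for the comparison.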

Main Outcome Measure and how it will be categorized/defined for your study: 

HbA1c change from baseline to week 26, which is a shared outcome across the prespecified trials.

Main Predictor/Independent Variable and how it will be categorized/defined for your study: 

There is not one main predictor variable for the proposed study. Outcome and treatment models will be trained using machine learning on baseline characteristics that are shared among the clinical trials used.

Other Variables of Interest that will be used in your analysis and how they will be categorized/defined for your study: 

Baseline demographic and biologic variables will be used to train outcome and treatment models. Using those models, we will benchmark the different estimators of the average treatment effect.

Statistical Analysis Plan: 

A pool of clinical trials that share a common treatment (Canagliflozin) in one of their arms will be used to compare the different estimators. The outcome of interest for computing the treatment effect will be the change in HbA1c from baseline.
The evaluation of the estimators will be done in three stages:
* Experiment 1: Assessing a treatment effect of zero between the Canagliflozin arm of one trial and the Canagliflozin arms of the remaining trials.
* Experiment 2: Given a Canagliflozin arm, we add an artificial positive treatment effect $\tau$ to each patient outcome. We then assess a treatment effect of $\tau$ between the artificially modified Canagliflozin arm and the remaining Canagliflozin arms.
* Experiment 3: Given a trial, we replace the Canagliflozin arm by the pool of Canagliflozin arms from the remaining trials. The confidence intervals obtained using the different estimators are then compared to the confidence interval originally obtained using OLS and the two arms of the trial.
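
The logic of Experiments 1 and 2 can be sketched on simulated data. This is a minimal illustration, not the proposal's implementation: it is written in Python with the standard library only (the proposal lists R as its software), the function names `add_artificial_effect`, `difference_in_means` and `simulate_arm` are hypothetical, the outcome distribution is invented, and the unadjusted difference in means stands in for the estimators under evaluation.

```python
import random
import statistics
from math import sqrt


def add_artificial_effect(outcomes, tau):
    """Experiment 2: shift every outcome in one arm by a known effect tau."""
    return [y + tau for y in outcomes]


def difference_in_means(treated, control, z=1.96):
    """Unadjusted estimate, standard error and normal-approximation 95% CI."""
    estimate = statistics.fmean(treated) - statistics.fmean(control)
    se = sqrt(statistics.variance(treated) / len(treated)
              + statistics.variance(control) / len(control))
    return estimate, se, (estimate - z * se, estimate + z * se)


def simulate_arm(n, seed):
    """Hypothetical HbA1c changes from baseline; simulated, NOT trial data."""
    rng = random.Random(seed)
    return [rng.gauss(-0.8, 0.9) for _ in range(n)]


if __name__ == "__main__":
    target_arm = simulate_arm(300, seed=1)    # arm of one trial
    pooled_arms = simulate_arm(1200, seed=2)  # arms of the remaining trials

    # Experiment 1: both groups received the same treatment, so the
    # reference value for the treatment effect is zero.
    print("Experiment 1:", difference_in_means(target_arm, pooled_arms))

    # Experiment 2: the reference value is the injected effect tau.
    tau = 0.5
    shifted_arm = add_artificial_effect(target_arm, tau)
    print("Experiment 2:", difference_in_means(shifted_arm, pooled_arms))
```

Because the reference effect is known by construction in both experiments (zero, then $\tau$), each estimator's bias and confidence interval coverage can be checked directly. In the real setting the arms come from different trials, so an unadjusted comparison like this one would not account for differences in baseline covariates.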

We will use the following trials, each of which includes at least one arm treated with Canagliflozin:
NCT00968812, NCT01032629, NCT01081834, NCT01106625, NCT01106651, NCT01106677, NCT01137812, NCT01989754, NCT02025907.

The study includes a comparison of the following estimators:
* Propensity score matching with treatment model trained using logistic regression and logit propensity score [Imai 2014];
* Inverse probability weighting with the treatment model trained using logistic regression and logit propensity score [Imai 2014];
* Double Machine Learning methods with sample-splitting and cross-fitting procedure [Chernozhukov 2017]. This method requires both outcome and treatment models. We will explore different methods for both models: random forest, gradient boosting models, SVM, logistic regression and linear regression.
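
The sample-splitting and cross-fitting procedure in the last item can be sketched as follows. This is a hedged illustration under stated assumptions, not the proposal's implementation: it is written in Python with the standard library only (the proposal lists R), the names `fit_stratum_mean`, `dml_ate` and `simulate` are hypothetical, and a per-stratum mean over a single discrete covariate stands in for the random forest, gradient boosting, SVM or regression learners. The estimate is the average of a doubly robust score in which the nuisance models for each fold are trained on the other folds.

```python
import random
import statistics
from math import sqrt


def fit_stratum_mean(xs, ys):
    """Toy learner: predict the mean of ys within each level of a discrete x."""
    overall = statistics.fmean(ys)
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(x, []).append(y)
    means = {x: statistics.fmean(v) for x, v in groups.items()}
    return lambda x: means.get(x, overall)  # unseen level: overall mean


def dml_ate(x, d, y, n_folds=2, seed=0, clip=0.01):
    """Cross-fitted average treatment effect with a normal-approximation CI."""
    n = len(y)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    scores = [0.0] * n
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        # Treatment model m(x) = P(D = 1 | X = x), fitted off-fold.
        m = fit_stratum_mean([x[i] for i in train], [d[i] for i in train])
        # Outcome models g(d, x) = E[Y | D = d, X = x], one per arm, off-fold.
        g = {}
        for arm in (0, 1):
            rows = [i for i in train if d[i] == arm]
            g[arm] = fit_stratum_mean([x[i] for i in rows],
                                      [y[i] for i in rows])
        # Evaluate the doubly robust score on the held-out fold only.
        for i in fold:
            p = min(max(m(x[i]), clip), 1 - clip)  # guard against 0 or 1
            g1, g0 = g[1](x[i]), g[0](x[i])
            scores[i] = (g1 - g0
                         + d[i] * (y[i] - g1) / p
                         - (1 - d[i]) * (y[i] - g0) / (1 - p))
    theta = statistics.fmean(scores)
    se = statistics.stdev(scores) / sqrt(n)
    return theta, se, (theta - 1.96 * se, theta + 1.96 * se)


def simulate(n, tau, seed):
    """Hypothetical confounded data: x drives both treatment and outcome."""
    rng = random.Random(seed)
    x = [rng.choice([0, 1, 2]) for _ in range(n)]
    d = [1 if rng.random() < 0.3 + 0.2 * xi else 0 for xi in x]
    y = [-0.5 * xi + tau * di + rng.gauss(0, 1) for xi, di in zip(x, d)]
    return x, d, y


if __name__ == "__main__":
    x, d, y = simulate(4000, tau=1.0, seed=0)
    print(dml_ate(x, d, y))
```

Swapping `fit_stratum_mean` for a more flexible learner leaves the rest of the procedure unchanged, which is what makes it possible to compare several model families for the outcome and treatment models within one framework.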

Narrative Summary: 

The growing amount of data accumulated in hospital EHR systems and in past clinical trials represents a unique asset that can help improve clinical decisions and optimize drug development. Treatment effect estimation is one of the questions that can be addressed using various types of real-world data. It remains a challenging task, however, owing to biases induced by confounding factors and by model regularization techniques when classical estimators are used. In this study, we intend to evaluate approaches recently developed in the econometrics literature that combine Neyman orthogonal score functions with ML predictive models [Chernozhukov 2017].
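
For readers unfamiliar with the term, the Neyman orthogonal score for the average treatment effect $\theta$ is usually written as below. The notation is an assumption introduced here, not taken from the proposal: $W = (Y, D, X)$ with outcome $Y$, binary treatment $D$ and covariates $X$; $g(d, x) = E[Y \mid D = d, X = x]$ is the outcome model and $m(x) = P(D = 1 \mid X = x)$ is the treatment model.

$$
\psi(W; \theta, g, m) = g(1, X) - g(0, X) + \frac{D\,\bigl(Y - g(1, X)\bigr)}{m(X)} - \frac{(1 - D)\,\bigl(Y - g(0, X)\bigr)}{1 - m(X)} - \theta
$$

The estimator solves $\frac{1}{n} \sum_{i=1}^{n} \psi(W_i; \hat{\theta}, \hat{g}, \hat{m}) = 0$. Orthogonality means that the expectation of $\psi$ is insensitive, to first order, to small errors in $\hat{g}$ and $\hat{m}$, so the regularization bias of the ML models has only a second-order effect on $\hat{\theta}$.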

Project Timeline: 

Project start date: January 1, 2020
Analysis completion: April 1, 2020
Manuscript draft completion: May 1, 2020

Dissemination Plan: 

We plan to submit this research as a research article to one of the following journals: ‘Statistics in Medicine’ or ‘Statistical Methods in Medical Research’.

Bibliography: 

Chernozhukov, Victor, Denis Chetverikov, Mert Demirer, Esther Duflo, Christian Hansen, and Whitney Newey. 2017. "Double/Debiased/Neyman Machine Learning of Treatment Effects." *American Economic Review* 107(5): 261--65. <https://doi.org/10.1257/aer.p20171038>.

Austin, Peter C., and Elizabeth A. Stuart. 2015. "Moving Towards Best Practice When Using Inverse Probability of Treatment Weighting (IPTW) Using the Propensity Score to Estimate Causal Treatment Effects in Observational Studies." *Statistics in Medicine* 34 (28): 3661--79. <https://doi.org/10.1002/sim.6607>.

Glynn, Adam N., and Kevin M. Quinn. 2010. "An Introduction to the Augmented Inverse Propensity Weighted Estimator." *Political Analysis* 18 (1): 36--56. <https://doi.org/10.1093/pan/mpp036>.

Robins, J., and A. Rotnitzky. 1995. "Semi-parametric Efficiency in Multivariate Regression Models with Missing Data." *Journal of the American Statistical Association* 90: 122--29.

Hahn, J. 1998. "On the Role of the Propensity Score in Efficient Semi-parametric Estimation of Average Treatment Effects." *Econometrica* 66: 315--31.

Imai, K., and M. Ratkovic. 2014. "Covariate Balancing Propensity Score." *Journal of the Royal Statistical Society: Series B (Statistical Methodology)* 76 (1): 243--63.

Supplementary Material: 

General Information

How did you learn about the YODA Project?: 
Internet Search

Request Clinical Trials

Associated Trial(s): 
What type of data are you looking for?: 
Individual Participant-Level Data, which includes Full CSR and all supporting documentation

Data Request Status

Change the status of this request: 
Ongoing