Background: Using baseline patient characteristics, our Personalized Advantage Index (PAI) modeling calculates scores that indicate whether a patient is more likely to respond to an antidepressant or to placebo. Our novel approach of adding explainable artificial intelligence (XAI) generates lists of if-then rules over the predictive features, which supports physician understanding and allows clinical trials to be enriched with patient-level personalized predictions.
Objective: To determine if it is possible to generate a PAI and/or an XAI model that determines the endophenotypes most likely to respond to drug or placebo in patients with schizophrenia and bipolar disorder
Study Design: This study will apply the PAI model and XAI to data from the paliperidone and risperidone double-blind, placebo-controlled trials. This will be an analysis of previously published data only, and the datasets will be those already existing with the published inclusion and exclusion criteria.
Participants: We will analyze all available data for all study participant-level results from the paliperidone and risperidone trials
Main Outcome Measure(s): The measures used in the paliperidone and risperidone double blind placebo-controlled trials
Statistical analysis: In each study dataset, subjects will be assigned to the drug-indicated or placebo-indicated group based on the PAI model applied to the baseline data. For each indication, subjects will be randomly assigned to a study arm, and the standardized difference between clinical endpoint means will be determined using Cohen’s d.
The project’s goal is to determine if it is possible to generate a Personalized Advantage Index (PAI) and/or an explainable artificial intelligence (XAI) model that could help determine the endophenotypes that are most likely to respond to a therapeutic agent versus those more likely to respond to placebo in patients with schizophrenia and bipolar disorder. The project is significant because our preliminary results on depression trials have shown great promise in improving patient enrichment, but this approach has not yet been applied beyond depression trials. Thus, applying it more broadly could greatly help the scientific and medical fields to design more informative clinical trials and potentially revive interest in therapeutic agents discontinued for lack of apparent efficacy. Our team’s previous data science and machine learning work has focused on building analytical tools to support patient selection for targeted neurotherapeutic intervention. To generate unbiased yet interpretable insights on factors giving rise to treatment variability, we utilized XAI models and developed a rigorous data-driven approach to produce highly predictive and interpretable models of treatment responses under specific treatment options.
The initial data set analyzed was from a randomized, double-blind, placebo-controlled study of a nociceptin receptor antagonist as a potential treatment for Major Depressive Disorder (MDD) (study NEP-MDD-201). This study was negative based on the primary endpoint of the Montgomery-Åsberg Depression Rating Scale (MADRS) sum score. In a retrospective analysis, we built predictive models for each individual patient enrolled in the study, using only baseline data and setting the PAI threshold as a 4-point difference on MADRS at week 8 between the two arms. The effect sizes (Cohen’s d) of drug vs placebo were 0.68 and 0.59 within treatment- and placebo-indicated subgroups, respectively, compared to 0.03 based on the Completers Analysis Set (CAS). We found that compact nested rule lists were sufficient to support explainable enrichment strategies.
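The effect sizes quoted above are standardized mean differences. As a minimal sketch, assuming the conventional pooled-standard-deviation form of Cohen's d, the calculation could look like the following. The function name `cohens_d` and the example scores are hypothetical and are not data from NEP-MDD-201.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance weights each group's sample variance by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical week-8 improvement scores (illustration only).
drug_change = [12.0, 15.0, 9.0, 14.0, 11.0]
placebo_change = [8.0, 10.0, 7.0, 12.0, 9.0]
d = cohens_d(drug_change, placebo_change)
```

Computing `d` separately within the drug-indicated and placebo-indicated subgroups, and again on the full analysis set, gives the kind of comparison reported above.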
Aims: we propose the following three specific aims in this research project:
1. Thoroughly test the PAI and XAI models on indications other than MDD and with pharmacological agents other than antidepressants to establish the validity of the method on a wider population. Testing our methods on schizophrenia and bipolar disorder is a critical step in validating and generalizing our method to other indications.
2. Systematically investigate the biomarkers that are associated with each of the indications and pharmacological agents/placebo.
3. Expand our current XAI models to simultaneously leverage multiple measurement modalities to predict treatment/placebo responses for a variety of indications and pharmacological agents.
Objective: to determine if it is possible to generate a PAI and/or an XAI model that could help determine the endophenotypes that are most likely to respond to a therapeutic agent versus those more likely to respond to placebo in patients with schizophrenia and bipolar disorder. This could help the scientific and medical fields to design more informative clinical trials and potentially revive interest in therapeutic agents discontinued for lack of apparent efficacy.
This study will apply the PAI model and XAI to data for the paliperidone and risperidone double blind placebo-controlled trials. This will be an analysis of previously published data only and the datasets will be those already existing with the published inclusion and exclusion criteria.
The main outcome measures of our PAI+XAI approach will be model evaluation metrics such as model coefficients, accuracy, precision, recall, R² (coefficient of determination), and MSE (mean squared error). These will allow us to optimize our models, for example to maximize explained variance or accuracy. The models’ coefficients allow us to determine which predictor variables are most informative for creating the optimal XAI rule lists. In addition, we will examine effect sizes and statistical tests of the measurements from the trials.
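As an illustration of the evaluation metrics listed above, the sketch below computes them with scikit-learn's `sklearn.metrics` functions. The labels and scores are hypothetical toy values, and which metrics apply depends on whether a given model is a classifier (responder vs. non-responder) or a regressor (end-point score).

```python
from sklearn.metrics import (
    accuracy_score,
    mean_squared_error,
    precision_score,
    r2_score,
    recall_score,
)

# Hypothetical responder labels (1 = responder) and model predictions.
responder_true = [1, 0, 1, 1, 0, 0]
responder_pred = [1, 0, 0, 1, 1, 0]
classification_metrics = {
    "accuracy": accuracy_score(responder_true, responder_pred),
    "precision": precision_score(responder_true, responder_pred),
    "recall": recall_score(responder_true, responder_pred),
}

# Hypothetical observed and predicted end-point scores.
score_true = [10.0, 12.0, 14.0]
score_pred = [11.0, 12.0, 13.0]
regression_metrics = {
    "r2": r2_score(score_true, score_pred),
    "mse": mean_squared_error(score_true, score_pred),
}
```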
The main independent variable used in the study is treatment assignment. The treatment label will be coded numerically, with the drug group set to 0.5 and the placebo group set to -0.5.
Other variables of interest include demographics, clinical assessments, and medical histories. The PAI model leverages causal inference techniques and predicts each patient’s end-point measures under each treatment option. In particular, we will examine treatment-by-variable interactions to identify the effect within each treatment group.
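To make the treatment coding and the treatment-by-variable interactions concrete, here is a minimal sketch of how such a design matrix might be assembled with NumPy. The function name `build_design_matrix`, the column layout, and the example covariates are assumptions for illustration, not the study's actual pipeline.

```python
import numpy as np


def build_design_matrix(covariates, is_drug):
    """Return columns [centred covariates, treatment, treatment x covariates].

    covariates: (n_patients, n_features) array of baseline measures.
    is_drug:    (n_patients,) boolean array, True for the drug arm.
    """
    covariates = np.asarray(covariates, dtype=float)
    # Treatment coding from the text: drug = +0.5, placebo = -0.5.
    treatment = np.where(np.asarray(is_drug, dtype=bool), 0.5, -0.5)[:, None]
    # Mean-centre covariates so main effects refer to the average patient.
    centred = covariates - covariates.mean(axis=0)
    # Each interaction column is a covariate multiplied by the treatment code.
    interactions = centred * treatment
    return np.hstack([centred, treatment, interactions])


# Hypothetical baseline covariates (e.g. age and a baseline severity score).
X = build_design_matrix([[30, 80], [45, 95], [60, 110]], [True, False, True])
```

With the ±0.5 coding, the coefficient on each interaction column can be read as the difference in that covariate's slope between the drug and placebo arms.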
The main goal of the study is to expand our research effort to inform more efficient clinical trial designs targeting schizophrenia and bipolar disorder patients. Instead of traditional bio-statistical methods, our previously-developed Machine Learning (ML) approach, which will be applied to the currently proposed analysis, consisted of the following three major components:
1. We used an importance-guided forward selection technique to extract pre-treatment patient characteristics that are highly predictive of drug vs. placebo response from multiple data modalities (Importance-guided forward selection).
2. We improved existing algorithms to predict treatment outcome at the individual level based on the identified pre-treatment patient characteristics. By combining feature selection and treatment outcome predictions, we were able to identify subgroups of patients that showed a substantial differential response between drug and placebo [Personalized Advantage Index (PAI); ref. 2].
3. We utilized an in-house developed XAI algorithm that can generate highly interpretable rule lists from various data modalities (e.g., self-report, quantitative assessments) to explain the basis of drug vs. placebo response in study NEP-MDD-201 (Multiple Correspondence Analysis (MCA)-based rule mining for explainability).
Selection of the variables to include in the models
To identify the set of most predictive pre-treatment baseline features for treatment response within each arm (drug vs. placebo), we adopted the following importance-guided sequential model selection procedure. Two feature selection models were built, one for each arm (drug vs. placebo), to predict whether a given patient would be a responder at the end of treatment (defined as at least a 50% reduction in primary outcome score) based on the pre-treatment baseline patient characteristics. Specifically, we used logistic regression with elastic net regularization as our feature selection models. Elastic net regularization has been shown to be well suited to problems where the number of features is much greater than the number of observations. The area under the receiver operating characteristic curve (AUC) was selected as the metric to quantify model performance.
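The sketch below illustrates one such per-arm model: logistic regression with an elastic net penalty, scored by cross-validated AUC, using scikit-learn. The simulated data, the hyperparameters (`l1_ratio=0.5`, `C=0.5`), and the use of nonzero coefficients as the "selected" set are assumptions for illustration. The importance-guided forward selection itself is not reproduced here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Simulated data for a single arm: 120 patients, 40 baseline features,
# with responder status driven by the first two features.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 40))
y = (X[:, 0] - X[:, 1] + rng.normal(scale=0.5, size=120) > 0).astype(int)

model = make_pipeline(
    StandardScaler(),
    LogisticRegression(
        penalty="elasticnet",  # mix of L1 and L2 penalties
        solver="saga",         # the scikit-learn solver that supports elastic net
        l1_ratio=0.5,
        C=0.5,
        max_iter=5000,
    ),
)

# Five-fold cross-validated AUC as the performance metric.
auc_scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")

# Refit on all data and inspect which coefficients survive the L1 part of the penalty.
model.fit(X, y)
coefs = model.named_steps["logisticregression"].coef_.ravel()
selected = np.flatnonzero(coefs)
```

In practice one such model would be fitted for the drug arm and another for the placebo arm, and the penalty strength and mixing ratio would be tuned rather than fixed.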
Generation of the predicted end-point scores
Using multivariate linear regression modeling, we generated a prediction of the end-point score for each participant in each of the two treatment groups. To generate these predictions, we used 5-fold cross-validation to predict each individual’s end-point score, averaged over 1000 repetitions. All independent variables were normalized using preprocessing modules from the Python library scikit-learn: continuous measures were mean-centered, and dummy-coded values were used for categorical variables. For treatment groups, the drug group was set to 0.5 and the placebo group to -0.5. The magnitude of the predicted difference between the two treatment options was computed as an index of “predicted advantage”, or PAI.
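A minimal sketch of this procedure follows, assuming an ordinary least-squares model with treatment-by-covariate interactions and the ±0.5 treatment coding (drug = 0.5, placebo = -0.5). The function names, the simulated data, and the sign convention (placebo prediction minus drug prediction, so that positive values favor the drug on a lower-is-better scale) are assumptions. The default of 20 repetitions is chosen only to keep the example fast; the text describes 1000.

```python
import numpy as np


def design(X, t):
    """Columns: intercept, covariates, treatment code, treatment x covariates."""
    t = t[:, None]
    return np.hstack([np.ones((len(X), 1)), X, t, X * t])


def predicted_advantage(X, t, y, n_splits=5, n_repeats=20, seed=0):
    """Out-of-fold PAI: predicted end-point under placebo minus under drug."""
    rng = np.random.default_rng(seed)
    n = len(y)
    pai = np.zeros(n)
    for _ in range(n_repeats):
        # A fresh random 5-fold partition for every repetition.
        folds = np.array_split(rng.permutation(n), n_splits)
        for test_idx in folds:
            train = np.ones(n, dtype=bool)
            train[test_idx] = False
            beta, *_ = np.linalg.lstsq(
                design(X[train], t[train]), y[train], rcond=None
            )
            # Predict each held-out patient under both treatment options.
            drug = design(X[test_idx], np.full(len(test_idx), 0.5)) @ beta
            placebo = design(X[test_idx], np.full(len(test_idx), -0.5)) @ beta
            pai[test_idx] += placebo - drug
    return pai / n_repeats


# Simulated trial: the drug lowers the end-point score more when X[:, 0] is high.
rng = np.random.default_rng(1)
n = 200
X = rng.normal(size=(n, 3))
t = rng.choice([0.5, -0.5], size=n)
y = 20 + 2 * X[:, 1] - 4 * t * X[:, 0] + rng.normal(scale=1.0, size=n)

pai = predicted_advantage(X, t, y)
drug_indicated = pai > 0  # a clinically motivated threshold could be used instead
```

Because every prediction is made on a held-out fold, each patient's PAI is never computed from a model that saw that patient's own outcome.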
Construct interpretable rule lists
We previously used Multiple Correspondence Analysis to mine rules from a high-dimensional feature space and applied the Bayesian Rule Learning (BRL) framework to learn compact nested lists of rules that can classify each individual patient’s optimal treatment option.
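For readers unfamiliar with the output format, the sketch below shows how a nested if-then rule list might be represented and applied to one patient. It does not implement MCA-based rule mining or BRL; the `Rule` class, the feature names, and the thresholds are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Patient = Dict[str, float]


@dataclass
class Rule:
    description: str
    condition: Callable[[Patient], bool]
    recommendation: str


def classify(patient: Patient, rules: List[Rule], default: str) -> str:
    """Return the recommendation of the first rule whose condition holds."""
    for rule in rules:
        if rule.condition(patient):
            return rule.recommendation
    # The final "else" branch of the nested list.
    return default


# Hypothetical rules and feature names, for illustration only.
rules = [
    Rule(
        "baseline severity >= 30 and age < 40",
        lambda p: p["baseline_severity"] >= 30 and p["age"] < 40,
        "drug-indicated",
    ),
    Rule(
        "prior episodes <= 1",
        lambda p: p["prior_episodes"] <= 1,
        "placebo-indicated",
    ),
]

label = classify(
    {"baseline_severity": 32, "age": 35, "prior_episodes": 3},
    rules,
    default="placebo-indicated",
)
```

Rules are evaluated in order and the first match wins, which is what keeps a short list readable as a chain of "if ... else if ... else" statements.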
The purpose of our study is to evaluate and generalize our approach to a wide spectrum of neurobehavioral disorders, as well as pharmacological agents, to design more informative clinical trials. From this, we will develop XAI models and derive insights on personalized treatment that can be deployed as enrichment strategies in future clinical trials, thereby allowing highly effective therapeutics to be designed for targeted clinical populations with neurobehavioral disorders.
Estimation of key milestone dates for the proposed study, including:
• anticipated project start date = Jan-2020
• analysis completion date = Jan-2021
• date manuscript drafted = Sep-2020
• first submitted for publication = Oct-2020
• date results reported back to the YODA Project = Oct-2020
Plans for publishing the results of the data analysis include, but are not limited to, the following:
• Leading journals in the fields of machine learning, psychiatry, and neuroscience such as JAMA Psychiatry, Journal of Psychiatric Research, The Lancet Psychiatry, BMC Medical Informatics and Decision Making, and Schizophrenia Research
• Conferences: SOBP, ACNP, SfN, KDD, AI for Health Care
• If not accepted for journal publication, then bioRxiv
1. Leucht, S. et al. Putting the efficacy of psychiatric and general medicine medication into perspective: review of meta-analyses. Br J Psychiatry 200, 97–106 (2012).
2. Webb, C. A. et al. Personalized prediction of antidepressant v. placebo response: evidence from the EMBARC study. Psychol Med 1–10 (2018).
3. DeRubeis, R. J. et al. The Personalized Advantage Index: Translating Research on Prediction into Individualized Treatment Recommendations. A Demonstration. Plos One 9, e83875 (2014).
4. Gao, Q. et al. MCA-based Rule Mining Enables Interpretable Inference in Clinical Psychiatry. In 2019 International Workshop on Health Intelligence. arXiv:1810.11558 (2019).
5. Zbozinek, T. D. et al. Diagnostic overlap of generalized anxiety disorder and major depressive disorder in a primary care sample. Depress Anxiety 29, 1065–1071 (2012).
6. Liu, Y. et al. Machine learning identifies large-scale reward-related activity modulated by dopaminergic enhancement in major depression. Biological Psychiatry: Cognitive Neuroscience and Neuroimaging (2019).
7. Zou, H. et al. Regularization and variable selection via the elastic net. J R Stat Soc B 67, 301–320 (2005).