2024-0980 - The YODA Project


General Information

How did you learn about the YODA Project?: Internet Search

Conflict of Interest

Conflict-of-Interest.pdf

Request Clinical Trials

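Associated Trial(s):
NCT04614948 - A Randomized, Double-blind, Placebo-controlled Phase 3 Study to Assess the Efficacy and Safety of Ad26.COV2.S for the Prevention of SARS-CoV-2-mediated COVID-19 in Adults Aged 18 Years and Older
NCT00210925 - A Multicenter, Randomized, Double-Blind, Placebo-Controlled, Flexible Dose Study to Assess the Safety and Efficacy of Topiramate in the Treatment of Alcohol Dependence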

What type of data are you looking for?: Individual Participant-Level Data, which includes Full CSR and all supporting documentation

Data Request Status

Status: Ongoing

Research Proposal

Project Title: Can Quantum Machine Learning Accurately Predict the Therapeutic Outcomes of Drugs Based on Clinical Data?

Scientific Abstract: Background
Predictive modeling of drug therapeutic outcomes is critical for advancing precision medicine and improving patient care. Traditional machine learning has made notable progress in this domain but struggles with scalability and computational efficiency when handling complex clinical datasets. Quantum machine learning (QML) offers a transformative approach, leveraging quantum computing's unique capabilities to address these challenges.
Objective
This study aims to evaluate the feasibility and accuracy of QML in predicting drug therapeutic outcomes using clinical data, comparing its performance with classical machine learning methods.
Study Design
The study employs a computational design, combining theoretical and empirical approaches. QML algorithms will be developed and benchmarked against classical machine learning models using curated clinical datasets.
Participants
De-identified clinical datasets representing diverse patient populations and therapeutic outcomes will be used. These data will be sourced from public repositories or institutional collaborations to ensure representativeness.
Outcome Measures
Primary: Predictive accuracy of QML algorithms.
Secondary: Computational efficiency, scalability, and robustness to noise and missing data.
Statistical Analysis
Performance metrics, including accuracy, precision, recall, F1-score, and ROC curve analysis, will evaluate predictive accuracy. Efficiency and scalability will be assessed by comparing runtime and resource utilization. Sensitivity analyses will determine robustness under various data conditions.
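The performance metrics named above can be illustrated with a minimal Python sketch. It assumes a binary outcome coded as 0/1 and uses only the standard library; the function names `classification_metrics` and `roc_auc` are hypothetical and are not taken from the proposal. The AUC is computed with the pairwise (Mann-Whitney) formulation, which is one common way to obtain the area under the ROC curve.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive case outranks a negative one (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Applying the same two functions to the predictions of a QML model and of a classical model on the same held-out set would give a like-for-like comparison of the kind the abstract describes.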

Brief Project Background and Statement of Project Significance: Background
Predicting therapeutic outcomes is essential for drug discovery, personalised medicine, and clinical decision-making. Machine learning (ML) has been instrumental in analysing high-dimensional clinical data; however, as datasets grow in size and complexity (e.g., integrating genomic, demographic, and environmental factors), classical ML algorithms face scalability and computational bottlenecks. Quantum machine learning, a promising field at the intersection of quantum computing and artificial intelligence, has the potential to overcome these limitations through quantum-enhanced data processing and optimisation.
Research Problem
The question this research seeks to address is: Can quantum machine learning provide accurate and computationally efficient predictions of therapeutic outcomes compared to classical machine learning approaches?
Significance of the Study
This study is among the first to explore QML applications in the domain of drug efficacy prediction, potentially bridging the gap between quantum computing and healthcare. The findings could accelerate drug development, enhance precision medicine, and pave the way for future interdisciplinary research.

Specific Aims of the Project: Primary Objective
To evaluate the accuracy and efficiency of QML models in predicting therapeutic outcomes from clinical data.
Specific Objectives
1. Develop and implement quantum machine learning models suitable for analyzing clinical data.
2. Compare the performance of QML models against classical ML models on benchmark datasets.
3. Identify clinical data types and features that are most suitable for quantum-enhanced predictions.
4. Analyze the computational trade-offs and scalability of QML models in real-world healthcare scenarios.
Research Questions
1. How does the prediction accuracy of QML models compare to that of classical ML models for therapeutic outcome prediction?
2. What types of clinical data (e.g., genomic, demographic) are most effectively utilized by QML models?
3. What are the computational challenges and benefits of deploying QML in this domain?
4. How can hybrid quantum-classical models address current hardware limitations?

Study Design: Methodological research

What is the purpose of the analysis being proposed? Please select all that apply.: Other

Software Used: R

Data Source and Inclusion/Exclusion Criteria to be used to define the patient sample for your study: Inclusion Criteria

All available de-identified individual participant-level data (IPD) from the requested trials will be included, provided they meet the necessary parameters for training machine learning models. Specifically:
Participants with complete baseline and outcome data.
Relevant demographic and clinical characteristics required for feature engineering (e.g., age, gender, treatment data, laboratory results).

Exclusion Criteria
Participants with missing or incomplete key variables required for the machine learning model will be excluded.

Quantum Machine Learning Platform
The individual participant-level data (IPD) will be preprocessed and analyzed using a hybrid quantum-classical machine learning approach.

Quantum Tools: IBM Quantum for implementing quantum models.
Classical Tools: Python libraries such as TensorFlow Quantum or scikit-learn for preprocessing, feature selection, and hybrid modeling.

Data will undergo preprocessing, including feature engineering and normalization, to ensure compatibility with quantum encodings such as amplitude or angle encoding.
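
To make the two encodings concrete, here is a minimal classical-side sketch of the normalization each one requires before a circuit is built. It assumes angle encoding maps each feature linearly to a rotation angle in [0, pi] and amplitude encoding pads the feature vector to a power-of-two length and scales it to unit L2 norm; the function names and the choice of range are illustrative assumptions, and no quantum library is called.

```python
import math

def angle_encode(features, lows, highs):
    """Map each feature from [low, high] to a rotation angle in [0, pi]."""
    angles = []
    for x, lo, hi in zip(features, lows, highs):
        scaled = (x - lo) / (hi - lo) if hi > lo else 0.0
        scaled = min(max(scaled, 0.0), 1.0)  # clip values outside the range
        angles.append(math.pi * scaled)
    return angles

def amplitude_encode(features):
    """Pad to the next power of two and scale to unit L2 norm."""
    size = 1
    while size < len(features):
        size *= 2
    padded = list(features) + [0.0] * (size - len(features))
    norm = math.sqrt(sum(v * v for v in padded))
    if norm == 0.0:
        raise ValueError("cannot amplitude-encode an all-zero vector")
    return [v / norm for v in padded]
```

Under these assumptions, angle encoding uses one qubit per feature, whereas amplitude encoding stores a vector of padded length n in log2(n) qubits, which is one reason the choice of encoding affects how many features a given device can handle.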

Primary and Secondary Outcome Measure(s) and how they will be categorized/defined for your study: Primary Outcome Measure(s)
The primary outcome measure reflects the accuracy and predictive performance of quantum machine learning (QML) models in predicting therapeutic outcomes based on individual participant-level clinical data.
List of Primary Outcomes
Predictive Accuracy: The accuracy of QML models in predicting therapeutic outcomes (e.g., response to treatment, progression-free survival).
Measurement: Evaluated using metrics such as Area Under the Receiver Operating Characteristic Curve (AUC-ROC), precision, recall, and F1 score.
Source: Derived from clinical trials provided by the YODA Project and benchmarked against classical machine learning models.
Model Robustness: Robustness of QML models in handling missing or noisy clinical data.
Measurement: Changes in predictive performance when introducing perturbations or missing data during testing.
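
The AUC-ROC measurement and the robustness comparison described above can be sketched as follows. This is a minimal illustration that computes AUC as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case (ties count one half); the names `auc_roc` and `auc_change` are assumptions for this example.

```python
def auc_roc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))

def auc_change(y_true, clean_scores, perturbed_scores):
    """Change in AUC when the model is scored on perturbed test inputs."""
    return auc_roc(y_true, perturbed_scores) - auc_roc(y_true, clean_scores)
```

A value of `auc_change` close to zero would indicate that the perturbation had little effect on ranking performance; a large negative value would indicate sensitivity to noise or missingness.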

Main Predictor/Independent Variable and how it will be categorized/defined for your study: Main Independent Variable(s)
Drug Intervention: The type, dosage, or combination of drugs administered to participants in the clinical trials.
Operationalization: Coded as categorical variables indicating the specific drug(s) or treatment arms.
Example:
Variable Name: Drug_Type
Categories: Drug A, Drug B, Combination Therapy (A+B), Placebo.
Relevance: Assesses how different drug interventions independently affect therapeutic outcomes such as progression-free survival (PFS) or response rates.
Clinical and Demographic Features
Baseline clinical and demographic characteristics of the participants used as predictors in the quantum machine learning model.
Key Variables:
Age: Continuous variable (e.g., in years).
Gender: Categorical variable (e.g., Male, Female, Other).
Comorbidities: Binary or categorical (e.g., presence/absence or type of comorbidity).
Baseline Biomarkers: Continuous or categorical, depending on the type (e.g., blood pressure, cholesterol levels).
Relevance: Evaluates the independent contribution of these characteristics to predicting therapeutic outcomes.
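
The variable coding described above can be illustrated with a short sketch. It assumes one-hot coding for the example `Drug_Type` categories and z-score standardization for a continuous variable such as age; the category list is taken from the example above, while the function names are hypothetical.

```python
from statistics import mean, pstdev

# Example categories from the Drug_Type illustration in the text.
DRUG_CATEGORIES = ["Drug A", "Drug B", "Combination Therapy (A+B)", "Placebo"]

def one_hot(value, categories):
    """Return a 0/1 indicator list with a single 1 at the matching category."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == c else 0 for c in categories]

def z_scores(values):
    """Standardize a continuous variable to mean 0 and unit (population) SD."""
    mu = mean(values)
    sd = pstdev(values)
    return [(v - mu) / sd if sd else 0.0 for v in values]
```

For example, `one_hot("Placebo", DRUG_CATEGORIES)` yields an indicator vector with the last position set, and `z_scores` can be applied to the age column before any quantum encoding step.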

Other Variables of Interest that will be used in your analysis and how they will be categorized/defined for your study: Baseline Health Status: Health status or disease severity before the start of the intervention.
Operationalization:
Variable Name: Disease_Severity
Categories: Mild, Moderate, Severe (based on trial-specific scales or thresholds).
Relevance: Determines how initial health conditions influence the treatment effect and outcome predictions.
Trial-Specific Variables: Characteristics specific to individual clinical trials, such as duration of treatment or adherence rates.
Key Variables:
Treatment Duration: Continuous variable (e.g., in weeks or months).
Adherence: Binary or categorical variable (e.g., fully adherent, partially adherent, non-adherent).
Relevance: Assesses potential trial-specific effects on outcome measures.
Alignment with Outcome Measures
The independent variables are selected to:
Train the quantum machine learning model to predict therapeutic outcomes such as treatment response or progression-free survival.
Evaluate how each variable independently contributes to predictive accuracy, robustness, and model performance.
Comparison with Final Analysis
Consistency: The independent variables defined here will be consistently used throughout the study.

Statistical Analysis Plan: This analytic approach integrates study-specific variables from clinical trials NCT04614948 and NCT00210925 into a framework for evaluating quantum machine learning (QML) models to predict therapeutic outcomes. It emphasizes a structured Data Analysis Plan, combining traditional statistical methods with QML techniques for robust and interpretable results.
Data Analysis Plan
Preprocessing and Partitioning
Data will undergo rigorous preprocessing to address missing values (e.g., multiple imputation for continuous variables, mode imputation for categorical variables), standardize and normalize continuous variables (e.g., age, biomarkers), encode categorical variables (e.g., one-hot or label encoding), and handle outliers using the interquartile range (IQR) or z-scores. Datasets will be split into training (70-80%) and testing (20-30%) subsets, with validation subsets for hyperparameter tuning as needed.
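
Three of the preprocessing steps above can be sketched with the Python standard library: mode imputation for a categorical variable, IQR-based outlier bounds, and a seeded train/test split. This is a simplified illustration; multiple imputation is not shown, and the function names, the 1.5 x IQR multiplier, and the 20% test fraction are assumptions chosen for the example.

```python
import random
from statistics import mode, quantiles

def mode_impute(values):
    """Replace None entries with the most frequent observed value."""
    observed = [v for v in values if v is not None]
    fill = mode(observed)
    return [fill if v is None else v for v in values]

def iqr_bounds(values, k=1.5):
    """Return (lower, upper) fences at Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def split_train_test(rows, test_fraction=0.2, seed=0):
    """Shuffle reproducibly and split rows into (train, test)."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = round(len(rows) * test_fraction)
    return rows[n_test:], rows[:n_test]
```

Fixing the seed keeps the partition reproducible, which matters when QML and classical models are to be compared on the same held-out participants.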
Descriptive and Bivariate Analysis
Continuous variables will be summarized using means, medians, and standard deviations, while categorical variables will be analyzed using frequencies and percentages. Visualizations such as histograms, bar charts, and boxplots will support exploratory analysis. Bivariate methods, such as correlation analysis (Pearson/Spearman), t-tests, chi-square tests, and non-parametric tests (e.g., Mann-Whitney U), will explore relationships between predictors and outcomes.
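
As one example of the bivariate methods listed above, the following minimal sketch computes the Pearson chi-square statistic for a contingency table of counts (for instance, treatment arm by response). It omits any continuity correction and does not compute a p-value; the function name is an assumption for this example.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat
```

The statistic would then be compared with a chi-square distribution with (rows - 1) x (columns - 1) degrees of freedom, typically via a statistics package.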
Multivariable and Advanced Analysis
Regression models (logistic, linear, and Cox proportional hazards) will evaluate the independent effects of variables on primary and secondary outcomes. Advanced techniques like propensity score matching will reduce confounding, while Kaplan-Meier curves and log-rank tests will be used for time-to-event analyses. Cox models will adjust for covariates, and dose-response analyses will assess therapeutic impacts.
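
The Kaplan-Meier estimate mentioned above can be illustrated with a short sketch. It assumes each participant contributes a follow-up time and an event indicator (1 = event observed, 0 = censored) and returns the estimated survival probability at each distinct event time; the function name and data layout are assumptions for this example.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one event."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all participants sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_removed  # events and censored both leave the risk set
    return curve
```

Censored participants reduce the number at risk without lowering the survival estimate, which is what distinguishes this estimator from a simple proportion of event-free participants.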
Study-Specific Key Variables
For NCT04614948 (COVID-19 vaccine), outcome variables include vaccine efficacy (e.g., prevention of symptomatic COVID-19), immunological response (e.g., antibody titers), and adverse events. Covariates include demographics (age, gender), baseline health (comorbidities, prior infections), and vaccination characteristics (dosing schedules).
For NCT00210925 (alcohol dependence treatment), outcomes focus on the percentage of heavy drinking days, quality of life changes, and treatment-related adverse events. Covariates include demographics (age, socio-economic factors), treatment adherence, and baseline drinking patterns.
Quantum Machine Learning
QML models will encode clinical data using amplitude or angle encoding, leveraging hybrid quantum-classical platforms like TensorFlow Quantum. Feature selection will utilize quantum-enhanced techniques to identify predictive variables. QML models will be evaluated against classical machine learning methods (e.g., Random Forest, SVM) using metrics such as AUC-ROC, F1-score, and precision. Validation will include k-fold cross-validation and sensitivity analyses to test model robustness.
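
The k-fold cross-validation step can be sketched as follows. This minimal example only generates the train/test index partitions; model fitting is left out. The function name, the default of five folds, and the fixed seed are assumptions for illustration.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test
```

Using the same folds for the QML model and each classical baseline (e.g., Random Forest, SVM) keeps the metric comparison paired, so differences are less likely to reflect a favourable split.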
Expected Insights
Key insights include understanding factors influencing vaccine efficacy (NCT04614948) and evaluating topiramate's effects on alcohol dependence outcomes (NCT00210925). QML's predictive accuracy and feature selection capabilities will also be assessed. Results will be documented through visualizations (e.g., survival curves, feature importance rankings) and statistical summaries to ensure transparency and applicability to clinical decision-making. This approach aligns QML analysis with study-specific data, maximizing the utility of findings.

Narrative Summary: The predictive modelling of drug therapeutic outcomes is critical for advancing precision medicine and optimising patient care. While classical machine learning methods have demonstrated significant progress in this domain, they often face challenges with scalability and computational efficiency when processing complex clinical datasets. Quantum machine learning (QML) offers a novel approach, leveraging quantum computing's unique properties to address these limitations. This dissertation aims to investigate whether QML can accurately predict therapeutic outcomes of drugs using clinical data, and if so, how its performance compares to classical methods.

Project Timeline: Key Milestone Dates

1. Anticipated Project Start Date: January 7, 2025. Data access granted, project initiation, and initial coordination.

2. Data Preparation and Preprocessing: February 15, 2025

3. Exploratory data analysis to understand distributions, trends, and relationships: March 15, 2025

4. Descriptive, bivariate, and multivariable analyses; development, training, and validation of quantum machine learning models; benchmarking of QML against classical methods: April 15, 2025

5. Drafting the manuscript, including methodology, results, and discussion: August 15, 2025

6. Initial review by collaborators and advisors: October 15, 2025

7. First Submission for Publication Date: December 15, 2025

Dissemination Plan: Primary Manuscript
Title (tentative): "Evaluating the Predictive Power of Quantum Machine Learning for Therapeutic Outcomes Using Clinical Trial Data."
Content: Comprehensive description of the methodology, analysis, results, and implications of using quantum machine

Findings will be presented at national and international conferences related to quantum computing, clinical trials, and healthcare analytics.

A summary report tailored for non-technical stakeholders (e.g., healthcare providers, policy-makers) explaining the potential of QML in precision medicine.

Code or algorithms developed during the study (subject to permissions and ethical guidelines) may be shared via platforms like GitHub to encourage reproducibility and collaboration.

Bibliography:

Gircha, A. I., Boev, A. S., Avchaciov, K., Fedichev, P. O., & Fedorov, A. K. (2023). Hybrid quantum-classical machine learning for generative chemistry and drug design. Scientific Reports. https://doi.org/10.1038/s41598-023-32703-4
Summary:
This study introduces a hybrid quantum-classical machine learning model that integrates a discrete variational autoencoder (DVAE) with a restricted Boltzmann machine (RBM) in the latent space. Trained on the ChEMBL dataset, the model successfully generated novel drug-like molecules, demonstrating feasibility on existing quantum annealing devices like the D-Wave Advantage. It highlights potential improvements in generative chemistry and drug design through hybrid quantum-classical approaches.

Li, W., Yin, Z., Li, X., Ma, D., Yi, S., Zhang, Z., Zou, C., Bu, K., Dai, M., Yue, J., Chen, Y., Zhang, X., & Zhang, S. (2024). A hybrid quantum computing pipeline for real world drug discovery. Scientific Reports. https://doi.org/10.1038/s41598-024-67897-8
Summary:
This paper presents a hybrid quantum computing pipeline addressing real-world drug discovery tasks, including Gibbs free energy calculations for prodrug activation and covalent bond interaction simulations. Leveraging the Variational Quantum Eigensolver (VQE) framework, the study showcases the pipeline’s potential for complex molecular simulations and its application in KRAS inhibitor design, marking significant progress in quantum-enhanced drug discovery workflows.

Domingo, L. (2024). A hybrid quantum-classical fusion neural network to improve protein-ligand binding affinity predictions for drug discovery. arXiv.org. https://arxiv.org/abs/2309.03919v3
Summary:
The study proposes a hybrid quantum-classical neural network that combines 3D and spatial graph convolutional neural networks within a quantum architecture. This model improves protein-ligand binding affinity prediction by 6% over classical approaches, offering more stable convergence and enhanced performance, showcasing the potential of quantum-classical integration in drug discovery applications.

Klaus, H. (2024). Hybrid Quantum-Classical Machine Learning for Drug Discovery. EasyChair Preprint. https://easychair.org/publications/preprint/rK4k
Summary:
This research explores hybrid quantum-classical machine learning techniques to accelerate drug candidate identification and optimization. Combining quantum computing for molecular simulations with classical algorithms for data analysis, the study aims to overcome limitations in current methods, advancing drug discovery by improving molecular property predictions and interaction simulations.