Research Proposal

Project Title: 
Evidence-generation for biologics in pediatric studies
Scientific Abstract: 

Background: With the emergence of biologics over the past 15 years, substantial advances have been made in the treatment of a number of pediatric diseases. However, evidence-generation for these new therapies is challenging due to issues in conducting randomized clinical trials (RCT) in children. New statistical tools to improve the efficiency and feasibility of evidence-generation in pediatric studies are needed.
Objective: The project aims to develop new statistical tools to evaluate the efficacy of biologics in pediatric populations by leveraging both RCT and observational data.
Study Design: We will use data from requested RCTs and electronic health record (EHR) data from local clinical institutions to develop and validate new methods. Specifically, we will develop new statistical tools to (i) identify and evaluate early treatment endpoints supporting shorter duration of RCTs in pediatric populations; (ii) project treatment effects on pediatric populations based on the relevant EHR data and RCTs conducted in adults.
Participants: All enrolled patients in the requested trials.
Main Outcome Measures: For each requested pediatric study, we will report (i) the identified early treatment endpoint and the proportion of treatment effect the identified early treatment endpoint can explain; (ii) the treatment effect in children projected from the relevant EHR data and RCTs conducted in adults.
Statistical Analysis: We will develop new statistical methods for evidence-generation in pediatric studies, and apply and validate the proposed methods using the requested studies.

Brief Project Background and Statement of Project Significance: 

The impact of pediatric drug therapies has dramatically changed over the past 15 years with the emergence of biologics. For example, there is a growing interest in the development and use of anti-tumor necrosis factor (anti-TNF) therapy in children (McCluggage, 2011). Whereas treatment used to aim for reduction of symptoms in certain conditions such as inflammatory bowel disease, anti-TNF therapy can help heal the mucosa, eliminate symptoms, and modify the natural course of the disease. Anti-TNF therapy is of potential benefit in many pediatric diseases. In 2014, the U.S. Food and Drug Administration (FDA) approved adalimumab, an anti-TNF agent, for the treatment of pediatric patients with Crohn’s disease. This approval followed the initial approval for adult patients in 2007 and was based on follow-up clinical studies in 2012 confirming the safety and efficacy of adalimumab in pediatric patients (Patel et al., 2016). More generally, substantial advances have been made in pharmacological therapy in pediatric populations: Rose (2019) reported that the FDA reviewed 130 pediatric drug therapies from 2007 to 2011.
How to efficiently evaluate the safety and efficacy of new therapies such as biologics in pediatric patients is a critical question with substantial impacts on the drug development and regulatory process for children. Most biologic therapies used in pediatric populations, including adalimumab, are initially studied in adults, and often lack sufficient evidence to support their efficacy in pediatric patients. For example, infliximab, another anti-TNF therapy, was FDA-approved for use in adult patients with rheumatoid arthritis in 1999. Since then, despite some evidence indicating it is efficacious and safe in juvenile rheumatoid arthritis (JRA) patients, infliximab has still not been approved by the FDA for use in JRA due to a lack of sufficient evidence (Stoll and Cron, 2014).
Although randomized controlled trials (RCTs) remain the gold standard for drug evidence-generation, relying solely on RCTs to evaluate drug safety and efficacy is often not feasible for pediatric populations due to a number of disincentives and ethical challenges (McMahon and Dal Pan, 2018). Barriers to conducting pediatric trials include small patient populations with slow and costly trial accrual, liability and complex ethical issues related to testing products in vulnerable patient populations, practical challenges in obtaining consent and conducting trials in children (e.g., the need for pediatric drug formulations), and a lack of validated pediatric assessment tools and clinical endpoints. In addition, in children, long-term follow-up is often required to assess treatment effects across multiple stages of development and to measure adverse events related to growth and development. This type of follow-up is often impractical and costly. As a result, there are substantial gaps in evidence on the safety and efficacy of many newly developed biologic drugs in children, as in the example of infliximab.

Specific Aims of the Project: 

The specific aims of the project are:
(i) to identify early treatment endpoints supporting shorter duration of RCTs in pediatric populations and quantify the extent to which these endpoints can approximate gold standard long-term endpoints; and
(ii) to evaluate the feasibility of projecting treatment effects on pediatric populations based on EHR data and RCTs conducted in adult populations.
The overarching goal is to develop new statistical tools to make evidence-generation for biologics in pediatric populations more efficient and feasible and validate the proposed tools using the RCT data requested through YODA and EHR data from Boston Children’s Hospital (BCH).

What is the purpose of the analysis being proposed? Please select all that apply.: 
New research question to examine treatment effectiveness on secondary endpoints and/or within subgroup populations
Confirm or validate previously conducted research on treatment effectiveness
Develop or refine statistical methods
Research on clinical trial methods
Software Used: 
Data Source and Inclusion/Exclusion Criteria to be used to define the patient sample for your study: 

We will request pediatric RCTs studying biologics and the relevant adult RCTs through YODA. For each RCT we request, we aim to use individual-level participant data (IPD) with demographic and clinical baseline information, treatment information, and clinical outcomes including treatment response. For example, we wish to collect all the subcomponents of the American College of Rheumatology (ACR) score in the requested JRA studies so that we can easily change from the ACR score to other outcome measures used in JRA, such as the Juvenile Arthritis Disease Activity Score (JADAS), in our study if necessary.
In our study, we will also include EHR data for the observational pediatric cohort relevant to the pediatric populations studied in the requested RCTs and treated at Boston Children’s Hospital (BCH). These data will be obtained separately through our local institutions.
The YODA data and the EHR data may be stored on their own respective servers. To integrate information from the two data sources, we will derive summary-level data from YODA, such as regression coefficients and a predicted treatment response curve given a propensity score of response, as detailed in the statistical plan.

Main Outcome Measure and how it will be categorized/defined for your study: 

For each pediatric study of biologics we request, the main outcome measures will be:
1. the identified early treatment endpoint and the proportion of treatment effect on the gold standard long-term endpoint the identified early treatment endpoint can explain (PTE);
2. the projected treatment effect based on the relevant EHR data and potentially RCTs conducted in adults.

Main Predictor/Independent Variable and how it will be categorized/defined for your study: 


Other Variables of Interest that will be used in your analysis and how they will be categorized/defined for your study: 


Statistical Analysis Plan: 

We will develop statistical methods to make evidence-generation of biologics in pediatric studies more feasible and efficient.
Objective 1. To identify and evaluate early treatment endpoints supporting shorter duration of pediatric RCTs, we will develop robust model-free statistical methods to identify surrogate endpoints and quantify their degree of surrogacy. Surrogate endpoints to be considered include both qualitative measurements of treatment response and event time outcomes such as progression-free survival at an earlier time (Wang et al., 2019). Candidate surrogate endpoints will be selected in collaboration with a pediatric rheumatologist. We will develop methods that allow us to estimate the degree of surrogacy using EHR data, and validate the proposed methods using the long-term endpoints in RCT data for the pediatric studies of biologics we are requesting.
The proposed analysis does not require individual-level data transportation across servers where RCT data and EHR data are stored. In particular, we will only use EHR data to identify the surrogate endpoint, and only use RCT data for validation.
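To make the proportion of treatment effect explained (PTE) concrete, the following is a minimal Python sketch for a discrete early endpoint. It is an illustration under stated assumptions, not the model-free estimator of Wang et al. (2019): the function name `proportion_explained` is hypothetical, the inputs are assumed to come from a randomized comparison, and the residual treatment effect is taken to be the effect that would remain if the treated arm had the control arm's distribution of the early endpoint.

```python
import numpy as np


def proportion_explained(y, s, a):
    """Illustrative PTE for a discrete early endpoint (hypothetical helper).

    y : long-term outcome, s : discrete early endpoint, a : 1 = treated, 0 = control.
    Assumes every level of s seen in the control arm is also seen in the treated arm.
    """
    y, s, a = np.asarray(y, dtype=float), np.asarray(s), np.asarray(a)
    treated, control = a == 1, a == 0

    # Overall treatment effect on the long-term outcome.
    delta = y[treated].mean() - y[control].mean()

    # Residual effect: arm difference within each level of s,
    # averaged over the control-arm distribution of s.
    delta_s = 0.0
    for level in np.unique(s[control]):
        p0 = np.mean(s[control] == level)
        mu1 = y[treated & (s == level)].mean()
        mu0 = y[control & (s == level)].mean()
        delta_s += (mu1 - mu0) * p0

    return delta, delta_s, 1.0 - delta_s / delta
```

A PTE near 1 indicates that the early endpoint captures nearly all of the treatment effect on the long-term outcome, and a PTE near 0 indicates that it captures almost none. A continuous early endpoint would require smoothing rather than the exact stratification used here.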
Objective 2. To evaluate the feasibility of projecting treatment effects on pediatric populations based on EHR data and RCTs conducted in adults, we will develop transfer learning methods to predict causal treatment effects for pediatric populations. In particular, we aim to first use observational EHR data to derive a model for predicting how patient characteristics such as age, gender, disease severity measures, and comorbidities affect the treatment effect. We will use the model to derive a scoring system that assigns patients to subgroups with potentially different levels of treatment benefit. We will then develop a robust causal inference procedure to infer causal treatment effects for each subgroup using EHR data by modeling how covariates affect both the propensity score and the outcome within each subgroup. The same scoring system will be applied to the adult RCT data to estimate the subgroup-specific causal treatment effect. The estimated subgroup-specific treatment effects from EHR and from RCT data will be combined to produce a final estimate of the treatment effect for a target pediatric population with a specific distribution of the baseline characteristics (Zhang et al., 2016; Elze et al., 2017). We will validate the proposed estimate, obtained via data integration and transfer learning, by assessing the consistency between the projected treatment effect from our method and the treatment effect estimated from the gold standard RCTs for the pediatric studies we request.
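One widely used estimator that models both the propensity score and the outcome is the augmented inverse probability weighted (AIPW, or doubly robust) estimator. The sketch below shows how such a step could look within a single subgroup. It is an assumption for illustration, not the procedure this proposal will develop: the function name `aipw_effect` is hypothetical, and the fitted propensity scores `ps` and outcome predictions `m1`, `m0` are assumed to be supplied by models fitted elsewhere.

```python
import numpy as np


def aipw_effect(y, a, ps, m1, m0):
    """Doubly robust (AIPW) average treatment effect within one subgroup.

    y  : observed outcome
    a  : treatment indicator (1 = treated, 0 = untreated)
    ps : fitted propensity scores P(A = 1 | X), assumed strictly between 0 and 1
    m1 : fitted outcome predictions under treatment, E[Y | A = 1, X]
    m0 : fitted outcome predictions under no treatment, E[Y | A = 0, X]
    """
    y, a, ps, m1, m0 = (np.asarray(v, dtype=float) for v in (y, a, ps, m1, m0))

    # Outcome-model contrast plus inverse-probability-weighted residual corrections.
    psi = (
        m1 - m0
        + a * (y - m1) / ps
        - (1.0 - a) * (y - m0) / (1.0 - ps)
    )
    estimate = psi.mean()
    # Simple plug-in standard error based on the estimated influence function.
    std_error = psi.std(ddof=1) / np.sqrt(len(psi))
    return estimate, std_error
```

The estimate remains consistent if either the propensity score model or the outcome model is correctly specified, which is one reason this form is often preferred for observational data such as EHRs.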
The above proposed analysis does not require sharing individual-level data between RCTs and EHR. The model development of the scoring system only uses EHR data and the estimated model coefficients will be uploaded to the platform where RCT data are stored. The subgroup specific treatment effect estimate takes a form of a univariate function that maps a univariate score to an estimated treatment effect. This estimated function, which is a summary-level result, will be derived from RCT and then sent to the EHR data site. Similarly, for the final validation analyses, only estimated functions and model parameters will be transported between platforms.
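The summary-level combination step can be sketched as follows. This is a minimal illustration that assumes a combination rule the proposal does not specify: subgroup-specific estimates from the EHR and RCT sources are pooled by inverse-variance weighting and then averaged over the subgroup proportions of the target pediatric population. The function name `project_effect` and all argument names are hypothetical, and the standard error assumes independent subgroup estimates and fixed target proportions.

```python
import numpy as np


def project_effect(ehr_est, ehr_se, rct_est, rct_se, target_props):
    """Combine subgroup-level summaries into one projected treatment effect.

    Each argument is a sequence with one entry per score-defined subgroup.
    Only summary statistics are used; no individual-level data are needed.
    """
    ehr_est, ehr_se, rct_est, rct_se, target_props = (
        np.asarray(v, dtype=float)
        for v in (ehr_est, ehr_se, rct_est, rct_se, target_props)
    )

    # Inverse-variance weights for the two sources (assumed combination rule).
    w_ehr = 1.0 / ehr_se**2
    w_rct = 1.0 / rct_se**2
    combined = (w_ehr * ehr_est + w_rct * rct_est) / (w_ehr + w_rct)
    combined_var = 1.0 / (w_ehr + w_rct)

    # Standardize to the target population's subgroup distribution.
    props = target_props / target_props.sum()
    effect = np.sum(props * combined)
    std_error = np.sqrt(np.sum(props**2 * combined_var))
    return effect, std_error
```

Because the inputs are per-subgroup estimates and standard errors, each site can compute its own summaries locally and exchange only those values, consistent with the data-sharing constraints described above.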

Narrative Summary: 

The project aims to develop new statistical tools to support evaluations of the efficacy of biologics in pediatric patients and make evidence-generation in pediatric studies more efficient and feasible.

Project Timeline: 

The project is expected to be completed in two years:
1. Months 1-4: data collection
2. Months 5-18: data analysis and method development
3. Months 9-24: Manuscript preparation and publication (2-3 anticipated publications)
We will share all manuscripts generated using YODA data at the time of submission with the YODA project team.

Dissemination Plan: 

Our work will be disseminated through 2-3 scientific publications in statistics and medicine, such as the Journal of the American Statistical Association and JAMA Pediatrics. We will also present the work at national conferences, such as the Joint Statistical Meetings. Statistical software for implementing the proposed methodologies will also be distributed to the research community. In addition, we plan to engage with a number of stakeholders in pediatric studies throughout the project, including pharmaceutical companies, regulatory agencies, and patient stakeholders, to both inform our work and develop work products addressing the needs of these specific groups. For example, in collaboration with the FDA, our methods could contribute to regulatory guidance on how to accelerate drug development for pediatric populations using the proposed, more efficient and feasible evidence-generation procedure.


1. McCluggage, L. K. (2011). Safety of TNF inhibitors in adolescents and children. Adolescent health, medicine and therapeutics, 2, 1.
2. Patel, A. S., Suarez, L. D., & Rosh, J. R. (2016). Adalimumab in pediatric Crohn's disease. Immunotherapy, 8(2), 127-133.
3. Rose, K. (2019). Challenges in Pediatric Drug Development. Pediatric Drugs, 11(1), 57-59.
4. Stoll, M. L., & Cron, R. Q. (2014). Treatment of juvenile idiopathic arthritis: a revolution in care. Pediatric rheumatology, 12(1), 13.
5. McMahon, A. W., & Dal Pan, G. (2018). Assessing drug safety in children—the role of real-world data. The New England journal of medicine, 378(23), 2155.
6. Wang, X., Parast, L., Tian, L., & Cai, T. (2019). Model-free approach to quantifying the proportion of treatment effect explained by a surrogate marker. Biometrika, accepted.
7. Zhang, Z., Nie, L., Soon, G., & Hu, Z. (2016). New methods for treatment effect calibration, with applications to non‐inferiority trials. Biometrics, 72(1), 20-29.
8. Elze, M. C., Gregson, J., Baber, U., Williamson, E., Sartori, S., Mehran, R., ... & Pocock, S. J. (2017). Comparison of propensity score methods and covariate adjustment: evaluation in 4 cardiovascular studies. Journal of the American College of Cardiology, 69(3), 345-357.

General Information

How did you learn about the YODA Project?: 

Request Clinical Trials

Associated Trial(s): 
What type of data are you looking for?: 
Individual Participant-Level Data, which includes Full CSR and all supporting documentation
