Health information growth has created unprecedented opportunities to evaluate the effectiveness of therapies in large and broadly representative patient populations. Such evaluations are critical to understanding whether newly approved treatments, found to be efficacious in clinical trials, are safe and effective when applied in routine care. Extracting sound evidence from large observational data is now at the forefront of health care policy decisions. For instance, movement away from a strict biomedical perspective to a broader one for coverage of new medical technologies has resulted in increasing use of coverage with evidence development, a mechanism that links financial support for newly approved medical technologies to evidence regarding those most likely to benefit.

Statistical methods for the estimation of causal effects in large observational data are numerous, but substantial analytical challenges remain, especially when the number of patient-level covariates (p) is very large. First, large administrative data now provide novel opportunities to study treatment effect heterogeneity; however, statistical methods for estimating average causal effects in sub-populations of observational data are scarce and underdeveloped (Aims 1 and 2). Second, rigorous methods for estimating causal effects rely on choosing the right approach for confounding adjustment, a daunting and unresolved task when dealing with a very large p and treatment effect heterogeneity (Aim 1). Third, selective inference – the process of drawing inference on a subset of parameters selected because they seem interesting after viewing the data – is common with big data and can hamper the replicability of findings (Aim 1). Fourth, instrumental variables methods often assume constant treatment effects, an assumption unlikely to hold in many health services applications, and reliance on a single causal parameter when unmeasured confounders moderate treatment effects is limiting (Aim 2).
We have three goals: a) development of new statistical methods that are scalable to very large p and will overcome the limitations listed above (Aims 1 and 2); b) creation of new knowledge in comparative effectiveness research (CER) for cardiovascular disease and cancer (Aim 3); and c) dissemination of methods with open source software and reproducible CER analyses (Aim 4). New methods will be validated with theoretical arguments and simulation studies. Our ongoing interdisciplinary collaborations, which provide a wealth of clinical knowledge and access to contemporary clinical registry, administrative, and clinical trial data, supply the substantive questions that motivate our methods development. As part of this proposal, we will also address pressing CER questions (Aim 3), providing tools so others can replicate findings (Aim 4). The Specific Aims are:
Aim 1: To develop new Bayesian methods for causal inference in large observational data to: 1) estimate average causal effects accounting for model uncertainty in the specification of measured confounders; and 2) characterize heterogeneous treatment effects by estimating subgroup-specific causal effects while accounting for uncertainty in the subgroup identification. The new approaches will not condition on a single model, but rather average across multiple model specifications according to empirical support for confounding adjustment and existence of heterogeneous effects.
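To illustrate the idea of averaging over model specifications rather than conditioning on one, the following is a minimal, hypothetical sketch (not the proposal's actual method): candidate confounder-adjustment models are weighted by approximate posterior model probabilities (here, BIC-based weights), and the treatment-effect estimate is averaged across them. All variable names and the simulated data are assumptions for illustration.

```python
import numpy as np

# Simulated data: x1 is a true confounder, x2 is irrelevant; true effect = 1.0.
rng = np.random.default_rng(0)
n = 2000
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
t = (0.8 * x1 + rng.normal(size=n) > 0).astype(float)   # treatment
y = 1.0 * t + 1.5 * x1 + rng.normal(size=n)             # outcome

def ols(X, y):
    """Least squares fit; returns coefficients and residual sum of squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, float(np.sum((y - X @ beta) ** 2))

# Candidate adjustment sets: none, {x1}, {x2}, {x1, x2}.
candidates = {
    "none":  np.column_stack([np.ones(n), t]),
    "x1":    np.column_stack([np.ones(n), t, x1]),
    "x2":    np.column_stack([np.ones(n), t, x2]),
    "x1+x2": np.column_stack([np.ones(n), t, x1, x2]),
}

bics, effects = [], []
for X in candidates.values():
    beta, rss = ols(X, y)
    bics.append(n * np.log(rss / n) + X.shape[1] * np.log(n))  # BIC up to a constant
    effects.append(beta[1])                                     # coefficient on t

# Approximate posterior model weights: w_m proportional to exp(-BIC_m / 2).
b = np.array(bics)
w = np.exp(-(b - b.min()) / 2)
w /= w.sum()
ate_bma = float(np.sum(w * np.array(effects)))
print(f"model-averaged treatment effect: {ate_bma:.2f}")
```

In this simulation, the models that adjust for the true confounder x1 receive nearly all of the weight, so the averaged estimate is close to the true effect, while the unadjusted model's biased estimate is effectively discounted.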
Aim 2: To develop new Bayesian methods for assessing treatment effects in the presence of unmeasured confounders that moderate treatment effects. Using instrumental variables, we will 1) identify the distributions of essential causal parameters (e.g., average, local average, and marginal treatment effects) and 2) link these parameters to population subgroups by systematically relaxing selection bias assumptions. Our methods will provide a unified framework for estimation of key causal parameters and for empirically assessing assumptions required for inference.
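As a simple illustration of one causal parameter named above, the local average treatment effect (LATE) identified by a valid binary instrument can be estimated with the Wald ratio. The sketch below is a hypothetical simulation (not the proposal's method): an unmeasured confounder u biases the naive treated-versus-untreated comparison, while the instrument-based estimate recovers the true effect under the usual IV assumptions (exogeneity, relevance, monotonicity).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50000
u = rng.normal(size=n)                          # unmeasured confounder
z = rng.integers(0, 2, size=n).astype(float)    # randomized binary instrument
# Take-up increases with z (monotonicity) and with u (confounding).
d = (0.9 * z + 0.5 * u + rng.normal(size=n) > 0.5).astype(float)
y = 2.0 * d + 1.0 * u + rng.normal(size=n)      # true treatment effect = 2.0

# Naive comparison: biased because u raises both take-up and outcomes.
naive = y[d == 1].mean() - y[d == 0].mean()

# Wald ratio: reduced-form effect of z on y over first-stage effect of z on d.
wald = (y[z == 1].mean() - y[z == 0].mean()) / (d[z == 1].mean() - d[z == 0].mean())
print(f"naive: {naive:.2f}  IV (Wald) LATE: {wald:.2f}")
```

With a constant treatment effect the Wald ratio targets 2.0; when effects are heterogeneous it instead recovers the average effect among compliers, which is precisely why linking such parameters to identifiable population subgroups, as Aim 2 proposes, matters.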
Aim 3: To apply the new methods to three observational studies and to randomized trial data to provide new and fully reproducible knowledge in the areas of new medical devices, surgical procedures, and pharmaceutical treatments in cardiovascular disease and cancer.
Aim 4: To develop flexible, efficient, robust, well-documented, user-friendly R libraries and SAS macros so newly developed methods can be disseminated and CER findings can be reproduced by others.
Our new methods, their applications to large administrative and clinical registry data, and their dissemination will allow the entire research community to address modern CER questions with the highest methodological rigor. These methods will permit routine identification of population subgroups that are most likely to benefit from interventions, and therefore will have a direct and sustained impact on policy making and clinical practice.