Evaluating the use of bootstrapping in cohort studies conducted with 1:1 propensity score matching-A plasmode simulation study

Citation:

Desai RJ, Wyss R, Abdia Y, Toh S, Johnson M, Lee H, Karami S, Major JM, Nguyen M, Wang SV, et al. Evaluating the use of bootstrapping in cohort studies conducted with 1:1 propensity score matching-A plasmode simulation study. Pharmacoepidemiol Drug Saf. 2019;28(6):879-886.

Date Published:

2019 Jun

Abstract:

PURPOSE: Bootstrapping can account for uncertainty in propensity score (PS) estimation and matching processes in 1:1 PS-matched cohort studies. While theory suggests that the classical bootstrap can fail to produce proper coverage, the practical impact of this theoretical limitation in settings typical of pharmacoepidemiology is not well studied. METHODS: In a plasmode-based simulation study, we compared the performance of the standard parametric approach, which ignores uncertainty in PS estimation and matching, with two bootstrapping methods. The first method accounted only for uncertainty introduced during the matching process (the observation resampling approach). The second method accounted for uncertainty introduced during both the PS estimation and matching processes (the PS reestimation approach). Variance was estimated based on percentile and empirical standard errors, and treatment effect estimation was based on the median and mean of the estimated treatment effects across 1000 bootstrap resamples. Two treatment prevalence scenarios (5% and 29%) across two treatment effect scenarios (hazard ratio of 1.0 and 2.0) were evaluated in 500 simulated cohorts of 10 000 patients each. RESULTS: We observed that 95% confidence intervals from the bootstrapping approaches, but not from the standard approach, resulted in inaccurate coverage rates (98%-100% for the observation resampling approach, 99%-100% for the PS reestimation approach, and 95%-96% for the standard approach). Treatment effect estimation based on the bootstrapping approaches resulted in lower bias than the standard approach (less than 1.4% vs 4.1%) at 5% treatment prevalence; however, performance was equivalent at 29% treatment prevalence. CONCLUSION: Use of bootstrapping led to variance overestimation and inconsistent coverage, while coverage remained more consistent with parametric estimation.
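
The summary step described in the methods (percentile and empirical standard error intervals, mean and median point estimates, and coverage across simulated cohorts) can be illustrated with a minimal Python sketch. It is not the authors' code. The function names `summarize_bootstrap` and `coverage` are hypothetical, and the sketch assumes that the bootstrap estimates are supplied as log hazard ratios, that averaging is done on the log scale, and that the empirical standard error interval is a normal-approximation interval centered on the bootstrap mean. The abstract does not specify any of these details. The PS estimation, 1:1 matching, and outcome model that would produce each bootstrap estimate are left out.

```python
import math
import random
import statistics


def summarize_bootstrap(log_hr_estimates, alpha=0.05):
    """Summarize bootstrap log hazard ratios (hypothetical helper).

    Assumption: inputs are on the log scale and are averaged there.
    """
    est = sorted(log_hr_estimates)
    n = len(est)
    if n < 2:
        raise ValueError("need at least two bootstrap estimates")

    def percentile(p):
        # Linear interpolation between neighbouring order statistics.
        pos = p * (n - 1)
        lo, hi = math.floor(pos), math.ceil(pos)
        return est[lo] + (est[hi] - est[lo]) * (pos - lo)

    mean_est = statistics.fmean(est)
    median_est = statistics.median(est)
    # Empirical standard error: standard deviation of the bootstrap estimates.
    se = statistics.stdev(est)
    z = statistics.NormalDist().inv_cdf(1 - alpha / 2)

    return {
        "hr_mean": math.exp(mean_est),
        "hr_median": math.exp(median_est),
        "ci_percentile": (
            math.exp(percentile(alpha / 2)),
            math.exp(percentile(1 - alpha / 2)),
        ),
        # Assumption: normal-approximation interval centered on the mean.
        "ci_empirical_se": (
            math.exp(mean_est - z * se),
            math.exp(mean_est + z * se),
        ),
    }


def coverage(intervals, true_hr):
    """Fraction of (lower, upper) intervals that contain the true hazard ratio."""
    hits = sum(1 for lower, upper in intervals if lower <= true_hr <= upper)
    return hits / len(intervals)


if __name__ == "__main__":
    # Synthetic stand-in for 1000 bootstrap estimates; not data from the study.
    rng = random.Random(0)
    fake_estimates = [rng.gauss(math.log(2.0), 0.1) for _ in range(1000)]
    print(summarize_bootstrap(fake_estimates))
```

In a simulation like the one described, `summarize_bootstrap` would be called once per simulated cohort, and `coverage` would then be applied to the resulting intervals with the true hazard ratio (1.0 or 2.0) to obtain a coverage rate comparable to the nominal 95%.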