Postapproval drug safety studies often use propensity scores (PSs) to adjust for a large number of baseline confounders. These studies may also examine whether treatment safety varies across subgroups. A PS can be used to adjust for confounding in subgroup analyses in many ways, and the trade-offs among these methods are not well understood. We conducted a plasmode simulation to compare the relative performance of 5 methods involving PS matching for subgroup analysis, including methods frequently used in the applied literature whose performance has not previously been directly compared. The methods varied as to whether the overall PS, a subgroup-specific PS, or no rematching was used in the subgroup analysis, as well as whether subgroups were fully nested within the main analytical cohort. The evaluated PS subgroup matching methods performed similarly in terms of balance, bias, and precision in 12 simulated scenarios that varied the size of the cohort, the prevalence of exposure and outcome, the strength of relationships between baseline covariates and exposure, the true effect within subgroups, and the degree of confounding within subgroups. Each method had strengths and limitations with respect to other performance metrics that could inform the choice of method.
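
To make the distinction between "no rematching" and "rematching within the subgroup" concrete, the following minimal Python sketch contrasts the two on a toy cohort. It is an illustration only and is not the implementation evaluated in the study: the record layout, the function names (`greedy_match`, `subgroup_no_rematch`, `subgroup_rematch`), the caliper of 0.05, and the PS values are all assumptions made for the example, and the overall PS is taken as already estimated. A subgroup-specific PS would additionally require refitting the PS model within each subgroup, which is not shown.

```python
from typing import List, Set, Tuple

# Hypothetical record layout: (patient id, treated flag, subgroup label, overall PS).
Record = Tuple[str, int, str, float]


def greedy_match(records: List[Record], caliper: float = 0.05) -> List[Tuple[str, str]]:
    """Greedy 1:1 nearest-neighbor matching on the PS, without replacement."""
    treated = sorted((r for r in records if r[1] == 1), key=lambda r: r[3])
    controls = [r for r in records if r[1] == 0]
    pairs = []
    for t in treated:
        if not controls:
            break
        # Closest remaining control on the PS scale.
        best = min(controls, key=lambda c: abs(c[3] - t[3]))
        if abs(best[3] - t[3]) <= caliper:
            pairs.append((t[0], best[0]))
            controls.remove(best)  # each control is used at most once
    return pairs


def subgroup_no_rematch(records: List[Record], subgroup: str,
                        caliper: float = 0.05) -> Set[str]:
    """Match once in the full cohort, then keep matched patients in the subgroup."""
    label = {r[0]: r[2] for r in records}
    kept = set()
    for pair in greedy_match(records, caliper):
        for pid in pair:
            if label[pid] == subgroup:
                kept.add(pid)
    return kept


def subgroup_rematch(records: List[Record], subgroup: str,
                     caliper: float = 0.05) -> List[Tuple[str, str]]:
    """Restrict to the subgroup first, then rematch on the overall PS."""
    within = [r for r in records if r[2] == subgroup]
    return greedy_match(within, caliper)


# Toy cohort with made-up PS values (illustrative only, not study data).
cohort: List[Record] = [
    ("t1", 1, "A", 0.30),
    ("t2", 1, "B", 0.60),
    ("c1", 0, "B", 0.31),
    ("c2", 0, "A", 0.33),
    ("c3", 0, "B", 0.61),
]

print(greedy_match(cohort))
print(sorted(subgroup_no_rematch(cohort, "A")))
print(subgroup_rematch(cohort, "A"))
```

In this toy cohort, the treated patient `t1` in subgroup A is matched to a control from subgroup B in the full-cohort match, so restricting to subgroup A without rematching leaves `t1` without its matched control. Rematching within subgroup A pairs `t1` with `c2` instead. This is one way that matched sets can differ between approaches, and it is offered only as intuition for the design choices compared in the simulation.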