Workplace-Based Entrustment Scales for the Core EPAs: A Multisite Comparison of Validity Evidence for Two Proposed Instruments Using Structured Vignettes and Trained Raters

Citation:

Michael S Ryan, Asra R Khan, Yoon Soo Park, Cody Chastain, Carrie Phillipi, Sally A Santen, Beth A Barron, Vivian Obeso, and Sandra L Yingling. 2021. “Workplace-Based Entrustment Scales for the Core EPAs: A Multisite Comparison of Validity Evidence for Two Proposed Instruments Using Structured Vignettes and Trained Raters.” Acad Med.

Abstract:

PURPOSE: In undergraduate medical education (UME), competency-based medical education has been operationalized through the 13 Core Entrustable Professional Activities for Entering Residency (Core EPAs). Direct observation in the workplace using rigorous, valid, reliable measures is required to inform summative decisions about graduates' readiness for residency. The purpose of this study was to investigate the validity evidence for two proposed workplace-based entrustment scales.

METHOD: The authors of this multisite, randomized, experimental study used structured vignettes and experienced raters to examine validity evidence for the Ottawa scale and the UME supervisory tool (Chen scale) in 2019. The authors used a series of 8 cases (6 developed de novo) depicting learners at pre-entrustable (less-developed) and entrustable (more-developed) skill levels across 5 Core EPAs. Participants from Core EPA pilot institutions rated learner performance using either the Ottawa or the Chen scale. The authors used descriptive statistics and analysis of variance to examine data trends and compare ratings, conducted interrater reliability and generalizability studies to evaluate consistency among participants, and performed a content analysis of narrative comments.

RESULTS: Fifty clinician-educators from 10 institutions participated, yielding 579 discrete EPA assessments. Both the Ottawa and Chen scales differentiated between less- and more-developed skill levels (P < .001). The intraclass correlation was good to excellent for all EPAs using the Ottawa scale (range = .68-.91) and fair to excellent using the Chen scale (range = .54-.83). Generalizability analysis revealed substantial variance in ratings attributable to the learner-EPA interaction (59.6% for Ottawa; 48.9% for Chen), suggesting that variability in ratings was appropriately associated with performance on individual EPAs.

CONCLUSIONS: In a structured setting, both the Ottawa and Chen scales distinguished between pre-entrustable and entrustable learners; however, the Ottawa scale demonstrated more desirable characteristics. These findings represent a critical step forward in developing valid, reliable instruments to measure learner progression toward entrustment for the Core EPAs.
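The reliability figures reported above are intraclass correlations computed over raters scoring the same vignette performances. As a rough illustration only (not the authors' actual code, data, or chosen ICC form), a two-way random-effects, absolute-agreement, single-rater ICC(2,1) can be computed from a learners-by-raters matrix of ratings:

```python
import numpy as np

def icc_2_1(ratings):
    """Illustrative ICC(2,1): two-way random effects, absolute agreement,
    single rater. `ratings` is a 2-D array: rows = targets (learner
    performances), columns = raters."""
    ratings = np.asarray(ratings, dtype=float)
    n, k = ratings.shape
    grand = ratings.mean()
    row_means = ratings.mean(axis=1)   # per-learner means
    col_means = ratings.mean(axis=0)   # per-rater means

    # Partition total sum of squares into target, rater, and residual parts.
    ss_rows = k * np.sum((row_means - grand) ** 2)
    ss_cols = n * np.sum((col_means - grand) ** 2)
    ss_total = np.sum((ratings - grand) ** 2)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)            # mean square, targets
    msc = ss_cols / (k - 1)            # mean square, raters
    mse = ss_err / ((n - 1) * (k - 1)) # residual mean square

    # Shrout-Fleiss ICC(2,1) formula.
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Hypothetical data: 4 performances scored by 3 raters on an ordinal scale.
ratings = [[1, 1, 2],
           [4, 4, 5],
           [7, 8, 8],
           [2, 2, 3]]
print(round(icc_2_1(ratings), 2))
```

Raters who rank and score performances consistently, as in the hypothetical matrix above, yield an ICC near 1; values in the .68-.91 range reported for the Ottawa scale indicate good-to-excellent agreement under conventional benchmarks.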