Discriminating quality of hospital care in the United States

Citation:

Sharon-Lise T Normand, Robert E Wolf, and Barbara J McNeil. 2008. “Discriminating quality of hospital care in the United States.” Med Decis Making, 28, 3, Pp. 308-22.

Abstract:

BACKGROUND AND OBJECTIVE: The Centers for Medicare and Medicaid Services (CMS) report quality of care for patients hospitalized with acute myocardial infarction (AMI), congestive heart failure (CHF), and community-acquired pneumonia (CAP) with the intention of rewarding superior performing hospitals. The aim of the study was to compare identification of superior hospitals for providing financial rewards using 2 different scoring systems: a latent score that weights individual clinical performance measures according to how well each discriminated hospital quality and a raw sum score (the system adopted by CMS).

METHODS: This observational cohort study used 2761 acute care hospitals in the United States reporting AMI clinical performance measures, 3271 reporting CHF measures, and 3714 hospitals reporting CAP measures. For each clinical condition, the main outcome measures included the average raw sum score, the latent score estimated from an item response theory (IRT) model, and the percentage of false negative superior designations made on the basis of raw sum scores relative to latent scores.

RESULTS: The average raw sum score was highest for AMI (88.8%) and lower for CHF (73.1%) and CAP (76.3%). AMI measures were equally nondiscriminating of hospital quality; hospital discharge instruction was most discriminating of CHF quality; pneumococcal vaccination was most discriminating of CAP quality. False negative rates varied 2-fold: AMI (10%), CHF (16%), and CAP (24%).

CONCLUSIONS: Neither the AMI raw sum score nor latent score discriminates hospital quality due to ceiling effects. Current methods for aggregating measures result in different hospital superior designations than those based on the latent score. Organizations that financially reward hospitals on the basis of such scores need to assess predictive validity of scores and determine a minimum level of classification accuracy.
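
As a rough illustration of the comparison described in the Methods, the Python sketch below computes a raw sum score and a false negative rate. It is not the authors' code. Three things are assumptions made for illustration: the raw sum score is taken to be the pooled share of care opportunities met across measures, a hospital is called "superior" when it falls in a chosen top fraction of scores, and the false negative rate is taken over hospitals that are superior under the reference score. All function names and toy numbers are hypothetical, and the latent scores are supplied as given rather than estimated from an IRT model.

```python
def raw_sum_score(numerators, denominators):
    """Pooled share of care opportunities met across measures (assumed definition)."""
    total = sum(denominators)
    if total == 0:
        raise ValueError("hospital has no eligible patients")
    return sum(numerators) / total


def top_fraction(scores, fraction):
    """Return the ids of hospitals in the top `fraction` of a {id: score} mapping."""
    k = max(1, round(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def false_negative_rate(raw_scores, reference_scores, fraction=0.1):
    """Share of reference-superior hospitals that the raw score fails to designate.

    `fraction` is a hypothetical cut-off, not a value taken from the study.
    """
    superior_ref = top_fraction(reference_scores, fraction)
    superior_raw = top_fraction(raw_scores, fraction)
    missed = superior_ref - superior_raw
    return len(missed) / len(superior_ref)


# Toy numbers for illustration only; they are not data from the study.
raw = {"A": 0.9, "B": 0.8, "C": 0.7, "D": 0.6}
latent = {"A": 0.5, "B": 1.2, "C": 1.5, "D": -0.3}
rate = false_negative_rate(raw, latent, fraction=0.5)
```

In this toy case hospitals B and C are superior under the reference score, while the raw score designates A and B, so one of the two reference-superior hospitals is missed.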