Publications by Year: 2009

2009
J. Chhatwal, O. Alagoz, M. J. Lindstrom, C. E. Kahn Jr., K. A. Shaffer, and E. S. Burnside. 2009. “A logistic regression model based on the national mammography database format to aid breast cancer diagnosis.” AJR Am J Roentgenol, 192, Pp. 1117-27.
OBJECTIVE: The purpose of our study was to create a breast cancer risk estimation model based on the descriptors of the National Mammography Database using logistic regression that can aid in decision making for the early detection of breast cancer. MATERIALS AND METHODS: We created two logistic regression models based on the mammography features and demographic data for 62,219 consecutive mammography records from 48,744 studies in 18,269 [corrected] patients reported using the Breast Imaging Reporting and Data System (BI-RADS) lexicon and the National Mammography Database format between April 5, 1999 and February 9, 2004. State cancer registry outcomes matched with our data served as the reference standard. The probability of cancer was the outcome in both models. Model 2 was built using all variables in Model 1 plus radiologists' BI-RADS assessment categories. We used 10-fold cross-validation to train and test the model and to calculate the area under the receiver operating characteristic curves (A(z)) to measure the performance. Both models were compared with the radiologists' BI-RADS assessments. RESULTS: Radiologists achieved an A(z) value of 0.939 +/- 0.011. The A(z) was 0.927 +/- 0.015 for Model 1 and 0.963 +/- 0.009 for Model 2. At 90% specificity, the sensitivity of Model 2 (90%) was significantly better (p < 0.001) than that of radiologists (82%) and Model 1 (83%). At 85% sensitivity, the specificity of Model 2 (96%) was significantly better (p < 0.001) than that of radiologists (88%) and Model 1 (87%). CONCLUSION: Our logistic regression model can effectively discriminate between benign and malignant breast disease and can identify the most important features associated with breast cancer.
E. S. Burnside, J. Davis, J. Chhatwal, O. Alagoz, M. J. Lindstrom, B. M. Geller, B. Littenberg, K. A. Shaffer, C. E. Kahn Jr., and C. D. Page. 2009. “Probabilistic computer model developed from clinical data in national mammography database format to classify mammographic findings.” Radiology, 251, Pp. 663-72.
{PURPOSE: To determine whether a Bayesian network trained on a large database of patient demographic risk factors and radiologist-observed findings from consecutive clinical mammography examinations can exceed radiologist performance in the classification of mammographic findings as benign or malignant. MATERIALS AND METHODS: The institutional review board exempted this HIPAA-compliant retrospective study from requiring informed consent. Structured reports from 48 744 consecutive pooled screening and diagnostic mammography examinations in 18 269 patients from April 5, 1999 to February 9, 2004 were collected. Mammographic findings were matched with a state cancer registry, which served as the reference standard. By using 10-fold cross validation, the Bayesian network was tested and trained to estimate breast cancer risk by using demographic risk factors (age, family and personal history of breast cancer, and use of hormone replacement therapy) and mammographic findings recorded in the Breast Imaging Reporting and Data System lexicon. The performance of radiologists compared with the Bayesian network was evaluated by using area under the receiver operating characteristic curve (AUC), sensitivity, and specificity. RESULTS: The Bayesian network significantly exceeded the performance of interpreting radiologists in terms of AUC (0.960 vs 0.939