Reproducibility, reliability and validity of population-based administrative health data for the assessment of cancer non-related comorbidities

Date Published:



Background Patients with comorbidities do not receive optimal treatment for their cancer, leading to lower cancer survival. Information on individual comorbidities is not straightforward to derive from population-based administrative health datasets. We described the development of a reproducible algorithm to extract the individual Charlson index comorbidities from such data. We illustrated the algorithm with 1,789 laryngeal cancer patients diagnosed in England in 2013. We aimed to clearly set out and advocate the time-related assumptions specified in the algorithm by providing empirical evidence for them. Methods Comorbidities were assessed from hospital records in the ten years preceding cancer diagnosis and internal reliability of the hospital records was checked. Data were right-truncated 6 or 12 months prior to cancer diagnosis to avoid inclusion of potentially cancer-related comorbidities. We tested for collider bias using Cox regression. Results Our administrative data showed weak to moderate internal reliability to identify comorbidities (ICC ranging between 0.1 and 0.6) but a notably high external validity (86.3%). We showed a reverse protective effect of non-cancer related Chronic Obstructive Pulmonary Disease (COPD) when the effect is split into cancer and non-cancer related COPD (Age-adjusted HR: 0.95, 95% CI:0.7–1.28 for non-cancer related comorbidities). Furthermore, we showed that a window of 6 years before diagnosis is an optimal period for the assessment of comorbidities. Conclusion To formulate a robust approach for assessing common comorbidities, it is important that assumptions made are explicitly stated and empirically proven. We provide a transparent and consistent approach useful to researchers looking to assess comorbidities for cancer patients using administrative health data.