Publications

2020
Luque-Fernandez MA, Redondo-Sanchez D, Lee SF, Rodriguez-Barranco M, Carmona-Garcia M, Marcos-Gragera R, Sanchez MJ. Multimorbidity by Patient and Tumor Factors and Time-to-Surgery Among Colorectal Cancer Patients in Spain: A Population-Based Study. Clinical Epidemiology 2020;12:31-40.

Background: Cancer treatment and outcomes can be influenced by tumor characteristics, patient overall health status, and comorbidities. While previous studies have analyzed the influence of comorbidity on cancer outcomes, limited information is available regarding factors associated with the increased prevalence of comorbidities and multimorbidity among patients with colorectal cancer in Spain.

Patients and Methods: This cross-sectional study obtained data from all colorectal cancer cases diagnosed in two Spanish provinces in 2011 from two population-based cancer registries and electronic health records. We calculated the prevalence of comorbidities according to patient and tumor factors, identified factors associated with an increased prevalence of comorbidity and multimorbidity, analyzed the association between comorbidities and time-to-surgery, and developed an interactive web application (https://comcor.netlify.com/).

Results: The most common comorbidities were diabetes (23.6%), chronic obstructive pulmonary disease (17.2%), and congestive heart failure (14.5%). Among all comorbidities, 52% of patients were diagnosed at more advanced stages (stage III/IV). Patients with advanced age, restricted performance status or who were disabled, obese, and smokers had a higher prevalence of multimorbidity. Patients with multimorbidity had a longer time-to-surgery than those without comorbidity (17 days, 95% confidence interval: 3–29 days).

Conclusion: We identified a consistent pattern of factors associated with a higher prevalence of comorbidities and multimorbidity at diagnosis and an increased time-to-surgery among patients with colorectal cancer with multimorbidity in Spain. This pattern may provide insights for further etiological and preventive research and help to identify patients at a higher risk for poorer cancer outcomes and suboptimal treatment.

2019
León-Gómez BB, Gotsens M, Mari-Dell'Olmo M, Domínguez-Berjón MF, Luque-Fernandez MÁ, Martin U, Rodríguez-Sanz M, Pérez G. Bayesian smoothed small-areas analysis of urban inequalities in fertility across 1999–2013 [Internet]. Fertility Research and Practice 2019;5(1):17.
Since the 2008 economic crisis in Spain, overall fertility has continued to decrease, while urban inequalities have increased. There is a general lack of studies of fertility patterns in small-areas of Spanish cities. We explored the effects of the economic crisis on fertility during three time periods in urban settings in Spain.
Bellizzi S, Nivoli A, Salaris P, Ronzoni AR, Pichierri G, Palestra F, Wazwaz O, Luque-Fernandez MA. Sexual violence and eclampsia: analysis of data from Demographic and Health Surveys from seven low- and middle-income countries. Journal of Global Health 2019;9(2):020434.
BACKGROUND: Scientific literature has provided clear evidence of the profound impact of sexual violence on women's health, such as somatic disorders and adverse mental health outcomes. However, consequences related to obstetric complications are not yet completely clarified. This study aimed to assess the association of lifetime exposure to intimate partner sexual violence with eclampsia. METHODS: We considered all seven Demographic and Health Surveys (DHS) that included data on sexual violence and on signs and symptoms suggestive of eclampsia for women of reproductive age (15-49 years). We computed unadjusted and adjusted odds ratios (OR) to evaluate the risk of suggestive eclampsia by whether women had ever been subjected to sexual violence. A sensitivity analysis was conducted restricting the study population to women who had their last live birth over the 12 months before the interview. RESULTS: Self-reported experience of sexual violence ranged from 3.7% in Mali to 9.2% in India, while the prevalence of women reporting signs and symptoms compatible with eclampsia ranged from 14.3% in Afghanistan to 0.7% in the Philippines. Reported sexual violence was associated with a 2-fold increase in the odds of signs and symptoms suggestive of eclampsia in the pooled analysis. The sensitivity analysis confirmed the strength of the association between sexual violence and eclampsia in Afghanistan and in India. CONCLUSIONS: Women and girls in low- and middle-income countries are at high risk of sexual violence, which may represent a risk factor for hypertensive obstetric complications. Accurate counseling by health care providers during antenatal care consultations may represent an important opportunity to prevent adverse outcomes during pregnancy.
Hao M, Luque-Fernandez MA, Lopez D, Cote K, Newfield J, Connors M, Vaidya A. Benign Adrenocortical Tumors and the Detection of Nonadrenal Neoplasia [Internet]. International Journal of Endocrinology 2019;2019:9.
Context. Patients with adrenocortical tumors have been frequently observed to have nonadrenal neoplasia. Objective. To investigate whether patients with benign adrenocortical tumors have a higher likelihood of having nonadrenal neoplasia detected. Design and Participants. Case-control study of patients with benign adrenocortical tumors (cases; n = 400) and normal adrenal glands (controls; n = 400), who underwent repeated abdominal cross-sectional imaging. Main Outcomes. Primary analyses: association between case-control status and benign abdominal neoplasia detected via cross-sectional imaging. Secondary analyses: association between case-control status and tumors detected via other imaging modalities. Results. The mean interval of abdominal imaging was 4.7 (SD = 3.8) years for cases and 5.9 (4.8) years for controls. Cases were more likely to have detected intraductal papillary mucinous neoplasms (IPMNs) of the pancreas (8.5% vs. 4.5%, adjusted OR = 2.22, 95% CI (1.11, 4.63)) compared with controls. In secondary analyses, cases were more likely to have detected thyroid nodules (25.5% vs. 17.0%, adjusted OR = 1.77, 95% CI (1.15, 2.74)), hyperparathyroidism or parathyroid adenomas (3.5% vs. 1.3%, adjusted OR = 3.00, 95% CI (1.00, 11.64)), benign breast masses (6.0% vs. 3.3%, adjusted OR = 3.25, 95% CI (1.28, 8.78)), and benign prostatic hyperplasia (20.5% vs. 5.3%, adjusted OR = 3.20, 95% CI (1.14, 10.60)). Using a composite outcome, cases had higher odds of detection of the composite of IPMN, thyroid nodules, parathyroid tumors, benign breast masses, and prostate hyperplasia (adjusted OR = 2.36, 95% CI: 1.60, 3.50) when compared with controls. Conclusions. Patients with benign adrenocortical tumors had higher odds of detected pancreatic IPMN, as well as thyroid nodules, parathyroid tumors, benign breast masses, and prostate hyperplasia compared with patients with normal adrenal glands. 
These associations may have important implications for patient care and healthcare economics, regardless of whether they reflect incidental discoveries due to imaging detection or frequency bias, or a common risk for developing multiple neoplasia.
Puig-Barrachina V, Rodríguez-Sanz M, Domínguez-Berjón MF, Martín U, Luque MÁ, Ruiz M, Perez G. Decline in fertility induced by economic recession in Spain [Internet]. Gaceta Sanitaria 2019.
Objective: To describe trends in fertility in Spain before (pre-recession; 1998-2008) and during (recession period; 2009-2013) the economic crisis of 2008, taking into account women's age and regional unemployment in 2010. Method: The study consisted of a panel design including cross-sectional ecological data for the 17 regions of Spain. We describe fertility trends in Spain in two time periods, pre-recession (1998-2008) and recession (2009-2013). We used a cross-sectional, ecological study of Spanish-born women to calculate changes in fertility rates for each period using a linear regression model adjusted for year, period, and the interaction between them. Results: We found that, compared to the pre-recession period, the fertility rate in Spain generally decreased during the economic recession. However, in some regions, such as the Canary Islands, this decrease began before the onset of the recession, while in other regions, such as the Basque Country, the fertility rate continued to grow until 2011. The effects of the recession on the fertility rate are clearly observed in women aged 30-34 years. Conclusions: The current economic recession has disrupted the positive trend in fertility that began at the start of this century. Since Spain already had very low fertility rates, the further decline caused by the economic recession could jeopardize the sustainability of welfare-state systems.
Luque-Fernandez MA, Redondo-Sánchez D, Maringe C. cvauroc: Command to compute cross-validated area under the curve for ROC analysis after predictive modeling for binary outcomes [Internet]. The Stata Journal 2019;19(3):615-625.
Receiver operating characteristic (ROC) analysis is used for comparing predictive models in both model selection and model evaluation. ROC analysis is often applied in clinical medicine and social science to assess the tradeoff between model sensitivity and specificity. After fitting a binary logistic or probit regression model with a set of independent variables, the predictive performance of this set of variables can be assessed by the area under the curve (AUC) from an ROC curve. An important aspect of predictive modeling (regardless of model type) is the ability of a model to generalize to new cases. Evaluating the predictive performance (AUC) of a set of independent variables using all cases from the original analysis sample often results in an overly optimistic estimate of predictive performance. One can use K-fold cross-validation to generate a more realistic estimate of predictive performance in situations with a small number of observations. AUC is estimated iteratively for k samples (the “test” samples) that are independent of the sample used to predict the dependent variable (the “training” sample). cvauroc implements k-fold cross-validation for the AUC for a binary outcome after fitting a logit or probit regression model, averaging the AUCs corresponding to each fold, and bootstrapping the cross-validated AUC to obtain statistical inference and 95% confidence intervals. Furthermore, cvauroc optionally provides the cross-validated fitted probabilities for the dependent variable or outcome, contained in a new variable named \_fit; the sensitivity and specificity for each of the levels of the predicted outcome, contained in two new variables named \_sen and \_spe; and the plot of the mean cross-validated AUC and k-fold ROC curves.
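The k-fold idea described in this abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the cvauroc Stata command: the helper names (`auc`, `fit_logit`, `stratified_folds`, `cv_auc`), the single-predictor gradient-ascent logit, the stratified fold assignment, and the simulated data are all assumptions made for the example, and the bootstrap confidence interval mentioned in the abstract is omitted.

```python
import math
import random


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 1/2."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logit(x, y, steps=500, lr=0.1):
    """One-predictor logistic regression fitted by plain gradient ascent (toy fitter)."""
    b0 = b1 = 0.0
    n = len(x)
    for _ in range(steps):
        resid = [yi - sigmoid(b0 + b1 * xi) for xi, yi in zip(x, y)]
        b0 += lr * sum(resid) / n
        b1 += lr * sum(r * xi for r, xi in zip(resid, x)) / n
    return b0, b1


def stratified_folds(y, k, seed):
    """Assign indices to k folds within each class so every fold holds both classes."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        members = [i for i, yi in enumerate(y) if yi == cls]
        rng.shuffle(members)
        for j, i in enumerate(members):
            folds[j % k].append(i)
    return folds


def cv_auc(x, y, k=5, seed=1):
    """Fit on k-1 folds, score the held-out fold, and average the k test AUCs."""
    fold_aucs = []
    for test in stratified_folds(y, k, seed):
        held_out = set(test)
        train = [i for i in range(len(x)) if i not in held_out]
        b0, b1 = fit_logit([x[i] for i in train], [y[i] for i in train])
        scores = [b0 + b1 * x[i] for i in test]
        fold_aucs.append(auc([y[i] for i in test], scores))
    return sum(fold_aucs) / k, fold_aucs


# Simulated data (assumed for illustration): one predictor, binary outcome.
rng = random.Random(42)
x = [rng.gauss(0, 1) for _ in range(200)]
y = [1 if rng.random() < sigmoid(1.5 * xi) else 0 for xi in x]

b0, b1 = fit_logit(x, y)
apparent = auc(y, [b0 + b1 * xi for xi in x])  # same cases used to fit and evaluate
cv_mean, per_fold = cv_auc(x, y, k=5)
print(f"apparent AUC: {apparent:.3f}  cross-validated AUC: {cv_mean:.3f}")
```

With a single predictor the apparent and cross-validated AUCs will usually be close; the optimism the abstract refers to tends to grow as more predictors are added relative to the sample size.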
Schomaker M, Luque-Fernandez MA, Leroy V, Davies MA. Using longitudinal targeted maximum likelihood estimation in complex settings with dynamic interventions [Internet]. Statistics in Medicine 2019.
Longitudinal targeted maximum likelihood estimation (LTMLE) has very rarely been used to estimate dynamic treatment effects in the context of time-dependent confounding affected by prior treatment when faced with long follow-up times, multiple time-varying confounders, and complex associational relationships simultaneously. Reasons for this include the potential computational burden, technical challenges, restricted modeling options for long follow-up times, and limited practical guidance in the literature. However, LTMLE has desirable asymptotic properties, i.e., it is doubly robust, and can yield valid inference when used in conjunction with machine learning. It also has the advantage of easy-to-calculate analytic standard errors in contrast to the g-formula, which requires bootstrapping. We use a topical and sophisticated question from HIV treatment research to show that LTMLE can be used successfully in complex realistic settings, and we compare results to competing estimators. Our example illustrates the following practical challenges common to many epidemiological studies: (1) long follow-up time (30 months); (2) gradually declining sample size; (3) limited support for some intervention rules of interest; (4) a high-dimensional set of potential adjustment variables, increasing both the need and the challenge of integrating appropriate machine learning methods; and (5) consideration of collider bias.
Our analyses, as well as simulations, shed new light on the application of LTMLE in complex and realistic settings: We show that (1) LTMLE can yield stable and good estimates, even when confronted with small samples and limited modeling options; (2) machine learning utilized with a small set of simple learners (if more complex ones cannot be fitted) can outperform a single, complex model, which is tailored to incorporate prior clinical knowledge; and (3) performance can vary considerably depending on interventions and their support in the data, and therefore critical quality checks should accompany every LTMLE analysis. We provide guidance for the practical application of LTMLE.
Luque-Fernandez MA, Thomas A, Gelaye B, Racape J, Sanchez MJ, Williams MA. Secular trends in stillbirth by maternal socioeconomic status in Spain 2007–15: a population-based study of 4 million births [Internet]. European Journal of Public Health 2019.

Stillbirth, one of the urgent concerns of preventable perinatal deaths, has wide-reaching consequences for society. We studied secular stillbirth trends by maternal socioeconomic status (SES) in Spain. We developed a population-based observational study, including 4 083 919 births during 2007–15. We estimated stillbirth rates and secular trends by maternal SES. We also evaluated the joint effect of maternal educational attainment and the Human Development Index (HDI) of women’s country of origin on the risk of stillbirth. The data and statistical analysis can be accessed for reproducibility in a GitHub repository: https://github.com/migariane/Stillbirth

We found a consistent pattern of socioeconomic inequalities in the risk of delivering a stillborn, mainly characterized by a persistently higher risk, over time, among women with lower SES. Overall, women from countries with low HDIs and low educational attainment had an approximately four times higher risk of stillbirth (RR: 4.44; 95%CI: 3.71–5.32). Furthermore, we found a paradoxical reduction of the stillbirth gap over time between the highest and the lowest SESs, which is mostly due to the significant and increasing trend of stillbirth risk among highly educated women of advanced maternal age.

Our findings highlight no improvement in stillbirth rates among women of lower SES and an increasing trend among highly educated women of advanced maternal age over recent years. Public health policies that develop preventive programmes to reduce stillbirth rates among women with lower SES are needed, as is further study to understand the growing trend of age-related stillbirths among highly educated women in Spain.

Lee SF, Luque-Fernandez MA. Is cancer-related death associated with circadian rhythm? [Internet]. Cancer Communications 2019;39(1):27.

We investigated the temporal pattern of death in cancer patients using a large sample size and robust statistical methods to account for chronobiological periodicity.

We did not detect a circadian pattern of cancer death. The present study evaluated the temporal pattern of death among cancer patients using trigonometric functions and with time modeled in a circular scale.

To conclude, we found no evidence of a chronobiological circadian pattern in death among cancer patients by using robust statistical methods and data from a large population in a hospital setting. Increased understanding of the temporal pattern of deaths may yield important insights toward understanding external factors associated with cancer death.
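As a rough illustration of what modeling time on a circular scale with trigonometric functions can look like, the sketch below maps each hour of death to an angle on the 24-hour clock and applies a Rayleigh test of circular uniformity. This is an assumed, simplified stand-in and not necessarily the method used in the paper; the function name `rayleigh_test`, the first-order p-value approximation exp(-Z), and the toy input data are all choices made for this example.

```python
import math


def rayleigh_test(hours, period=24.0):
    """Rayleigh test of uniformity for clock times.

    Returns (mean resultant length, Z statistic, approximate p-value, peak hour).
    The p-value uses the first-order approximation p = exp(-Z).
    """
    n = len(hours)
    angles = [2 * math.pi * (h % period) / period for h in hours]
    c = sum(math.cos(a) for a in angles) / n   # mean cosine component
    s = sum(math.sin(a) for a in angles) / n   # mean sine component
    r_bar = math.hypot(c, s)                   # 0 = no clustering, 1 = all at one time
    z = n * r_bar ** 2
    p_value = min(1.0, math.exp(-z))
    peak = (math.atan2(s, c) % (2 * math.pi)) * period / (2 * math.pi)
    return r_bar, z, p_value, peak


# Toy data (hypothetical): deaths spread evenly over the day vs. clustered at 03:00-04:00.
even = [float(h) for h in range(24)] * 10
clustered = [3.0] * 50 + [4.0] * 50

for name, sample in (("even", even), ("clustered", clustered)):
    r_bar, z, p_value, peak = rayleigh_test(sample)
    print(f"{name}: R = {r_bar:.3f}, Z = {z:.2f}, p ~ {p_value:.3g}, peak hour ~ {peak:.2f}")
```

A small mean resultant length with a large p-value is what one would expect when, as reported above, there is no circadian pattern; the peak hour is only meaningful when the test suggests clustering.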

Luque-Fernández MÁ, Calduch EN. Education in public health, epidemiology and biostatistics in Spain from a global and comparative perspective [Internet]. Gaceta Sanitaria 2019.
In the last ten years, there have been intense debates to boost the discipline, make it relevant to the genomic revolution, and place it at the forefront of the digital era. As a result, training in public health and epidemiology has been renewed, with marked statistical and methodological reinforcement, such as the current emphasis on causal inference, or the inclusion of a master's in health data science as a new academic degree in some schools of public health (https://www.hsph.harvard.edu/health-data-science/).
Luque-Fernandez MA, Redondo-Sanchez D, Schomaker M. Effect Modification and Collapsibility in Evaluations of Public Health Interventions [Internet]. American Journal of Public Health 2019;109(3):e12-e13.

The Evaluating Public Health Interventions AJPH series offers excellent practical guidance to public health researchers. The eighth part of the series provides a valuable introduction to effect estimations of time-invariant public health interventions. In their commentary, Spiegelman and Zhou suggest that, in terms of bias and efficiency, there is no advantage to using modern causal inference methods over classical multivariable modeling. However, this statement is not always true. Most importantly, both effect modification and collapsibility are critical concepts when assessing the validity of using regression for causal effect estimation.

Suppose that one is interested in the effect of combined radiotherapy and chemotherapy versus chemotherapy only on one-year mortality among patients diagnosed with colorectal cancer. A clinician may ask: how different would the risk of death have been had everyone received dual therapy as compared with if everyone had experienced monotherapy? The causal marginal odds ratio (MOR) offers an answer to this question. Each individual has a pair of potential outcomes: the outcome he or she would have experienced had he or she been exposed to dual treatment (A = 1), denoted Y(1), and the outcome had he or she been unexposed, Y(0). The MOR is defined as

[P(Y(1) = 1)/(1 − P(Y(1) = 1))]/[P(Y(0) = 1)/(1 − P(Y(0) = 1))].

A common approach would be to use logistic regression to model the odds of mortality given the intervention and adjust for confounders (W) such as clinical stage and comorbidities. Note that this regression will provide an estimate of the conditional odds ratio (COR), which is

[P(Y = 1 | A = 1, W)/(1 − P(Y = 1 | A = 1, W))]/[P(Y = 1 | A = 0, W)/(1 − P(Y = 1 | A = 0, W))].

The MOR and COR are typically not identical. First, if there is effect modification (e.g., if the effect of dual therapy is different between patients with no comorbidities and those who have hypertension), logistic regression including an interaction term will not provide a marginal effect estimate but only the conditional effect of the interaction term between dual therapy and hypertension. Second, the odds ratio is noncollapsible, which means that the MOR is not necessarily equal to the stratum-specific odds ratio (i.e., the COR). This holds even when a covariate is related to the outcome but not the intervention and is thus not a confounder.
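Noncollapsibility can be checked with a small worked example. The numbers below are hypothetical and chosen only for illustration: a binary covariate W (say, a comorbidity) with P(W = 1) = 0.5 that is independent of treatment A, so it is not a confounder, and stratum-specific risks that give the same conditional odds ratio in both strata. The marginal risks P(Y(a) = 1) are obtained by averaging the stratum-specific risks over the distribution of W.

```python
def odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)


# Hypothetical risks of death P(Y = 1 | A = a, W = w); keys are (a, w).
risk = {
    (1, 0): 0.5, (0, 0): 0.2,   # stratum W = 0
    (1, 1): 0.8, (0, 1): 0.5,   # stratum W = 1
}
p_w1 = 0.5  # P(W = 1); W is independent of A, so it does not confound

# Conditional odds ratios (COR), one per stratum of W.
cor_w0 = odds(risk[(1, 0)]) / odds(risk[(0, 0)])
cor_w1 = odds(risk[(1, 1)]) / odds(risk[(0, 1)])

# Marginal risks: average the stratum risks over the distribution of W.
p_y1 = (1 - p_w1) * risk[(1, 0)] + p_w1 * risk[(1, 1)]   # P(Y(1) = 1)
p_y0 = (1 - p_w1) * risk[(0, 0)] + p_w1 * risk[(0, 1)]   # P(Y(0) = 1)

# Marginal odds ratio (MOR), following the definition in the text.
mor = odds(p_y1) / odds(p_y0)

print(f"COR (W=0) = {cor_w0:.2f}, COR (W=1) = {cor_w1:.2f}, MOR = {mor:.2f}")
```

Here both conditional odds ratios equal 4, yet the marginal odds ratio is about 3.45, even though W is unrelated to the intervention and there is no effect modification on the odds-ratio scale.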

Extended note: Monte Carlo simulations and code supporting the letter can be found at https://github.com/migariane/hetmor

Luque-Fernandez MA, Redondo-Sanchez D, Rodriguez-Barranco M, Carmona Garcia MC, Marcos-Gragera R, Sanchez MJ. The pattern of Comorbidities and Associated Risk Factors among Colorectal Cancer Patients in Spain: CoMCoR study [Internet]. bioRxiv 2019.
Colorectal cancer is the second most frequently diagnosed cancer in Spain. Cancer treatment and outcomes can be influenced by tumor characteristics, patient general health status and comorbidities. Numerous studies have analyzed the influence of comorbidity on cancer outcomes, but limited information is available regarding the frequency and distribution of comorbidities in colorectal cancer patients, particularly elderly ones, in the Spanish population. We developed a population-based study of all incident colorectal cases diagnosed in Spain in 2011 to describe the frequency and distribution of comorbidities, as well as tumor and healthcare factors. Data were obtained from two population-based cancer registries and the complete version of patients' digitalized clinical records history. We then characterized the most prevalent comorbidities, as well as dementia and multimorbidity, and developed an interactive web application to visualize our findings (http://watzilei.com/shiny/CoMCoR/). The most common comorbidities were diabetes (23.6%), chronic obstructive pulmonary disease (17.2%), and congestive heart failure (14.5%). Dementia was the most common comorbidity among patients aged >=75 years. Patients with dementia had a 30% higher prevalence of being diagnosed at stage IV and the highest prevalence of emergency hospital admission after colorectal cancer diagnosis (33%). Colorectal cancer patients with dementia were nearly three times more likely to not be offered surgical treatment. Age >=75 years, obesity, male sex, being a current smoker, having surgery more than 60 days after cancer diagnosis, and not receiving surgical treatment were associated with a higher risk of multimorbidity. Patients with multimorbidity aged >=75 years showed a higher prevalence of hospital emergency admission followed by surgery the same day of the admission (37%). 
We found a consistent pattern in the distribution and frequency of comorbidities and multimorbidity among colorectal cancer patients. The high frequency of stage IV diagnosis among patients with dementia and the high proportion of older patients not receiving surgical treatment are significant findings that require policy actions.
Fernandez MAL, Belot A, Ndiaye A, Luque-Fernandez M-A, Kipourou D-K, Maringe C, Rubio FJ, Rachet B. Summarizing and communicating on survival data according to the audience: a tutorial on different measures illustrated with population-based cancer registry data [Internet]. Clinical Epidemiology 2019.
Survival data analysis results are usually communicated through the overall survival probability. Alternative measures provide additional insights and may help in communicating the results to a wider audience. We describe these alternative measures in two data settings, the overall survival setting and the relative survival setting, the latter corresponding to the particular competing risk setting in which the cause of death is unavailable or unreliable. In the overall survival setting, we describe the overall survival probability, the conditional survival probability and the restricted mean survival time (restricted to a prespecified time window). In the relative survival setting, we describe the net survival probability, the conditional net survival probability, the restricted mean net survival time, the crude probability of death due to each cause and the number of life years lost due to each cause over a prespecified time window. These measures describe survival data either on a probability scale or on a timescale. The clinical or population health purpose of each measure is detailed, and their advantages and drawbacks are discussed. We then illustrate their use analyzing England population-based registry data of men 15–80 years old diagnosed with colon cancer in 2001–2003, aiming to describe the deprivation disparities in survival. We believe that both the provision of a detailed example of the interpretation of each measure and the software implementation will help in generalizing their use.
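Three of the overall-survival measures listed in this abstract (survival probability, conditional survival probability, and restricted mean survival time) can be sketched with a minimal Kaplan-Meier estimator. This is an illustrative sketch on a made-up four-patient dataset, not the software implementation accompanying the tutorial; the function names are hypothetical, and the relative-survival measures (net survival, crude probabilities of death, life years lost) are not shown because they require population life tables.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier curve as a list of (event time, survival just after that time).

    events[i] is 1 for an observed death and 0 for a censored observation.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = leaving = 0
        while i < len(data) and data[i][0] == t:   # group tied times
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= leaving
    return curve


def survival_at(curve, t):
    """Estimated probability of surviving beyond time t (step function)."""
    s = 1.0
    for time, surv in curve:
        if time > t:
            break
        s = surv
    return s


def conditional_survival(curve, t, s):
    """Probability of surviving beyond t given survival beyond s (with s < t)."""
    return survival_at(curve, t) / survival_at(curve, s)


def rmst(curve, tau):
    """Restricted mean survival time: area under the survival curve on [0, tau]."""
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for time, surv in curve:
        if time >= tau:
            break
        area += prev_s * (time - prev_t)
        prev_t, prev_s = time, surv
    return area + prev_s * (tau - prev_t)


# Made-up follow-up times in years; the third patient is censored at 3 years.
times = [1, 2, 3, 4]
events = [1, 1, 0, 1]
curve = kaplan_meier(times, events)

print("S(2) =", survival_at(curve, 2))
print("S(3 | survived 1) =", conditional_survival(curve, 3, 1))
print("RMST up to 4 years =", rmst(curve, 4))
```

The survival probabilities are on a probability scale, whereas the restricted mean is on a timescale (years lived, on average, within the first four years), which is the distinction the abstract draws between the two families of measures.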
Luque-Fernandez MA, Schomaker M, Redondo-Sanchez D, Perez MJS, Vaidya A, Schnitzer ME. Educational Note: Paradoxical collider effect in the analysis of non-communicable disease epidemiological data: a reproducible illustration and web application [Internet]. International Journal of Epidemiology 2019:dyy275.
Classical epidemiology has focused on the control of confounding, but it is only recently that epidemiologists have started to focus on the bias produced by colliders. A collider for a certain pair of variables (e.g. an outcome Y and an exposure A) is a third variable (C) that is caused by both. In a directed acyclic graph (DAG), a collider is the variable in the middle of an inverted fork (i.e. the variable C in A → C ← Y). Controlling for, or conditioning an analysis on a collider (i.e. through stratification or regression) can introduce a spurious association between its causes. This potentially explains many paradoxical findings in the medical literature, where established risk factors for a particular outcome appear protective. We use an example from non-communicable disease epidemiology to contextualize and explain the effect of conditioning on a collider. We generate a dataset with 1000 observations, and run Monte-Carlo simulations to estimate the effect of 24-h dietary sodium intake on systolic blood pressure, controlling for age, which acts as a confounder, and 24-h urinary protein excretion, which acts as a collider. We illustrate how adding a collider to a regression model introduces bias. Thus, to prevent paradoxical associations, epidemiologists estimating causal effects should be wary of conditioning on colliders. We provide R code in easy-to-read boxes throughout the manuscript, and a GitHub repository [https://github.com/migariane/ColliderApp] for the reader to reproduce our example. We also provide an educational web application allowing real-time interaction to visualize the paradoxical effect of conditioning on a collider [http://watzilei.com/shiny/collider/].
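The effect of conditioning on a collider can be reproduced with a short simulation. The sketch below is written in Python rather than the R code the authors provide, and the data-generating coefficients (a true sodium effect of 1.05 on systolic blood pressure, with age as confounder and urinary protein as collider) are assumptions chosen for illustration; the `ols` and `simulate` helpers are hypothetical names. For the authors' own reproducible code, see the GitHub repository cited in the abstract.

```python
import random


def ols(columns, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(p)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(p)]
    for c in range(p):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]


def simulate(n=1000, seed=2019):
    """Assumed structure: age -> sodium, age -> SBP, sodium -> SBP,
    and both sodium and SBP -> urinary protein (the collider)."""
    rng = random.Random(seed)
    age = [rng.gauss(65, 5) for _ in range(n)]
    sodium = [a / 18 + rng.gauss(0, 1) for a in age]
    sbp = [1.05 * s + 2.0 * a + rng.gauss(0, 1) for s, a in zip(sodium, age)]
    protein = [2.0 * b + 2.8 * s + rng.gauss(0, 1) for b, s in zip(sbp, sodium)]
    return age, sodium, sbp, protein


age, sodium, sbp, protein = simulate()

# Adjusting for the confounder only: the sodium slope should sit near the true 1.05.
b_confounder = ols([sodium, age], sbp)[1]
# Adding the collider: the sodium slope is pulled away from the truth.
b_collider = ols([sodium, age, protein], sbp)[1]

print(f"sodium effect, adjusted for age:           {b_confounder:.2f}")
print(f"sodium effect, adjusted for age + protein: {b_collider:.2f}")
```

Under these assumed coefficients, the algebra puts the collider-adjusted sodium slope near -0.91 in large samples, so a harmful exposure appears protective, which is the kind of paradoxical finding the abstract describes.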
2018
Perez G, Gotsens M, Cevallos-Garcia C, Dominguez-Berjon MF, Diez E, Bacigalupe A, Palencia L, Leon-Gomez BB, Luque-Fernandez MA, Mari-DellOlmo M, Martin U, Puig-Barrachina V, Rodriguez-Sanz M, Ruiz M. The impact of the economic recession on inequalities in induced abortion in the main cities of Spain. European Journal of Public Health 2018.
The aim of this study was to analyse the trends in socioeconomic inequalities in induced abortion during the pre-crisis and crisis periods in the postcodes of two major cities of Spain. Ecological regression models showed that rates of induced abortion tended to increase between the two pre-crisis periods, but remained stable between the second pre-crisis period and the crisis period. In addition, we observed socioeconomic inequalities in induced abortion in both cities and in all age groups, and these inequalities persisted across the three study periods.
Luque-Fernandez MA, Schomaker M, Redondo-Sanchez D, Perez MJS, Vaidya A, Schnitzer ME. Educational Note: Paradoxical Collider Effect in the Analysis of Non-Communicable Disease Epidemiological Data: a reproducible illustration and web application. arXiv preprint arXiv:1809.07111 2018.
Classical epidemiology has focused on the control of confounding but it is only recently that epidemiologists have started to focus on the bias produced by colliders. A collider for a certain pair of variables (e.g., an outcome Y and an exposure A) is a third variable (C) that is caused by both. In a directed acyclic graph (DAG), a collider is the variable in the middle of an inverted fork (i.e., the variable C in A -> C <- Y). Controlling for, or conditioning an analysis on a collider (i.e., through stratification or regression) can introduce a spurious association between its causes. This potentially explains many paradoxical findings in the medical literature, where established risk factors for a particular outcome appear protective. We use an example from non-communicable disease epidemiology to contextualize and explain the effect of conditioning on a collider. We generate a dataset with 1,000 observations and run Monte-Carlo simulations to estimate the effect of 24-hour dietary sodium intake on systolic blood pressure, controlling for age, which acts as a confounder, and 24-hour urinary protein excretion, which acts as a collider. We illustrate how adding a collider to a regression model introduces bias. Thus, to prevent paradoxical associations, epidemiologists estimating causal effects should be wary of conditioning on colliders. We provide R-code in easy-to-read boxes throughout the manuscript and a GitHub repository (https://github.com/migariane/ColliderApp) for the reader to reproduce our example. We also provide an educational web application allowing real-time interaction to visualize the paradoxical effect of conditioning on a collider (http://watzilei.com/shiny/collider/).
Rodríguez-Barranco M, Salamanca-Fernández E, Fajardo ML, Bayo E, Chang-Chan Y-L, Expósito J, García C, Tallón J, Minicozzi P, Sant M, Petrova D, Sánchez M-J, Luque-Fernandez MA. Patient, tumor, and healthcare factors associated with regional variability in lung cancer survival: a Spanish high-resolution population-based study [Internet]. Clinical and Translational Oncology 2018.

PURPOSE:

The third most frequently diagnosed cancer in Europe in 2018 was lung cancer; it is also the leading cause of cancer death in Europe. We studied patient and tumor characteristics, and patterns of healthcare provision explaining regional variability in lung cancer survival in southern Spain.

METHODS:

A population-based cohort study included all 1196 incident first invasive primary lung cancer (C33-C34 according to ICD-10) cases diagnosed between 2010 and 2011 with follow-up until April 2015. Data were drawn from local population-based cancer registries and patients' hospital medical records from all public and private hospitals from two regions in southern Spain.

RESULTS:

There was evidence of regional differences in lung cancer late diagnosis (58% stage IV in Granada vs. 65% in Huelva, p value < 0.001). Among patients with stage I, only 67% received surgery compared with 0.6% of patients with stage IV. Patients treated with a combination of radiotherapy, chemotherapy, and surgery had a 2-year mortality risk reduction of 94% compared with patients who did not receive any treatment (excess mortality risk 0.06; 95% CI 0.02-0.16). Geographical differences in survival were observed between the two regions: 35% vs. 26% at 1-year since diagnosis.

CONCLUSIONS:

The observed geographic differences in survival between regions are due in part to late cancer diagnosis, which determines the use of less effective therapeutic options. Results from our study justify the need for promoting lung cancer early detection strategies and the harmonization of best practice in lung cancer management and treatment.

 
Belot A, Fowler H, Njagi EN, Luque-Fernandez M-A, Maringe C, Magadi W, Exarchakou A, Quaresma M, Turculet A, Peake MD, Navani N, Rachet B. Association between age, deprivation and specific comorbid conditions and the receipt of major surgery in patients with non-small cell lung cancer in England: A population-based study [Internet]. Thorax 2018.
Introduction: We investigated socioeconomic disparities and the role of the main prognostic factors in receiving major surgical treatment in patients with lung cancer in England.

Methods: Our study comprised 31 351 patients diagnosed with non-small cell lung cancer in England in 2012. Data from the national population-based cancer registry were linked to Hospital Episode Statistics and National Lung Cancer Audit data to obtain information on stage, performance status and comorbidities, and to identify patients receiving major surgical treatment. To describe the association between prognostic factors and surgery, we performed two different analyses: one using multivariable logistic regression and one estimating cause-specific hazards for death and surgery. In both analyses, we used multiple imputation to deal with missing data.

Results: We showed strong evidence that the comorbidities ‘congestive heart failure’, ‘cerebrovascular disease’ and ‘chronic obstructive pulmonary disease’ reduced the receipt of surgery in early stage patients. We also observed gender differences and substantial age differences in the receipt of surgery. Despite accounting for sex, age at diagnosis, comorbidities, stage at diagnosis, performance status and indication of having had a PET-CT scan, the socioeconomic differences persisted in both analyses: more deprived people had lower odds and lower rates of receiving surgery in early stage lung cancer.

Discussion: Comorbidities play an important role in whether patients undergo surgery, but do not completely explain the socioeconomic difference observed in early stage patients. Future work investigating access to and distance from specialist hospitals, as well as patient perceptions and patient choice in receiving surgery, could help disentangle these persistent socioeconomic inequalities.
Larrabure-Torrealva GT, Martinez S, Luque-Fernandez MA, Sanchez SE, Mascaro PA, Ingar H, Castillo W, Zumaeta R, Grande M, Motta V, Pacora P, Gelaye B, Williams MA. Prevalence and risk factors of gestational diabetes mellitus: findings from a universal screening feasibility program in Lima, Peru [Internet]. BMC Pregnancy and Childbirth 2018;18(1):303.
Gestational diabetes mellitus (GDM) is a global public health concern with potential implications for the health of a mother and her offspring. However, data on the prevalence and risk factors of GDM in Latin America are scarce.
Luque‐Fernandez MA, Schomaker M, Rachet B, Schnitzer ME. Targeted maximum likelihood estimation for a binary treatment: A tutorial [Internet]. Statistics in Medicine 2018;2018.

When estimating the average effect of a binary treatment (or exposure) on an outcome, methods that incorporate propensity scores, the G‐formula, or targeted maximum likelihood estimation (TMLE) are preferred over naïve regression approaches, which are biased under misspecification of a parametric outcome model. In contrast, propensity score methods require the correct specification of an exposure model. Double‐robust methods only require correct specification of either the outcome or the exposure model. Targeted maximum likelihood estimation is a semiparametric double‐robust method that improves the chances of correct model specification by allowing for flexible estimation using (nonparametric) machine‐learning methods. It therefore requires weaker assumptions than its competitors. We provide a step‐by‐step guided implementation of TMLE and illustrate it in a realistic scenario based on cancer epidemiology where assumptions about correct model specification and positivity (ie, when a study participant had 0 probability of receiving the treatment) are nearly violated. This article provides a concise and reproducible educational introduction to TMLE for a binary outcome and exposure. The reader should gain sufficient understanding of TMLE from this introductory tutorial to be able to apply the method in practice. Extensive R‐code is provided in easy‐to‐read boxes throughout the article for replicability. Stata users will find a testing implementation of TMLE and additional material in Appendix S1 and at the following GitHub repository:

https://github.com/migariane/SIM-TMLE-tutorial
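The double-robust targeting step summarized in this abstract can be sketched in a few lines of Python. This is not the tutorial's R or Stata code: the simulated data, the probabilities used to generate them, the seed, and all variable names are assumptions made for illustration. The initial outcome model is deliberately misspecified (it ignores the confounder W), and simple stratum means stand in for the flexible machine-learning estimators the tutorial discusses, so that the effect of the TMLE update is visible.

```python
import math
import random

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    return math.log(p / (1.0 - p))

def mean(values):
    return sum(values) / len(values)

random.seed(2018)  # arbitrary seed, for reproducibility only
n = 20000

# Assumed data-generating process with one binary confounder W.
# The simulated average treatment effect (risk difference) is 0.2.
W = [int(random.random() < 0.5) for _ in range(n)]
A = [int(random.random() < 0.2 + 0.5 * w) for w in W]
Y = [int(random.random() < 0.2 + 0.2 * a + 0.4 * w) for a, w in zip(A, W)]

# Step 1: initial outcome model E[Y | A], deliberately ignoring W.
Q = {a: mean([y for y, ai in zip(Y, A) if ai == a]) for a in (0, 1)}
naive_ate = Q[1] - Q[0]

# Step 2: propensity score g(W) = P(A = 1 | W), estimated by stratum means.
g = {w: mean([a for a, wi in zip(A, W) if wi == w]) for w in (0, 1)}

# Step 3: clever covariate H(A, W) = A / g(W) - (1 - A) / (1 - g(W)).
H = [a / g[w] - (1 - a) / (1 - g[w]) for a, w in zip(A, W)]

# Step 4: fluctuation parameter epsilon, fitted by Newton-Raphson on a
# logistic model for Y with offset logit(Q(A)) and single covariate H.
offset = [logit(Q[a]) for a in A]
eps = 0.0
for _ in range(50):
    p = [expit(o + eps * h) for o, h in zip(offset, H)]
    score = sum(h * (y - pi) for h, y, pi in zip(H, Y, p))
    info = sum(h * h * pi * (1 - pi) for h, pi in zip(H, p))
    step = score / info
    eps += step
    if abs(step) < 1e-10:
        break

# Step 5: updated predictions under A = 1 and A = 0, then the plug-in ATE.
Q1 = [expit(logit(Q[1]) + eps / g[w]) for w in W]
Q0 = [expit(logit(Q[0]) - eps / (1 - g[w])) for w in W]
tmle_ate = mean([q1 - q0 for q1, q0 in zip(Q1, Q0)])

print(f"naive ATE: {naive_ate:.3f}  TMLE ATE: {tmle_ate:.3f}")
```

Because the propensity score is correctly specified in this sketch, the targeted estimate should land near the simulated effect of 0.2 even though the initial outcome model is wrong, while the naive difference in means remains confounded by W.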

