Working Paper
Glynn AN, Rueda M. Post-Instrument Bias. Working Paper.

Post-instrument covariates are often included in IV analyses to address a violation of the exclusion restriction. We demonstrate that even in linear constant-effects models with large samples: 1) invariance between IV estimates (with and without post-instrument covariates) does not imply that the exclusion restriction holds with respect to the post-instrument covariate, 2) OLS with an omitted variable will often have less bias than IV with the post-instrument covariate, 3) measurement error in the post-instrument covariate does not necessarily lead to attenuation, and 4) the biases of OLS and IV are related. Therefore, if used, IV with a post-instrument covariate should always be paired with OLS, and results should be discussed in concert. We illustrate these points with a re-analysis of Acemoglu, Johnson, and Robinson (2001), showing that for the paper’s claims to be valid, at least 35% of the variance in the causal variable must be due to measurement error.

Blackwell M, Glynn AN. How to Make Causal Inferences with Time-Series Cross-Sectional Data under Selection on Observables. Working Paper.
Glynn AN, Kashin K. Front-door Versus Back-door Adjustment with Unmeasured Confounding: Bias Formulas for Front-door and Hybrid Adjustments. Working Paper.
In this paper, we develop bias formulas for front-door estimates and front-door/back-door hybrid estimates of average treatment effects under general patterns of measured and unmeasured confounding. These bias formulas allow for sensitivity analysis, and also allow for comparisons of the bias resulting from standard back-door covariate adjustments (also known as direct adjustment and standardization). We also present these bias comparisons in two special cases: linear structural equation models and nonrandomized program evaluations with one-sided noncompliance. These comparisons demonstrate that there are broad classes of applications for which the front-door or hybrid adjustments will be preferred to the back-door adjustments. We illustrate this point with an application to the National JTPA (Job Training Partnership Act) Study, showing that by using information on enrollment in addition to pre-treatment covariates, the front-door approach provides estimates that are closer to the experimental benchmark than the back-door approach.
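As a rough illustration of the linear structural equation special case, the sketch below contrasts a naive regression with a front-door estimate when an unmeasured confounder is present. It is a textbook-style simulation and not the paper's bias formulas: the data-generating process, the coefficient values, and the names `ols`, `front_door_linear`, and `simulate` are all assumptions made for this example, and the front-door criterion is assumed to hold exactly.

```python
import numpy as np


def ols(columns, y):
    """Least-squares coefficients of y on an intercept plus the given columns."""
    X = np.column_stack([np.ones(len(y))] + list(columns))
    return np.linalg.lstsq(X, y, rcond=None)[0]


def front_door_linear(a, m, y):
    """Front-door estimate in a linear model: (A -> M) times (M -> Y given A)."""
    a_to_m = ols([a], m)[1]      # slope of the mediator on the treatment
    m_to_y = ols([m, a], y)[1]   # slope of the outcome on the mediator, holding A fixed
    return a_to_m * m_to_y


def simulate(seed=0, n=50_000):
    """Hypothetical linear SEM with unmeasured confounder u; true effect of a on y is 1.0."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)                  # unmeasured confounder
    a = u + rng.normal(size=n)              # treatment depends on u
    m = 0.5 * a + rng.normal(size=n)        # mediator depends only on a
    y = 2.0 * m + u + rng.normal(size=n)    # outcome depends on m and u
    naive = ols([a], y)[1]                  # unadjusted slope, confounded by u
    return naive, front_door_linear(a, m, y)
```

In this simulated setup the naive slope is pulled away from the true effect of 1.0 by the confounder, while the product of the two front-door slopes stays close to it. Real applications would rarely satisfy the assumptions this cleanly, which is the situation the bias formulas address.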
Glynn AN, Quinn KM. Structural Causal Models and the Specification of Time-Series-Cross-Section Models. Submitted.
The structural causal models (SCM) of Pearl (1995, 2000, 2009) provide a graphical criterion for choosing the “right hand side” variables to include in a model. In this paper, we use SCMs to address the question of whether to include lagged variables in time-series-cross-section (TSCS) models. This question has received a great deal of attention from political methodologists, but unfortunately, the practical advice for applied researchers that comes out of this literature varies considerably from article to article. We attempt to clarify the nature of some of these disagreements and to provide useful tools to reason about the nonparametric identification of causal effects. After clarifying the debate between Beck and Katz (1996, 2011) and Achen (2000) and adding to the discussion by Keele and Kelly (2006), we provide concrete nonparametric identification results for commonly studied TSCS data generating processes. These results are also relevant for the choice of control variables in cross-section (CS) models. We conclude with some general thoughts on how a focus on using the SCM as a tool for proving identification results can help TSCS and CS researchers do better work.
Gerring J, Glynn AN. Strategies of Research Design with Confounding: A Graphical Description. Submitted.

Research design is of paramount importance when attempting to overcome confounding. In this paper, we propose a unified graphical approach for the consideration of cross-sectional research designs. Specifically, we argue that at least five distinct strategies may be discerned for coping with the presence of a common-cause confounder: (1) blocking backdoor paths, (2) mechanisms, (3) instrumental variables, (4) alternate outcomes, and (5) causal heterogeneity. All of these strategies enlist a facilitating variable, whose role defines the corresponding research design. This resulting framework builds on the foundational work of Pearl (2000, 2009) but incorporates additional research designs into the graphical framework, providing a more comprehensive typology of designs for causal inference.

Glynn AN. Does Oil Cause Civil War Because It Causes State Weakness? 2009.

Conflict scholars have devoted considerable attention to the natural resource curse, and specifically to connections between natural resources, state weakness, and civil war. Many have posited a state weakness mechanism: that significant oil production causes state weakness, and state weakness consequently increases the likelihood of civil war onset. Using standard measures, this paper demonstrates that the state weakness mechanism does not exist in the short or medium term. The methods developed in this paper show that in only two cases is there the possibility of a medium-term effect, and the state weakness mechanism is unlikely to be operative even in these two cases. Furthermore, these methods do not rely on assumptions about unmeasured confounders, so this result is robust to the consideration of other risk factors for civil war onset. The state weakness mechanism may still exist in the form of long-term effects or an effect that reinforces pre-existing war and/or state weakness. However, the null hypothesis of no long-term and/or reinforcing effect cannot be rejected without the use of additional assumptions.

Glynn A, Wakefield J, Handcock M, Richardson T. Alleviating Ecological Bias in Voter Turnout Models (and other Generalized Linear Models) with Optimal Subsample Design. 2009.
In this paper, we illustrate that combining ecological data with subsample data in situations in which a generalized linear model (GLM) is appropriate provides two main benefits. First, by including the individual level subsample data, the biases associated with ecological inference in GLMs can be eliminated. Second, available ecological data can be used to design optimal subsampling schemes, so as to maximize information about parameters. We present an application of this methodology to voter turnout studies showing that small, optimally chosen subsamples can be combined with ecological data to generate precise estimates relative to a simple random subsample, and we discuss possible applications in epidemiology.
Journal Article
Glynn AN, Kashin K. Front-door Difference-in-Differences Estimators. American Journal of Political Science. Forthcoming.

We develop front-door difference-in-differences estimators as an extension of front-door estimators. Under one-sided noncompliance, an exclusion restriction, and assumptions analogous to parallel trends assumptions, this extension allows identification when the front-door criterion does not hold. Even if the assumptions are relaxed, we show that the front-door and front-door difference-in-differences estimators may be combined to form bounds. Finally, we show that under one-sided noncompliance, these techniques do not require the use of control units. We illustrate these points with an application to a job training study and with an application to Florida’s early in-person voting program. For the job training study, we show that these techniques can recover an experimental benchmark. For the Florida program, we find some evidence that early in-person voting had small positive effects on turnout in 2008. This provides a counterpoint to recent claims that early voting had a negative effect on turnout in 2008.

Glynn AN, Ichino N. Increasing Inferential Leverage in the Comparative Method: Placebo Tests in Small-n Research. Sociological Methods & Research. 2016;45 (3) :598-629.
We explicitly delineate the underlying homogeneity assumption, procedural variants, and implications of the comparative method [Lijphart, 1975] and distinguish this from Mill’s method of difference [1872]. We demonstrate that additional units can provide “placebo” tests for the comparative method even if the scope of inference is limited to the two units under comparison. Moreover, such tests may be available even when these units are the most similar pair of units on the control variables with differing values of the independent variable. Small-n analyses using this method should therefore, at a minimum, clearly define the dependent, independent, and control variables so they may be measured for additional units, and specify how the control variables are weighted in defining similarity between units. When these tasks are too difficult, process tracing of a single unit may be a more appropriate method. We illustrate these points with applications to Epstein [1964] and Moore [1966].
Glynn AN, Ichino N. Using Qualitative Information to Improve Causal Inference. American Journal of Political Science. 2015;59 :1055-1071.

Using the Rosenbaum (2002; 2009) approach to observational studies, we show how qualitative information can be incorporated into quantitative analyses to improve causal inference in three ways. First, by including qualitative information on outcomes within matched sets, we can ameliorate the consequences of the difficulty of measuring those outcomes, sometimes reducing p-values. Second, additional information across matched sets enables the construction of qualitative confidence intervals on effect size. Third, qualitative information on unmeasured confounders within matched sets reduces the conservativeness of Rosenbaum-style sensitivity analysis. This approach accommodates small to medium sample sizes in a nonparametric framework, and therefore may be particularly useful for analyses of the effects of policies or institutions in a given set of units. We illustrate these methods by examining the effect of using plurality rules in transitional presidential elections on opposition harassment in 1990s sub-Saharan Africa. 

Glynn AN, Sen M. Identifying Judicial Empathy: Does Having Daughters Cause Judges to Rule for Women’s Issues? American Journal of Political Science. 2015;59 (1) :37-54.

In this article, we consider whether personal relationships can affect the way that judges decide cases. To do so, we leverage the natural experiment of a child's gender to identify the effect of having daughters on the votes of judges. Using new data on the family lives of U.S. Courts of Appeals judges, we find that, conditional on the number of children a judge has, judges with daughters consistently vote in a more feminist fashion on gender issues than judges who have only sons. This result survives a number of robustness tests and appears to be driven primarily by Republican judges. More broadly, this result demonstrates that personal experiences influence how judges make decisions, and this is the first article to show that empathy may indeed be a component in how judges decide cases.

Glynn AN, Wakefield J. Alleviating Ecological Bias in Poisson Models using Optimal Subsampling: The Effects of Jim Crow on Black Illiteracy in the Robinson Data. Sociological Methodology. 2014;44 :159-172.
In many situations data are available at the group level but one wishes to estimate the individual-level association between a response and an explanatory variable. Unfortunately this endeavor is fraught with difficulties because of the ecological level of the data. The only reliable solution to such ecological inference problems is to supplement the ecological data with individual-level data. In this paper we illustrate the benefits of gathering individual-level data in the context of a Poisson modeling framework. Additionally, we derive optimal designs that allow the individual samples to be chosen so that information is maximized. The methods are illustrated using Robinson's classic data on illiteracy rates. We show that the optimal design produces accurate inference with respect to estimation of relative risks, with ecological bias removed.
Glynn AN. What Can We Learn with Statistical Truth Serum? Design and Analysis of the List Experiment. Public Opinion Quarterly. 2013;77 :159-172.
Due to the inherent sensitivity of many survey questions, a number of researchers have adopted an indirect questioning technique known as the list experiment (or the item count technique) in order to minimize bias due to dishonest or evasive responses. However, standard practice with the list experiment requires a large sample size, is not readily adaptable to regression or multivariate modeling, and provides only limited diagnostics. This paper addresses all three of these issues. First, the paper presents design principles for the standard list experiment (and the double list experiment) to minimize bias and reduce variance as well as providing sample size formulas for the planning of studies. Additionally, this paper investigates the properties of a number of estimators and introduces an easy-to-use piecewise estimator that reduces necessary sample sizes in many cases. Second, this paper proves that standard-procedure list experiment data can be used to estimate the probability that an individual holds the socially undesirable opinion/behavior. This allows multivariate modeling. Third, this paper demonstrates that some violations of the behavioral assumptions implicit in the technique can be diagnosed with the list experiment data. The techniques in this paper are illustrated with examples from American politics.
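For orientation, the standard list experiment analysis that the abstract takes as its starting point can be sketched as a difference in mean item counts between the treatment list (which includes the sensitive item) and the control list. This is only the conventional difference-in-means estimator, not the piecewise estimator the paper introduces; the function name and the example counts are hypothetical.

```python
import numpy as np


def list_experiment_diff_in_means(control_counts, treatment_counts):
    """Difference-in-means estimate of the sensitive item's prevalence.

    control_counts:   item counts reported by respondents shown the baseline list
    treatment_counts: item counts reported by respondents shown the baseline list
                      plus the sensitive item
    Returns (estimate, standard_error), using the usual two-sample
    unequal-variance standard error.
    """
    c = np.asarray(control_counts, dtype=float)
    t = np.asarray(treatment_counts, dtype=float)
    estimate = t.mean() - c.mean()
    std_error = np.sqrt(t.var(ddof=1) / t.size + c.var(ddof=1) / c.size)
    return float(estimate), float(std_error)


# Hypothetical toy data: four respondents per arm.
estimate, std_error = list_experiment_diff_in_means([1, 2, 2, 3], [2, 2, 3, 3])
```

With so few respondents the standard error is as large as the estimate itself, which is the kind of precision problem that motivates the design principles and sample size formulas described in the abstract.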
Glynn AN. The Product and Difference Fallacies for Indirect Effects. American Journal of Political Science. 2012;56 (1) :257-269.

Political scientists often cite the importance of mechanism-specific causal knowledge, both for its intrinsic scientific value and as a necessity for informed policy. This article explains why two common inferential heuristics for mechanism-specific (i.e., indirect) effects can provide misleading answers, such as sign reversals and false null results, even when linear regressions provide unbiased estimates of constituent effects. Additionally, this article demonstrates that the inferential difficulties associated with indirect effects can be ameliorated with the use of stratification, interaction terms, and the restriction of inference to subpopulations (e.g., the indirect effect on the treated). However, indirect effects are inherently not identifiable, even when randomized experiments are possible. The methodological discussion is illustrated using a study on the indirect effect of Islamic religious tradition on democracy scores (due to the subordination of women).

Glynn AN, Quinn KM. Why Process Matters for Causal Inference. Political Analysis. 2011;19 (3) :273-286.

Our goal in this paper is to provide a formal explanation for how within-unit causal process information (i.e., data on posttreatment variables and partial information on posttreatment counterfactuals) can help to inform causal inferences relating to total effects, that is, the overall effect of an explanatory variable on an outcome variable. The basic idea is that, in many applications, researchers may be able to make more plausible causal assumptions conditional on the value of a posttreatment variable than they would be able to do unconditionally. As data become available on a posttreatment variable, these conditional causal assumptions become active and information about the effect of interest is gained. This approach is most beneficial in situations where it is implausible to assume that treatment assignment is conditionally ignorable. We illustrate the approach with an example of estimating the effect of election day registration on turnout.

Glynn AN, Richardson TS, Handcock MS. Resolving Contested Elections: The Limited Power of Post-Vote Vote-Choice Data. Journal of the American Statistical Association. 2010;105 :84-91.
In close elections, the losing side has an incentive to obtain evidence that the election result is incorrect. Sometimes this evidence comes in the form of court testimony from a sample of invalid voters, and this testimony is used to adjust vote totals (Belcher v. Mayor of Ann Arbor 1978; Borders v. King County 2005). However, while courts may be reluctant to make explicit findings about out-of-sample data (e.g., invalid voters that do not testify), when samples are used to adjust vote totals, the court is making such findings implicitly. In this paper, we show that the practice of adjusting vote totals on the basis of potentially unrepresentative samples can lead to incorrectly voided election results. More generally, we demonstrate that even when frame error and measurement error are minimal, random samples of post-vote vote-choice data can have limited power to detect incorrect election results without high response rates, precinct level polarization, or the acceptance of large Type I error rates. Therefore, in U.S. election disputes, even high-quality post-vote vote-choice data may be insufficient to resolve contested elections without the use of modeling assumptions (whether or not these assumptions are acknowledged).
Glynn AN, Wakefield J. Ecological Inference in the Social Sciences. Statistical Methodology. 2010;7 (3) :307-322.
Ecological inference is a problem of partial identification, and therefore precise conclusions are rarely possible without the collection of individual level (identifying) data. Without such data, sensitivity analyses provide the only recourse. In this paper we review and critique recent approaches to ecological inference in the social sciences, and describe in detail hierarchical models, which allow both sensitivity analysis and the incorporation of individual level data into an ecological analysis. A crucial element of a sensitivity analysis in such models is prior specification, and we detail how this may be carried out. Furthermore, we demonstrate how the inclusion of a small amount of individual level data from a small number of ecological areas can dramatically improve the properties of such estimates.
Glynn AN, Quinn KM. An Introduction to the Augmented Inverse Propensity Weighted Estimator. Political Analysis. 2010;18 (1) :36-56.
In this paper we discuss an estimator for average treatment effects known as the augmented inverse propensity weighted (AIPW) estimator. This estimator has attractive theoretical properties and only requires practitioners to do two things they are already comfortable with: (1) specify a binary regression model for the propensity score, and (2) specify a regression model for the outcome variable. After explaining the AIPW estimator, we conduct a Monte Carlo experiment that compares the performance of the AIPW estimator to three common competitors: a regression estimator, an inverse propensity weighted (IPW) estimator, and a propensity score matching estimator. The Monte Carlo results show that the AIPW estimator is dramatically superior to the other estimators in many situations and at least as good as the other estimators across a wide range of data generating processes.
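A minimal sketch of the AIPW formula may help make the two ingredients concrete: the estimate averages the outcome-model contrast plus an inverse-propensity-weighted residual correction. The simulation below is an assumption made for illustration and is not the paper's Monte Carlo design; in particular it plugs in the true propensity score rather than fitting a binary regression, and the names `aipw_ate` and `simulate_and_estimate` are hypothetical.

```python
import numpy as np


def aipw_ate(y, t, e_hat, mu1_hat, mu0_hat):
    """AIPW estimate of the average treatment effect.

    y:       observed outcomes
    t:       binary treatment indicators (0/1)
    e_hat:   estimated propensity scores P(T = 1 | X)
    mu1_hat: outcome-model predictions under treatment
    mu0_hat: outcome-model predictions under control
    """
    y, t, e_hat, mu1_hat, mu0_hat = (
        np.asarray(a, dtype=float) for a in (y, t, e_hat, mu1_hat, mu0_hat)
    )
    regression_part = mu1_hat - mu0_hat
    # Weighted residuals correct the regression estimate where the outcome model errs.
    correction = (t * (y - mu1_hat) / e_hat
                  - (1.0 - t) * (y - mu0_hat) / (1.0 - e_hat))
    return float(np.mean(regression_part + correction))


def simulate_and_estimate(seed=0, n=20_000):
    """Hypothetical data-generating process with a true treatment effect of 2.0."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    e_true = 1.0 / (1.0 + np.exp(-x))          # assumption: propensity score is known
    t = rng.binomial(1, e_true).astype(float)
    y = 1.0 + 2.0 * t + x + rng.normal(size=n)
    X = np.column_stack([np.ones(n), x])
    # Separate linear outcome models fitted within each treatment arm.
    beta1 = np.linalg.lstsq(X[t == 1], y[t == 1], rcond=None)[0]
    beta0 = np.linalg.lstsq(X[t == 0], y[t == 0], rcond=None)[0]
    return aipw_ate(y, t, e_true, X @ beta1, X @ beta0)
```

In applied work the propensity score would be estimated, for example with a logistic regression, and both working models may be misspecified to some degree; the sketch only shows how the pieces combine.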
Glynn A, Wakefield J, Handcock M, Richardson T. Alleviating Linear Ecological Bias and Optimal Design with Subsample Data. Journal of the Royal Statistical Society: Series A. 2008;171 (1) :179-202.
In this paper, we illustrate that combining ecological data with subsample data in situations in which a linear model is appropriate provides two main benefits. First, by including the individual level subsample data, the biases associated with linear ecological inference can be eliminated. Second, we can use readily available ecological data to design optimal subsampling schemes, so as to maximize information about parameters. We present an application of this methodology to the classic problem of estimating the effect of a college degree on wages, showing that small, optimally chosen subsamples can be combined with ecological data to generate precise estimates relative to a simple random subsample.