# Publications

Some empirical results are more likely to be published than others. Such selective publication leads to biased estimators and distorted inference. This paper proposes two approaches for identifying the conditional probability of publication as a function of a study's results, the first based on systematic replication studies and the second based on meta-studies. For known conditional publication probabilities, we propose median-unbiased estimators and associated confidence sets that correct for selective publication. We apply our methods to recent large-scale replication studies in experimental economics and psychology, and to meta-studies of the effects of minimum wages and de-worming programs.
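A minimal sketch of the correction idea (the step-function publication rule, the cutoff 1.96, and the probability 0.1 below are illustrative assumptions, not the paper's estimated specification): a normally distributed estimate Z ~ N(theta, 1) is always published when significant and published with some smaller probability otherwise; the median-unbiased estimator then inverts the conditional CDF of published estimates.

```python
import math

def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def pub_prob(z, c=1.96, p0=0.1):
    """Assumed publication probability: significant results always published,
    insignificant ones with probability p0 (illustrative values)."""
    return 1.0 if abs(z) > c else p0

def cond_cdf(z, theta, lo=-10.0, hi=10.0, m=4000):
    """CDF of a published estimate Z given theta, by numerical integration
    of pub_prob(u) * phi(u - theta)."""
    grid = [lo + (hi - lo) * i / m for i in range(m + 1)]
    w = [pub_prob(u) * phi(u - theta) for u in grid]
    upto = sum(wi for u, wi in zip(grid, w) if u <= z)
    return upto / sum(w)

def median_unbiased(z, tol=1e-4):
    """Solve cond_cdf(z, theta) = 1/2 for theta by bisection;
    the conditional CDF is decreasing in theta."""
    a, b = z - 5.0, z + 5.0
    while b - a > tol:
        mid = 0.5 * (a + b)
        if cond_cdf(z, mid) > 0.5:
            a = mid
        else:
            b = mid
    return 0.5 * (a + b)

# a just-significant published estimate is corrected downward
theta_hat = median_unbiased(2.2)
```

Because insignificant results are suppressed, published significant estimates overstate the true effect, so the corrected estimate lies below the raw z = 2.2.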

Many applied settings in empirical economics involve simultaneous estimation of a large number of parameters. In particular, applied economists are often interested in estimating the effects of many-valued treatments (like teacher effects or location effects), treatment effects for many groups, and prediction models with many regressors. In these settings, machine learning methods that combine regularized estimation and data-driven choices of regularization parameters are useful to avoid over-fitting. In this article, we analyze the performance of a class of such methods that includes ridge, lasso, and pretest, in contexts that require simultaneous estimation of many parameters. Our analysis aims to provide guidance to applied researchers on (i) the choice between regularized estimators in practice and (ii) data-driven selection of regularization parameters. To address (i), we characterize the risk (mean squared error) of regularized estimators and derive their relative performance as a function of simple features of the data generating process. To address (ii), we show that data-driven choices of regularization parameters, based on Stein's unbiased risk estimate or on cross-validation, yield estimators with risk uniformly close to the risk attained under the optimal (infeasible) choice of regularization parameters. We use data from recent examples in the empirical economics literature to illustrate the practical applicability of our results.
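As a stylized illustration of the SURE-based choice of regularization parameter (the many-normal-means setup and all numerical values below are assumptions for the sketch, not taken from the paper): with estimates X_i ~ N(theta_i, 1) and ridge shrinkage theta_hat_i = X_i / (1 + lambda), Stein's unbiased risk estimate can be minimized over lambda.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000
theta = rng.normal(0.0, 2.0, size=n)     # unknown means (illustrative DGP)
x = theta + rng.normal(size=n)           # noisy estimates with unit variance

def sure_ridge(lam, x, sigma2=1.0):
    """Stein's unbiased risk estimate for the ridge estimator c * x
    with c = 1 / (1 + lam):  (c-1)^2 * sum(x^2) + 2 * sigma2 * n * c - n * sigma2."""
    c = 1.0 / (1.0 + lam)
    n = len(x)
    return (c - 1.0) ** 2 * np.sum(x ** 2) + 2.0 * sigma2 * n * c - n * sigma2

# data-driven regularization: minimize SURE over a grid of lambda values
grid = np.linspace(0.0, 5.0, 501)
risks = np.array([sure_ridge(l, x) for l in grid])
lam_hat = grid[np.argmin(risks)]
theta_hat = x / (1.0 + lam_hat)
```

With a prior variance of 4 for the means, the (infeasible) oracle shrinkage is 4/5, i.e. lambda of about 0.25, and the SURE minimizer lands close to it.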

We propose to use economic theories to construct estimators that perform well when the theories' empirical implications are approximately correct, but are robust even if the theories are completely wrong. We describe a general construction of such estimators using the empirical Bayes paradigm. We implement this construction in various settings, including labor demand and wage inequality, asset pricing, economic decision theory, and structural discrete choice models. We provide theoretical characterizations of the behavior of the proposed estimators, and evaluate them using Monte Carlo simulations. Our approach is an alternative to the use of theory as something to be tested or to be imposed on estimates. Our approach complements uses of theory for identification and extrapolation.

The incidence of tax and other policy changes depends on their impact on equilibrium wages. In a standard model of labor supply, the impact of a wage change on a worker's welfare equals current labor supply times the induced wage change. Worker heterogeneity implies that wage changes vary across workers. In this context, identifying welfare effects requires identifying the conditional causal effect of policy changes on wages given baseline labor supply and wages. This paper characterizes identification of such conditional causal effects for general vectors of endogenous outcomes. Even with exogenous policy variation, conditional causal effects are only partially identified for outcome vectors of dimension larger than one. We provide assumptions restricting heterogeneity of effects just enough for point-identification and propose corresponding estimators. This paper then applies the proposed methods to analyze the distributional welfare impact (i) of the expansion of the Earned Income Tax Credit (EITC) in the 1990s, using variation in state supplements to identify causal effects, and (ii) of historical changes of the wage distribution in the US in the 1990s. For the EITC, we find negative welfare effects of depressed wages as a consequence of increased labor supply, in particular for individuals earning around $20,000 per year. Looking at historical changes, we find modest welfare gains rising linearly with earnings.

How should one use (quasi-)experimental evidence when choosing policies such as top tax rates, health insurance coinsurance rates, unemployment benefit levels, class sizes in schools, etc.? This paper provides an answer that combines insights from (i) optimal policy theory as developed in the field of public finance, and (ii) machine learning using Gaussian process priors. We propose to choose policies which maximize posterior expected social welfare. We provide explicit formulas for posterior expected social welfare and optimal policies in a wide class of policy problems.

The proposed methods are applied to the choice of coinsurance rates in health insurance, using the data of the RAND health insurance experiment. The key tradeoff in this setting is between redistribution toward the sick and insurance revenues. The key empirical relationship the policymaker needs to learn about is the response of health care expenditures to coinsurance rates. Holding everything constant except the estimation method, we obtain much smaller estimates of the optimal coinsurance rate (18% vs. 50%) than those obtained using a conventional ``sufficient statistic'' approach.
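A minimal sketch of the Gaussian-process step (the simulated data, kernel parameters, and the stylized welfare objective below are all illustrative assumptions, not the RAND application itself): fit a GP to the expenditure response, then maximize a posterior expected welfare criterion over candidate policies.

```python
import numpy as np

rng = np.random.default_rng(1)
# hypothetical experiment: coinsurance rates t in [0, 1] and observed spending y
t_obs = rng.uniform(0.0, 1.0, 40)
y_obs = 1.0 - 0.6 * t_obs + rng.normal(0.0, 0.1, 40)

def k(a, b, ell=0.2, s2=1.0):
    """Squared-exponential covariance kernel (illustrative hyperparameters)."""
    return s2 * np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)

sigma2 = 0.1 ** 2                                    # assumed noise variance
K = k(t_obs, t_obs) + sigma2 * np.eye(len(t_obs))
alpha = np.linalg.solve(K, y_obs)

t_grid = np.linspace(0.0, 1.0, 101)                  # candidate policies
m_post = k(t_grid, t_obs) @ alpha                    # GP posterior mean of spending

# stylized objective: copay revenue t * spending minus a convex risk-bearing cost
welfare = t_grid * m_post - 0.2 * t_grid ** 2
t_star = t_grid[np.argmax(welfare)]
```

The interesting feature is that the optimal policy is a functional of the full posterior over the expenditure response, not of a single summary statistic.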

We study the effect of interview modes on estimates of economic inequality based on survey data. We exploit variation in interview modes in the Austrian EU-SILC panel, where between 2007 and 2008 the interview mode was switched from personal interviews to telephone interviews for some but not all participants. We combine methods from the program evaluation literature with methods from the distributional decomposition literature to obtain causal estimates of the effect of interview mode on estimated inequality. We find that the interview mode has a large effect on estimated inequality, with telephone interviews leading to a larger downward bias. The effect of the mode is much smaller for robust inequality measures such as interquantile ranges, as these are not sensitive to the tails of the distribution. The magnitudes of the effects we find are of a similar order as the differences in many international and intertemporal comparisons of inequality.

When are asymptotic approximations using the delta-method uniformly valid? We provide sufficient conditions as well as closely related necessary conditions for uniform negligibility of the remainder of such approximations. These conditions are easily verified and make it possible to identify settings and parameter regions where pointwise asymptotic approximations perform poorly. Our framework allows for a unified and transparent discussion of uniformity issues in various sub-fields of econometrics. Our conditions involve uniform bounds on the remainder of a first-order approximation for the function of interest.
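In generic notation (ours, not necessarily the paper's), the object of interest is the remainder $r_n$ of the first-order expansion

$$
\sqrt{n}\,\bigl(\phi(\hat\theta) - \phi(\theta)\bigr)
= \phi'(\theta)\,\sqrt{n}\,(\hat\theta - \theta) + \sqrt{n}\, r_n(\hat\theta, \theta),
$$

and uniform validity requires that $\sup_{\theta \in \Theta} P_\theta\bigl(\sqrt{n}\,|r_n| > \varepsilon\bigr) \to 0$ for every $\varepsilon > 0$, rather than convergence at each fixed $\theta$ alone.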

This paper discusses experimental design for the case that (i) we are given a distribution of covariates from a pre-selected random sample, and (ii) we are interested in the average treatment effect (ATE) of some binary treatment. We show that in general there is a unique optimal non-random treatment assignment if there are continuous covariates. We argue that experimenters should choose this assignment. The optimal assignment minimizes the risk (e.g., expected squared error) of treatment effect estimators. We provide explicit expressions for the risk, and discuss algorithms which minimize it. The objective of controlled trials is to have treatment groups which are similar a priori (balanced), so we can ``compare apples with apples.'' The expressions for risk derived in this paper provide an operationalization of the notion of balance. The intuition for our non-randomization result is similar to the reasons for not using randomized estimators - adding noise can never decrease risk. The formal setup we consider is decision-theoretic and nonparametric. In simulations and an application to project STAR we find that optimal designs have mean squared errors up to 20% lower than randomized designs and up to 14% lower than stratified designs.
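The flavor of the optimization can be sketched in a few lines (toy data and a toy risk criterion, not the paper's exact risk expressions): with one covariate and a linear outcome model, the risk of the difference-in-means estimator is driven by covariate imbalance, so for a small sample one can search all balanced assignments for the minimizer.

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)
n = 12                                   # small enough for exhaustive search
x = rng.normal(size=n)                   # one continuous covariate per unit

def imbalance(treated, x):
    """Squared difference in covariate means between treatment and control --
    a toy stand-in for the risk of the difference-in-means estimator."""
    mask = np.zeros(len(x), dtype=bool)
    mask[list(treated)] = True
    return (x[mask].mean() - x[~mask].mean()) ** 2

# the unique (up to ties) optimal non-random assignment: best balanced split
assignments = itertools.combinations(range(n), n // 2)
best = min(assignments, key=lambda s: imbalance(s, x))
```

By construction, any randomized balanced split is weakly worse than `best`, which is the "adding noise can never decrease risk" point in miniature.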

Many methodological debates in microeconometrics are driven by the tension between ``what we can get'' (identification) and ``what we want'' (parameters of interest). This paper proposes to consider models of policy choice which allow for a joint formal discussion of both issues. We consider a non-standard empirical object of interest, the ranking of counterfactual policies. This paper connects the literatures on partial identification and on ambiguity, where partially identified policy rankings are formally analogous to choice under Knightian uncertainty. Partial identification of conditional average treatment effects maps into a partial ordering of treatment assignment policies in terms of social welfare. This paper gives geometric characterizations of the identified partial ordering of policies, and derives conditions for restricted policy sets to be completely ordered or completely unordered. Such conditions map sets of feasible policies into requirements on the data that make it possible to rank these policies. Generalizing to non-linear objective functions, it is then shown that policy effects are partially identified if and only if the policy objective is a robust statistic in the sense of having a bounded influence function. Furthermore, rankings derived from a linearized version of the objective function give correct rankings in a neighborhood of a status quo policy, and are easy to calculate in practice. The theoretical results of this paper are applied to data from the ``project STAR'' experiment, in which children were randomly assigned to classes of different sizes. This application illustrates the dependence of identifiability of the policy ranking on identifying assumptions, the feasible policy set, and distributional preferences.

This paper discusses nonparametric identification in a model of sorting in which location choices depend on the location choices of other agents as well as prices and exogenous location characteristics. In this model, demand slopes and hence preferences are not identifiable without further restrictions because of the absence of independent variation of endogenous composition and exogenous location characteristics. Several solutions of this problem are presented and applied to data on neighborhoods in US cities. These solutions use exclusion restrictions, based on either subgroup demand shifters, the spatial structure of externalities, or the dynamics of prices and composition in response to an amenity shock. The empirical results consistently suggest the presence of strong social externalities, that is, a dependence of location choices on neighborhood composition.

This paper proposes an estimator and develops an inference procedure for the number of roots of functions which are nonparametrically identified by conditional moment restrictions. It is shown that a smoothed plug-in estimator of the number of roots is super-consistent under i.i.d. asymptotics, but asymptotically normal under non-standard asymptotics. The smoothed estimator is furthermore asymptotically efficient relative to a simple plug-in estimator. The procedure proposed is used to construct confidence sets for the number of equilibria of static games of incomplete information and of stochastic difference equations. In an application to panel data on neighborhood composition in the United States, no evidence of multiple equilibria is found.
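The smoothed plug-in idea, stripped to its essentials (the grid, bandwidth, and test function below are all illustrative choices): the number of simple roots of a function f on an interval equals the integral of delta(f(x)) |f'(x)| dx, and replacing the Dirac delta by a kernel K_h gives a smooth estimate.

```python
import numpy as np

def smoothed_root_count(f, a, b, h=0.05, m=10001):
    """Smoothed plug-in root count: integrate K_h(f(x)) * |f'(x)| over [a, b],
    with a Gaussian kernel K_h standing in for the Dirac delta."""
    x = np.linspace(a, b, m)
    y = f(x)
    dy = np.gradient(y, x)                         # numerical derivative
    kern = np.exp(-0.5 * (y / h) ** 2) / (h * np.sqrt(2.0 * np.pi))
    g = kern * np.abs(dy)
    # trapezoidal rule for the integral
    return float(np.sum((g[1:] + g[:-1]) * 0.5 * np.diff(x)))

# sin has three roots on [-1, 7]: 0, pi, and 2*pi
est = smoothed_root_count(np.sin, -1.0, 7.0)
```

Each monotone crossing contributes approximately one (by the change of variables u = f(x)), so the estimate is close to the integer root count; in the paper's setting f is itself a nonparametric estimate, which is where the non-standard asymptotics come from.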

This paper discusses identification in continuous triangular systems without restrictions on heterogeneity or functional form. In particular, we do not assume separability of structural functions, restrictions on the dimensionality of unobservables, or monotonicity in unobservables. We do maintain monotonicity of the first stage relationship in the instrument and consider the case of real-valued treatment. We show that under these conditions alone, and given rich enough support of the data, we can achieve point identification of potential outcome distributions, and in particular of the average structural function and quantile structural functions. If the support of the continuous instrument is not large enough, potential outcome distributions are partially identified. If the instrument is discrete, identification fails completely. If treatment is multidimensional, additional exclusion restrictions make identification possible. The setup discussed in this paper covers important cases not covered by existing approaches such as conditional moment restrictions (cf. Newey and Powell 2003) and control variables (cf. Imbens and Newey 2009). It covers, in particular, random coefficient models, as well as models arising as the reduced form of a system of structural equations.

Changes in family structures, such as the composition of households with respect to size, age and gender, can have an impact on poverty rates and the income distribution more generally. We analyze the impact of changing family structures on the income distribution among adult Costa Rican women between 1993 and 2009, using decomposition methods. There was a general increase in the share of family structures associated with lower incomes (singles with dependents) until 2001. After 2001, this trend reversed for women at the upper end of the income distribution, while it continued for women at the lower end. Correspondingly, we find a general negative effect of changing family structures on incomes of adult women until 2001, and an inequality-increasing effect after 2001. The change in trends might be due to a law that came into force in 2001 and mandated DNA tests for presumptive fathers unwilling to recognize their children.

While the empirical literature on intergenerational mobility is politically controversial, it is not obvious what implications intergenerational status transmission has for optimal policy. Addressing this question, this paper studies the local comparative statics of optimal income taxes with respect to parameters of intergenerational transmission. The model used extends standard models of optimal linear income taxation, adding a parental preference for child earnings capability, an educational investment opportunity, and credit constraints. We find that the optimal degree of redistribution, everything else equal, is increasing in the curvature of intergenerational transmission. This is because the non-linearity in the household budget set affects the curvature of household indirect utility as a function of virtual income. In contrast, the implications of stronger transmission for redistribution are ambiguous. The strength of transmission matters, however, for the optimal government budget deficit.