Abstract
There are two general views in causal analysis of experimental data: the super population view that the units are an independent sample from some hypothetical infinite population, and the finite population view that the potential outcomes of the experimental units are fixed and the randomness comes solely from the treatment assignment. These two views differ conceptually and mathematically, resulting in different sampling variances of the usual difference-in-means estimator of the average causal effect. Practically, however, these two views result in identical variance estimators. By recalling a variance decomposition and exploiting a completeness-type argument, we establish a connection between these two views in completely randomized experiments. This alternative formulation could serve as a template for bridging finite and super population causal inference in other scenarios.
1 Introduction
Neyman [1, 2] defined causal effects in terms of potential outcomes, and proposed an inferential framework viewing all potential outcomes of a finite population as fixed and the treatment assignment as the only source of randomness. This finite population view allows for easy interpretation free of any hypothetical data generating process of the outcomes, and is used in a variety of contexts, e.g., [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22]. This approach is considered desirable because, in particular, it does not assume that the data are a representative sample of some larger (usually infinite) population.
Alternative approaches, also using the potential outcomes framework, assume that the potential outcomes are independent and identical draws from a hypothetical infinite population. Mathematical derivations under this approach are generally simpler, but the approach itself can be criticized because of this typically untenable sampling assumption. Furthermore, this approach appears to ignore the treatment assignment mechanism.
That being said, it is well known that the final variance formulae from either approach tend to be quite similar. For example, while the variance of the difference-in-means estimator for a treatment-control experiment under an infinite population model is different from the one under Neyman’s [1] finite population formulation, this difference is easily represented as a function of the variance of the individual causal effects. Furthermore, this difference term is unidentifiable and is often assumed away under a constant causal effect model ([1, 2, 23, 24, 25, 26]), or by appeals to the final estimators being “conservative.”
For the difference in means, the infinite population variance estimate gives a conservative (overly large) estimate of the finite population variance. As deriving infinite population variance expressions, relative to finite population variance expressions, tends to be more mathematically straightforward, we might naturally wonder if we could use infinite population expressions as conservative forms of finite population expressions more generally. In this work we show that we can in fact adopt an infinite population model as an assumption of convenience, and derive formulae from this perspective. We can then regard the resulting formulae as focused on the treatment assignment mechanism and not on a hypothetical sampling mechanism, i.e., we show that variance derivations under the infinite population framework can be used as conservative estimators in a finite context.
Mathematically, this result comes from a variance decomposition and a completeness-style argument characterizing the connection and the difference between these two views. The variance decomposition we use has previously appeared in Imai [7], Imbens and Rubin [18], and Balzer et al. [27]. The completeness-style argument, which we believe is novel in this domain, then sharpens the variance decomposition by moving from an expression on an overall average relationship to one that holds for any specific sample.
Our overall goal is simple: we wish to demonstrate that if one uses variance formulae derived from assuming an infinite population sampling model, then the resulting inference will be correct with regard to the analogous sample-specific treatment effects (although it could be conservative in that the standard errors may be overly large), regardless of the existence of any sampling mechanism.
2 Super population, finite population, and samples
Assume that random variables
The individual causal effect for unit
At the super population level, the average potential outcomes are
The population variances of the potential outcomes and individual causal effect are
At the finite population level, i.e., for a fixed sample
The corresponding finite population variances of the potential outcomes and individual causal effects are
where, following the tradition of survey sampling [28], we use the divisor
Regardless, we have two parameters – the population average treatment effect
Our primary statistics are the averages of the observed outcomes and the difference-in-means estimator:
We also observe the sample variances of the outcomes under treatment and control using
We do not have the sample analogue of
We summarize the infinite population, finite population and sample quantities in Table 1.
 | Mean (Treatment) | Mean (Control) | Mean (Effect) | Variance (Treatment) | Variance (Control) | Variance (Effect)
---|---|---|---|---|---|---
Super population | | | | | |
Finite population | | | | | |
Sample | | | | | |
3 Deriving complete randomization results with an independent sampling model
The three levels of quantities in Table 1 are connected via independent sampling and complete randomization. Neyman [1], without reference to any infinite population and by using the assignment mechanism as the only source of randomness, represented the assignment mechanism via an urn model, and found
He then observed that the final term was unidentifiable but nonnegative, and thus if we dropped it we would obtain an upper bound of the estimator’s uncertainty.
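Neyman's result can be checked numerically on a small example. The following is a minimal sketch, not part of the original derivation: it assumes the standard form of the result, namely that the randomization variance of the difference-in-means estimator equals the finite population variance of the treated potential outcomes divided by the number treated, plus the analogous control term, minus the finite population variance of the individual effects divided by the total number of units. The potential outcomes and all variable names below are hypothetical. The code enumerates every complete randomization of six units and compares the exact variance with the formula and with the upper bound obtained by dropping the final term.

```python
from itertools import combinations
from statistics import mean, variance  # variance() uses the n - 1 divisor

# Hypothetical fixed potential outcomes for a finite population of n = 6 units.
y1 = [3.0, 5.0, 4.0, 8.0, 6.0, 7.0]  # outcomes under treatment
y0 = [2.0, 1.0, 4.0, 5.0, 3.0, 9.0]  # outcomes under control
n, n1 = len(y1), 3
n0 = n - n1

# Finite population average effect and individual effects.
tau_fs = mean(y1) - mean(y0)
effects = [a - b for a, b in zip(y1, y0)]

estimates = []         # difference in means under each assignment
neyman_estimates = []  # plug-in variance estimate under each assignment
for treated in combinations(range(n), n1):
    treated_set = set(treated)
    obs1 = [y1[i] for i in range(n) if i in treated_set]
    obs0 = [y0[i] for i in range(n) if i not in treated_set]
    estimates.append(mean(obs1) - mean(obs0))
    neyman_estimates.append(variance(obs1) / n1 + variance(obs0) / n0)

# Exact randomization variance: all assignments are equally likely.
exact_var = mean([(e - tau_fs) ** 2 for e in estimates])

# Finite population formula, and the bound that drops the unidentifiable term.
formula_var = variance(y1) / n1 + variance(y0) / n0 - variance(effects) / n
upper_bound = variance(y1) / n1 + variance(y0) / n0

print(exact_var, formula_var, upper_bound, mean(neyman_estimates))
```

In this sketch the exact variance agrees with the formula up to floating-point error, and the plug-in estimator averages to the upper bound rather than to the exact variance, which is the sense in which it is conservative.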
We next derive this result by assuming a hypothetical sampling mechanism from some assumed infinite super-population model of convenience. This alternative derivation of the above result, which can be extended to other assignment mechanisms, shows how we can interpret formulae based on super-population derivations as conservative formulae for finite-sample inference.
3.1 Sampling and randomization
To begin, note that IID sampling of
and third, the sample variances are unbiased estimates of the true variances:
Conditional on
We do not use the notation
If we do not condition on
This is the classic infinite population variance formula for the two sample difference-in-means statistic. We could use it to obtain standard errors by plugging in
3.2 Connecting the finite and infinite population inference with a variance decomposition
We will now extend the above to indirectly derive the result on the variance of
which further implies that the finite population variance of
Compare to the classic variance expression (1), which is this without the expectation. Here we have that on average our classic variance expression holds. Now, because this is true for any infinite population, as it is purely a consequence of the IID sampling mechanism and complete randomization, we can close the gap between eqs (1) and (8). Informally speaking, because eq. (8) holds as an average over many hypothetical super populations, it should also hold for any finite population at hand, and indeed it does, as we next show using a “completeness” concept from statistics [29].
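The averaged relationship can be illustrated with exact arithmetic on a toy super population. The sketch below is an illustration under stated assumptions, not the paper's derivation: the three-point super population, the sample size, and all names (`support`, `V1`, `V0`, `Vtau`, and so on) are hypothetical. It enumerates every IID sample of four units and every complete randomization with two treated units, then checks that the total variance of the difference in means splits into the average within-sample randomization variance plus the variance of the sample average effect.

```python
from fractions import Fraction
from itertools import combinations, product

# Hypothetical super population: three equally likely (Y(1), Y(0)) pairs.
support = [(1, 0), (2, 2), (4, 1)]
n, n1 = 4, 2
n0 = n - n1

def avg(values):
    """Exact average of an iterable of numbers."""
    values = list(values)
    return sum(values, Fraction(0)) / len(values)

def pop_var(values):
    """Exact variance with divisor equal to the number of values."""
    values = list(values)
    m = avg(values)
    return avg((v - m) ** 2 for v in values)

# Super population quantities.
tau = avg(a - b for a, b in support)
V1 = pop_var(a for a, _ in support)
V0 = pop_var(b for _, b in support)
Vtau = pop_var(a - b for a, b in support)

assignments = list(combinations(range(n), n1))
within_vars, sample_effects, sq_errors = [], [], []
for sample in product(support, repeat=n):      # every IID sample, equally likely
    tau_s = avg(a - b for a, b in sample)      # sample average effect
    ests = []
    for treated in assignments:                # every complete randomization
        t = avg(sample[i][0] for i in treated)
        c = avg(sample[i][1] for i in range(n) if i not in treated)
        ests.append(t - c)
    within_vars.append(avg((e - tau_s) ** 2 for e in ests))
    sample_effects.append(tau_s)
    sq_errors.extend((e - tau) ** 2 for e in ests)

total_var = avg(sq_errors)         # variance over sampling and assignment
mean_within = avg(within_vars)     # average finite population variance
between = pop_var(sample_effects)  # variance of the sample average effect

print(total_var, mean_within, between)
```

Because the arithmetic uses `Fraction`, the identities hold exactly in this toy setting: the total variance equals `V1 / n1 + V0 / n0`, the between-sample component equals `Vtau / n`, and their difference is the average finite population variance.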
3.3 A “Completeness” argument
First, define
a function of a fixed finite sample
For any given sample
According to eq. (10), the finite population variance
for some
Because
Similarly, because
Use the above to replace our cross terms of
where
Thus,
Because eq. (11) holds for any population regardless of its values of
4 Discussion
Equation (8) relies on the assumption that the hypothetical infinite population exists, but eq. (1) does not. However, the completeness-style argument allowed us to make our sampling assumption only for convenience in order to prove eq. (1) by, in effect, dropping the expectation on both sides of eq. (8). A similar argument exists in the classical statistics literature; see Efron and Morris [30] for the empirical Bayes view of Stein’s estimator. While the final result is, of course, not new, we offer it because it gives an alternative derivation that does not rely on asymptotics such as a growing super population or a focus on the properties of the treatment assignment mechanism.
Using Freedman’s [8] results, Aronow et al. [12] considered a super population with
This decomposition approach also holds for other types of experiments. First, for a stratified experiment, each stratum is essentially a completely randomized experiment. Apply the result to each stratum, and then average over all strata to obtain results for a stratified experiment. Second, because a matched-pair experiment is a special case of a stratified experiment with two units within each stratum, we can derive the Neyman-type variance (cf. [7, 18]) directly from that of a stratified experiment. Third, a cluster-randomized experiment is a completely randomized experiment on the clusters. If the causal parameters can be expressed as cluster-level outcomes, then the result can be straightforwardly applied (cf. [11, 17]). Fourth, for general experimental designs, the variance decomposition in eq. (6) still holds, and therefore we can modify the derivation of the finite population variance according to different forms of eqs (4) and (5).
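The stratified case can be sketched in the same way. The example below is a hedged illustration rather than a result taken from the text: it assumes the stratified estimator weights each stratum's difference in means by its share of units, and that strata are randomized independently, so the variance is the sum of squared weights times the within-stratum finite population variances. The strata, the potential outcomes, and the helper names `neyman_variance` and `stratum_estimates` are all hypothetical.

```python
from itertools import combinations, product
from statistics import mean, variance  # variance() uses the n - 1 divisor

# Hypothetical strata with fixed potential outcomes and number treated.
strata = [
    {"y1": [3.0, 5.0, 4.0, 8.0], "y0": [2.0, 1.0, 4.0, 5.0], "n1": 2},
    {"y1": [6.0, 7.0, 9.0, 5.0, 8.0], "y0": [3.0, 9.0, 6.0, 4.0, 7.0], "n1": 2},
]
n_total = sum(len(s["y1"]) for s in strata)

def neyman_variance(y1, y0, n1):
    """Finite population variance of the difference in means in one stratum."""
    n = len(y1)
    n0 = n - n1
    effects = [a - b for a, b in zip(y1, y0)]
    return variance(y1) / n1 + variance(y0) / n0 - variance(effects) / n

def stratum_estimates(y1, y0, n1):
    """Difference in means under every complete randomization of one stratum."""
    n = len(y1)
    out = []
    for treated in combinations(range(n), n1):
        t = mean([y1[i] for i in treated])
        c = mean([y0[i] for i in range(n) if i not in treated])
        out.append(t - c)
    return out

# Assumed weights: each stratum's share of the units.
weights = [len(s["y1"]) / n_total for s in strata]

# Apply the completely randomized result within each stratum, then combine.
formula_var = sum(
    w ** 2 * neyman_variance(s["y1"], s["y0"], s["n1"])
    for w, s in zip(weights, strata)
)

# Exact check: enumerate all joint assignments across independent strata.
per_stratum = [stratum_estimates(s["y1"], s["y0"], s["n1"]) for s in strata]
combined = [
    sum(w * e for w, e in zip(weights, combo)) for combo in product(*per_stratum)
]
center = mean(combined)
exact_var = mean([(e - center) ** 2 for e in combined])

print(exact_var, formula_var)
```

Under these assumptions the enumerated variance matches the stratum-by-stratum formula up to floating-point error. A matched-pair design would need a different variance estimator, because the within-stratum sample variances are undefined with one unit per arm.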
In a completely randomized experiment, the finite population sampling variance of
Our discussion is based on the frequentists’ repeated sampling evaluations of the difference-in-means estimator for the average causal effect. In contrast, Fisher [34] proposed the randomization test against the sharp null hypothesis that
Acknowledgements
We thank Dr. Peter Aronow (the Associate Editor) and three anonymous reviewers for helpful comments.
References
1. Neyman J. On the application of probability theory to agricultural experiments. Essay on principles (with discussion). Section 9 (translated). Reprinted ed. Stat Sci 1923;5:465–72.
2. Neyman J. Statistical problems in agricultural experimentation (with discussion). J Roy Stat Soc 1935;2:107–80.
3. Kempthorne O. The design and analysis of experiments. New York: John Wiley and Sons, 1952. doi:10.1097/00010694-195205000-00012.
4. Hinkelmann K, Kempthorne O. Design and analysis of experiments, volume 1: introduction to experimental design, 2nd ed. New Jersey: John Wiley & Sons, Inc., 2008.
5. Copas J. Randomization models for the matched and unmatched 2×2 tables. Biometrika 1973;60:467–76.
6. Rosenbaum PR. Observational studies, 2nd ed. New York: Springer, 2002. doi:10.1007/978-1-4757-3692-2.
7. Imai K. Variance identification and efficiency analysis in randomized experiments under the matched-pair design. Stat Med 2008;27:4857–73. doi:10.1002/sim.3337.
8. Freedman DA. On regression adjustments in experiments with several treatments. Ann Appl Stat 2008a;2:176–96. doi:10.1214/07-AOAS143.
9. Freedman DA. Randomization does not justify logistic regression. Stat Sci 2008b;23:237–49. doi:10.1214/08-STS262.
10. Rosenbaum PR. Design of observational studies. New York: Springer, 2010. doi:10.1007/978-1-4419-1213-8.
11. Aronow PM, Middleton JA. A class of unbiased estimators of the average treatment effect in randomized experiments. J Causal Inference 2013;1:135–54. doi:10.1515/jci-2012-0009.
12. Aronow PM, Green DP, Lee DK. Sharp bounds on the variance in randomized experiments. Ann Stat 2014;42:850–71. doi:10.1214/13-AOS1200.
13. Abadie A, Athey S, Imbens GW, Wooldridge JM. Finite population causal standard errors. Technical report, National Bureau of Economic Research, 2014. doi:10.3386/w20325.
14. Miratrix LW, Sekhon JS, Yu B. Adjusting treatment effect estimates by post-stratification in randomized experiments. J Roy Stat Soc Ser B (Stat Methodol) 2013;75:369–96. doi:10.1111/j.1467-9868.2012.01048.x.
15. Ding P. A paradox from randomization-based causal inference (with discussion). Stat Sci 2017. (in press). doi:10.1214/16-STS571.
16. Lin W. Agnostic notes on regression adjustments to experimental data: reexamining Freedman’s critique. Ann Appl Stat 2013;7:295–318. doi:10.1214/12-AOAS583.
17. Middleton JA, Aronow PM. Unbiased estimation of the average treatment effect in cluster-randomized experiments. Stat, Politics Policy 2015;6:39–75. doi:10.1515/spp-2013-0002.
18. Imbens GW, Rubin DB. Causal inference for statistics, social and biometrical sciences: an introduction. Cambridge: Cambridge University Press, 2015. doi:10.1017/CBO9781139025751.
19. Chiba Y. Exact tests for the weak causal null hypothesis on a binary outcome in randomized trials. J Biometrics Biostatistics 2015;6. doi:10.4172/2155-6180.1000244.
20. Rigdon J, Hudgens MG. Randomization inference for treatment effects on a binary outcome. Stat Med 2015;34:924–35. doi:10.1002/sim.6384.
21. Li X, Ding P. Exact confidence intervals for the average causal effect on a binary outcome. Stat Med 2016;35:957–60. doi:10.1002/sim.6764.
22. Li X, Ding P. General forms of finite population central limit theorems with applications to causal inference. J Am Stat Assoc 2017. (in press). doi:10.1080/01621459.2017.1295865.
23. Hodges JL, Lehmann EL. Basic concepts of probability and statistics. San Francisco: Holden-Day, 1964.
24. Rubin DB. Comment: Neyman (1923) and causal inference in experiments and observational studies. Stat Sci 1990;5:472–80. doi:10.1214/ss/1177012032.
25. Reichardt CS, Gollob HF. Justifying the use and increasing the power of a t test for a randomized experiment with a convenience sample. Psychol Methods 1999;4:117–28. doi:10.1037/1082-989X.4.1.117.
26. Freedman D, Pisani R, Purves R. Statistics, 4th ed. New York: W. W. Norton & Company, 2007.
27. Balzer LB, Petersen ML, Laan MJ. Targeted estimation and inference for the sample average treatment effect in trials with and without pair-matching. Stat Med 2016;35:3717–32. doi:10.1002/sim.6965.
28. Cochran WG. Sampling techniques, 3rd ed. New York: John Wiley & Sons, 1977.
29. Lehmann EL, Romano JP. Testing statistical hypotheses, 3rd ed. New York: Wiley, 2008.
30. Efron B, Morris C. Stein’s estimation rule and its competitors – an empirical Bayes approach. J Am Stat Assoc 1973;68:117–30.
31. Robins JM. Confidence intervals for causal parameters. Stat Med 1988;7:773–85. doi:10.1002/sim.4780070707.
32. Ding P, Dasgupta T. A potential tale of two by two tables from completely randomized experiments. J Am Stat Assoc 2016;111:157–68. doi:10.1080/01621459.2014.995796.
33. Samii C, Aronow PM. On equivalencies between design-based and regression-based variance estimators for randomized experiments. Stat Probab Lett 2012;82:365–70. doi:10.1016/j.spl.2011.10.024.
34. Fisher RA. The design of experiments, 1st ed. Edinburgh, London: Oliver and Boyd, 1935.
35. Lehmann EL. Nonparametrics: statistical methods based on ranks, 1st ed. San Francisco: Holden-Day, 1975.
© 2017 Walter de Gruyter GmbH, Berlin/Boston
This article is distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.