Uncovering Household Self-Targeting with Machine Learning


I investigate the extent to which households in Colombia manipulate their eligibility for a social program. Eligibility is determined by a poverty score calculated from answers to a household survey; the formula for the score was released four years after the start of the program. Because proxy-means testing systems may predict household poverty poorly, households have incentives to manipulate their eligibility. I find that, as in Camacho and Conover (2011), there is a significant discontinuity at the eligibility cutoff, and that one way households manipulate their eligibility is by having their poverty score overwritten. I then use machine-learning techniques to predict households’ actual poverty level. I find that households that manipulate their score are more likely to be poor than households with the same score, and more likely to be poorly predicted by the government’s poverty score. These findings suggest that not all proxy-means testing systems predict household poverty well and that, when they do not, households self-target by manipulating their eligibility.
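
As a rough illustration of what a discontinuity at an eligibility cutoff looks like in data, the sketch below counts scores in narrow windows just below and just above a cutoff and reports their ratio. It is not the method used in the paper: the scores, the cutoff of 48, the bandwidth of 3, and the function names are all made up for the example, and a real analysis would rely on a formal density test rather than a raw ratio.

```python
def counts_around_cutoff(scores, cutoff, bandwidth):
    """Count scores in [cutoff - bandwidth, cutoff) and [cutoff, cutoff + bandwidth)."""
    below = sum(1 for s in scores if cutoff - bandwidth <= s < cutoff)
    above = sum(1 for s in scores if cutoff <= s < cutoff + bandwidth)
    return below, above


def bunching_ratio(scores, cutoff, bandwidth):
    """Ratio of the count just below the cutoff to the count just above it.

    A ratio far from 1 in a narrow window hints at bunching on one side
    of the cutoff; it is only a descriptive check, not a statistical test.
    """
    below, above = counts_around_cutoff(scores, cutoff, bandwidth)
    if above == 0:
        raise ValueError("no observations just above the cutoff")
    return below / above


# Synthetic, hypothetical scores (not data from the paper).
scores = [44, 45, 45, 46, 46, 46, 47, 47, 47, 47, 48, 49, 50]
cutoff = 48      # hypothetical eligibility cutoff
bandwidth = 3    # hypothetical window width on each side

below, above = counts_around_cutoff(scores, cutoff, bandwidth)
ratio = bunching_ratio(scores, cutoff, bandwidth)
print(below, above, ratio)
```

With smoothly distributed scores, the two counts would be similar in a narrow window; a large imbalance is the kind of pattern that motivates a closer look at how scores near the cutoff were produced.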

Last updated on 12/16/2019