The last five years have seen an explosion in the amount of data available to social scientists. Although a blessing, these extremely large sources of data can cause problems for political scientists working with standard statistical software programs, which are poorly suited to analyzing big datasets. In this essay, we describe a few approaches to handling extremely large datasets with the R programming language, both at the command line before launching R and within R itself. We show that handling large datasets is about either (1) choosing tools that can shrink the problem or (2) fine-tuning R to handle massive data files.
The recent subprime mortgage crisis has brought to the forefront the possibility of discriminatory lending on the basis of race or gender. Using the over 10 million observations collected by the federal government in 2006 through the Home Mortgage Disclosure Act, this paper explores these claims causally. In so doing, the paper examines two possible theories of discrimination: (1) that any discriminatory lending patterns reflect the fact that minority borrowers went to different lenders, perhaps as a result of predatory lending, and (2) that individual lenders discriminated against identically situated borrowers. The results presented provide limited evidence for the idea that borrowers of different races went to different lenders, but only in certain regions of the country and only for certain minority groups. In addition, many of these results are sensitive to missing confounders – e.g., financial data like credit scores and down payments, which the federal government does not collect. Ultimately, the results’ sensitivity suggests that more data gathering is in order before definitive assertions can be made by legal and policy actors.
maya_sen: from our paper: A Republican spokeswoman rejected the letter as `business as usual for the same far-left academics who trot out letters opposing just about any conservative or Republican who’s nominated to a key position by a Republican president'
(Sessions was obvs confirmed)
maya_sen: @adamschilton @kyle_rozema @adam_bonica the problem with this is that when expertise so closely correlates with ideology/partisanship, it will be quickly dismissed -- as an example, we start the paper with the fact that 1,400 law profs wrote a letter to oppose Jeff Sessions as AG under Trump