We explore the dynamics of competitive search in the K-12 public education sector. Using data from Boston Public Schools, we document how teacher labor supply varies substantially by position types, schools, and the timing of job postings. We find that early-posted positions are more likely to be filled and end up securing new hires that are better-qualified, more-effective, and more likely to remain at a school. In contrast, the number of applicants to a position is largely unassociated with hire quality, suggesting that schools may struggle to identify and select the best candidates even when there is a large pool of qualified applicants. Our findings point to substantial unrealized potential for improving teacher hiring.
Starting in 2009, the U.S. public education system undertook a massive effort to institute new high-stakes teacher evaluation systems. We examine the effects of these reforms on student achievement and attainment at a national scale by exploiting the staggered timing of implementation across states. We find precisely estimated null effects, on average, that rule out impacts as small as 1.5 percent of a standard deviation for achievement and 1 percentage point for high school graduation and college enrollment. We also find little evidence of heterogeneous effects across an index measuring system design rigor, specific design features, and district characteristics.
We document a largely unrecognized pathway through which schools promote human capital development – by fostering informal mentoring relationships between students and school personnel. Using longitudinal data from a large, nationally representative sample of adolescents, we explore the frequency, nature, and consequences of school-based natural mentorships. Estimates across a range of fixed effect (FE) specifications, including student FE and twins FE models, consistently show that students with school-based mentors achieve greater academic success and higher levels of post-secondary attainment. These apparent benefits are evident for students across a wide range of backgrounds but are largest for students of lower socioeconomic status.
We examine the dynamic nature of student-teacher match quality by studying the effect of having a teacher for more than one year. Using state-wide data from Tennessee and panel methods, we find that having a repeat teacher improves achievement and decreases absences, truancy, and suspensions. These results are robust to a range of tests for teacher and student sorting. White girls benefit most academically from repeat teachers and boys of color benefit most behaviorally. Effects increase with the share of repeat students in a teacher’s class, suggesting that intentional classroom assignment policies such as looping may have even larger benefits.
In-person tutoring programs can have large impacts on K-12 student achievement, but high program costs and limited local supply of tutors have hampered scale-up. Online tutoring provided by volunteers can potentially reach more students in need. We implemented a randomized pilot program of online tutoring that paired college volunteers with middle school students. We estimate consistently positive but statistically insignificant effects on student achievement, 0.07σ for math and 0.04σ for reading. While our estimated effects are smaller than those for many higher-dosage in-person programs, they are from a significantly lower-cost program delivered within the challenging context of the COVID-19 pandemic.
Economic downturns can cause major funding shortfalls for U.S. public schools, often forcing districts to make difficult budget cuts including teacher layoffs. In this brief, we synthesize the empirical literature on the widespread teacher layoffs caused by the Great Recession. Studies find that teacher layoffs harmed student achievement and were inequitably distributed across schools, teachers, and students. Research suggests that specific elements of the layoff process can exacerbate these negative effects. Seniority-based policies disproportionately concentrate layoffs among teachers of color who are more likely to be early career teachers. These “last-in first-out” policies also disproportionately affect disadvantaged students because these students are more likely to be taught by early career teachers. The common practice of widely distributing pink slips warning about a potential job loss also appears to increase teacher churn and negatively impact teacher performance. Drawing on this evidence, we outline a set of policy recommendations to minimize the need for teacher layoffs during economic downturns and ensure that the burden of any unavoidable job cuts does not continue to be borne by students of color and students from low-income backgrounds.
Numerous high-profile efforts have sought to “turn around” low-performing schools. Evidence of these programs’ effectiveness, however, is mixed, and research offers little guidance on which types of turnaround models are more likely to succeed. We present a case study of turnaround efforts led by the Blueprint Schools Network in three schools in Boston. Using a difference-in-differences framework, we find that Blueprint raised student achievement in mathematics and ELA by at least a quarter of a standard deviation, on average. We document qualitatively how differential impacts across the three Blueprint schools relate to contextual and implementation factors. In particular, Blueprint’s role as a turnaround partner (in two schools) versus school operator (in one school) shaped its ability to implement its model. As a partner, Blueprint provided expertise and guidance but had limited ability to fully implement its model. In its role as an operator, Blueprint had full authority to implement its turnaround model, but was also responsible for managing the day-to-day operations of the school, a role for which it had limited prior experience.
We study the adoption and implementation of a new mobile communication app among a sample of 132 New York City public schools. The app provides a platform for sharing general announcements and news as well as engaging in personalized two-way communication with individual parents. We provide participating schools with free access to the app and randomize schools to receive intensive support (training, guidance, monitoring, and encouragement) for maximizing the efficacy of the app. Although user supports led to higher levels of communication within the app in the treatment year, overall usage remained low and declined in the following year when treatment schools no longer received intensive supports. We find few subsequent effects on perceptions of communication quality or student outcomes. We leverage rich internal user data to explore how take-up and usage patterns varied across staff and school characteristics. These analyses help to identify early adopters and reluctant users, revealing both opportunities and obstacles to engaging parents through new communication technology.
A core motivation for the widespread teacher evaluation reforms of the last decade was the belief that these new systems would promote teacher development through high-quality feedback. We examine this theory by studying teachers’ perceptions of evaluation feedback in Boston Public Schools and evaluating the district’s efforts to improve feedback through an administrator training program. Teachers generally reported that evaluators were trustworthy, fair, and accurate, but that they struggled to provide high-quality feedback. We find little evidence the training program improved perceived feedback quality, classroom instruction, teacher self-efficacy, or student achievement. Our results illustrate the challenges of using evaluation systems as engines for professional growth when administrators lack the time and skill necessary to provide frequent, high-quality feedback.
In this thought experiment, we explore how to make access to individualized instruction and academic mentoring more equitable by taking tutoring to scale as a permanent feature of the U.S. public education system. We first synthesize the tutoring and mentoring literature and characterize the landscape of existing tutoring programs. We then outline a blueprint for integrating federally-funded and locally-delivered tutoring into the school day. High school students would serve as tutors/mentors in elementary schools via an elective class, college students in middle schools via federal work-study, and 2- and 4-year college graduates in high schools via AmeriCorps. We envision an incremental, demand-driven expansion process with priority given to high-needs schools. Our blueprint highlights a range of design tradeoffs, implementation challenges, and program costs. We estimate that targeted approaches to scaling school-wide tutoring nationally, such as focusing on K-8 Title I schools, would cost between $5 and $16 billion annually.
COVID-19 shuttered schools across the United States, upending traditional approaches to education. We examine teachers’ experiences during emergency remote teaching in the spring of 2020 using responses to a working conditions survey from a sample of 7,841 teachers across 206 schools and 9 states. Teachers reported a range of challenges related to engaging students in remote learning and balancing their professional and personal responsibilities. Teachers in high-poverty and majority Black schools perceived these challenges to be the most severe, suggesting the pandemic further increased existing educational inequities. Using data from both pre-post and retrospective surveys, we find that the pandemic and pivot to emergency remote teaching resulted in a sudden, large drop in teachers’ sense of success. We also demonstrate how supportive working conditions in schools played a critical role in helping teachers to sustain their sense of success. Teachers who could depend on their district and school-based leadership for strong communication, targeted training, meaningful collaboration, fair expectations, and recognition of their efforts were least likely to experience declines in their sense of success.
Narrative accounts of classroom instruction suggest that external interruptions, such as intercom announcements and visits from staff, are a regular occurrence in U.S. public schools. We study the frequency, nature, and duration of external interruptions in the Providence Public School District (PPSD) using original data from a district-wide survey and classroom observations. We estimate that a typical classroom in PPSD is interrupted over 2,000 times per year, and that these interruptions and the disruptions they cause result in the loss of between 10 and 20 days of instructional time. Administrators appear to systematically underestimate the frequency and negative consequences of these interruptions. We propose several organizational approaches schools might adopt to reduce external interruptions to classroom instruction.
Educators, I have a request. When you are finally able to return to your classroom this fall—or whenever it’s possible—keep a tally of every time learning is disrupted by interruptions coming from outside your class. Keep note: How often do you have to pause instruction because of intercom announcements, calls to the classroom phone, and teachers, administrators and staff knocking at your door? Five, ten—even 20 times a day?
Researchers commonly interpret effect sizes by applying benchmarks proposed by Cohen over a half century ago. However, effects that are small by Cohen’s standards are large relative to the impacts of most field-based interventions. These benchmarks also fail to consider important differences in study features, program costs, and scalability. In this paper, I present five broad guidelines for interpreting effect sizes that are applicable across the social sciences. I then propose a more structured schema with new empirical benchmarks for interpreting a specific class of studies: causal research on education interventions with standardized achievement outcomes. Together, these tools provide a practical approach for incorporating study features, cost, and scalability into the process of interpreting the policy importance of effect sizes.
This paper describes and evaluates a web-based coaching program designed to support teachers in implementing Common Core-aligned math instruction. Web-based coaching programs can be operated at relatively lower costs, are scalable, and make it more feasible to pair teachers with coaches who have expertise in their content area and grade level. Results from our randomized field trial document sizable and sustained effects on both teachers’ ability to analyze instruction and on their instructional practice, as measured by the Mathematical Quality of Instruction (MQI) instrument and student surveys. However, these improvements in instruction did not result in corresponding increases in math test scores as measured by state standardized tests or interim assessments. We discuss several possible explanations for this pattern of results.
We examine the dynamic nature of teacher skill development using panel data on principals’ subjective performance ratings of teachers. Past research on teacher productivity improvement has focused primarily on one important but narrow measure of performance: teachers’ value-added to student achievement on standardized tests. Unlike value-added, subjective performance ratings provide detailed information about specific skill dimensions and are available for the many teachers in non-tested grades and subjects. Using a within-teacher returns to experience framework, we find, on average, large and rapid improvements in teachers’ instructional practices throughout their first ten years on the job as well as substantial differences in improvement rates across individual teachers. We also document that subjective performance ratings contain important information about teacher effectiveness. In the district we study, principals appear to differentiate teacher performance throughout the full distribution instead of just in the tails. Furthermore, prior performance ratings and gains in these ratings provide additional information about teachers’ ability to improve test scores that is not captured by prior value-added scores. Taken together, our study provides new insights on teacher performance improvement and variation in teacher development across instructional skills and individual teachers.
In recent years, states have sought to increase accountability for public school teachers by implementing a package of reforms centered on high-stakes evaluation systems. We examine the effect of these reforms on the supply and quality of new teachers. Leveraging variation across states and time, we find that accountability reforms reduced the number of newly licensed teacher candidates and increased the likelihood of unfilled teaching positions, particularly in hard-to-staff schools. Evidence also suggests that reforms increased the quality of new labor supply by reducing the likelihood new teachers attended unselective undergraduate institutions. Decreases in job security, satisfaction, and autonomy are likely mechanisms for these effects.
Over the past 15 years, the education research community has advocated for the application of more rigorous research designs that support causal inferences, for research that provides more generalizable results across settings, and for the value of research-practice partnerships that inform the design of local programs and policies. However, these goals are often in tension with each other. We propose a research design – the multi-cohort, longitudinal experimental (MCLE) design – as one approach to balancing these competing goals of high-quality research. We illustrate the uses and benefits of MCLEs with an example from a research-practice partnership aimed at evaluating the effect of a teacher coaching program. We find that the coaching program failed to replicate its effectiveness with an initial cohort, likely due to changes in personnel, duration, and content. Our analyses can help researchers weigh the tradeoffs of different design features of MCLEs.
I exploit the random assignment of class rosters in the MET Project to estimate teacher effects on students’ performance on complex open-ended tasks in math and reading, as well as their growth mindset, grit, and effort in class. I find large teacher effects across this expanded set of outcomes, but weak relationships between these effects and performance measures used in current teacher evaluation systems including value-added to state standardized tests. These findings suggest teacher effectiveness is multidimensional, and high-stakes evaluation decisions are only weakly informed by the degree to which teachers are developing students’ complex cognitive skills and social-emotional competencies.
Bush’s and Obama’s federal education reforms were remarkably similar in their goals and ambitions. Bush’s No Child Left Behind (NCLB) Act and Obama’s Race to the Top (RTTT) and NCLB state waiver programs leveraged federal funding and authority to address four broad areas: academic standards, data and accountability, teacher quality, and school turnarounds. This chapter focuses specifically on how these efforts have influenced the teaching profession. During Bush’s and Obama’s combined sixteen years in office, the federal government succeeded in fundamentally changing licensure requirements and evaluation systems for public school teachers. Reflecting on the successes and failures of these reforms provides important lessons about the potential and limitations of federal policy as a tool for improving the quality of the US teacher workforce.
The Education Department at Brown University invites applications for a post-doctoral research associate. The post-doctoral associate will participate in collaborative research activities with Matthew Kraft and John Papay on issues related to teacher effectiveness and the teacher labor market in U.S. K-12 public schools. Research activities will focus broadly on questions of teachers’ development throughout the career and the influence of contextual factors in teacher effectiveness.
Post-doctoral associates will receive mentorship and training from the faculty sponsors,...