In recent years, states have sought to increase accountability for public school teachers by implementing high-stakes evaluation systems. We examine the effect of these reforms on the supply and quality of new teachers. Leveraging variation across states and time, we find that evaluation reforms reduced the supply of new teaching candidates by 17 percent and increased the likelihood of unfilled teaching positions, particularly in hard-to-staff schools. Reforms also increased the quality of newly hired teachers by shifting the lower tail of the distribution upward. We find evidence that decreased job security, satisfaction, and autonomy are likely mechanisms for these effects.
In recent years, states across the country have attempted to increase the accountability of public school teachers by implementing rigorous, high-stakes evaluation systems and in some cases repealing teacher tenure protections. We examine the effect of these reforms on the supply of new entrants into the teacher labor market by exploiting a unique panel dataset that includes the number of teaching licenses granted by states. Leveraging variation in the adoption of reforms across states and time, we find that evaluation reforms resulted in a steady decline in the statewide supply of new teachers, whereas tenure reforms produced a sharp but more temporary contraction. In exploratory analyses, we find mixed evidence of the effect of accountability on the selectivity of the institutions where prospective teachers earned their teaching degrees. There is little evidence that evaluation reforms had any differential effect by university selectivity, while tenure reforms appear to have reduced supply more among candidates from less selective universities. We find no evidence that decreases in labor supply were concentrated in non-shortage or shortage licensure areas.
I exploit the random assignment of class rosters in the MET Project to estimate teacher effects on students’ performance on complex open-ended tasks in math and reading, as well as their growth mindset, grit, and effort in class. I find large teacher effects across this expanded set of outcomes, but weak relationships between these effects and performance measures used in current teacher evaluation systems including value-added to state standardized tests. These findings suggest teacher effectiveness is multidimensional and high-stakes evaluation decisions are only weakly informed by the degree to which teachers are developing students’ complex cognitive skills and social-emotional competencies.
Teacher coaching has emerged as a promising alternative to traditional models of professional development. We review the empirical literature on teacher coaching and conduct meta-analyses to estimate the mean effect of coaching programs on teachers’ instructional practice and students’ academic achievement. Combining results across 60 studies that employ causal research designs, we find pooled effect sizes of 0.49 standard deviations (SD) on instruction and 0.18 SD on achievement. Much of this evidence comes from literacy coaching programs for pre-kindergarten and elementary school teachers. Although these findings affirm the potential of coaching as a development tool, further analyses illustrate the challenges of taking coaching programs to scale while maintaining effectiveness. Average effects from effectiveness trials of larger programs are only a fraction of the effects found in efficacy trials of smaller programs. We conclude by discussing ways to address scale-up implementation challenges and providing guidance for future causal studies.
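As a rough illustration of how a pooled effect size is formed from study-level estimates, the sketch below implements a DerSimonian-Laird random-effects estimator. This is an assumption for illustration only: the abstract does not state which estimator the authors used, and the function name `pooled_effect` and the input numbers are hypothetical, not data from the 60 studies.

```python
import math


def pooled_effect(effects, variances):
    """Random-effects pooled estimate (DerSimonian-Laird).

    effects:   study-level effect sizes in SD units
    variances: sampling variances of those effect sizes
    Returns (pooled estimate, standard error, between-study variance tau^2).
    """
    k = len(effects)
    # Fixed-effect (inverse-variance) weights and mean.
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q measures dispersion around the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Method-of-moments estimate of between-study variance, floored at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each study's sampling variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return pooled, se, tau2


# Hypothetical inputs, for illustration only.
estimate, std_err, tau_sq = pooled_effect([0.1, 0.5, 0.9], [0.01, 0.01, 0.01])
```

When study effects vary more than sampling error alone would predict, `tau2` is positive and the weights become more equal across studies, so large studies dominate the pooled estimate less than under a fixed-effect model.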
The vast differences in summer learning activities among children present a substantial challenge to providing equal educational opportunity in the United States. Most initiatives aimed at reversing summer learning loss focus on school- or center-based programs. This study explores the potential of enabling parents to provide literacy development opportunities at home as a low-cost alternative. We conduct a randomized field trial of a summer text-messaging pilot program for parents focused on promoting literacy skills among first through fourth graders. We find positive effects on reading comprehension among third and fourth graders, with effect sizes of 0.21 to 0.29 standard deviations, but no effects for first and second graders. Texts also increased attendance at parent-teacher conferences but not at other school-related activities. Parents’ responses to a follow-up survey provide evidence to inform future efforts to reverse summer learning loss.
In recent years, states and districts have responded to federal incentives by instituting major reforms to their teacher evaluation systems. The passage of the Every Student Succeeds Act in 2015 now provides policymakers with even greater autonomy to redesign existing evaluation systems. Yet, little evidence exists to inform decisions about two key system design features – teacher performance measure weights and performance ratings thresholds. Using data from the Measures of Effective Teaching study, we conduct simulation-based analyses that illustrate the critical role that performance measure weights and ratings thresholds play in determining teachers’ summative evaluation ratings and the distribution of teacher proficiency rates. These findings offer insights to policymakers and administrators as they refine and possibly remake teacher evaluation systems.
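The role of weights and thresholds can be sketched with a toy simulation. Everything below is a hypothetical illustration, not the authors' simulation code or the MET data: the measure names, the standard-normal scores, the cut points, and the function names are all assumptions.

```python
import bisect
import random


def composite_score(scores, weights):
    """Weighted sum of a teacher's performance measures."""
    return sum(weights[m] * scores[m] for m in weights)


def summative_rating(scores, weights, cutoffs):
    """Rating = number of ascending cut points the composite meets or exceeds."""
    return bisect.bisect_right(cutoffs, composite_score(scores, weights))


def proficiency_rate(teachers, weights, cutoffs, proficient_level):
    """Share of teachers whose rating is at or above the proficient level."""
    ratings = [summative_rating(t, weights, cutoffs) for t in teachers]
    return sum(r >= proficient_level for r in ratings) / len(ratings)


def simulate_teachers(n, measures, seed=0):
    """Draw independent standard-normal scores (a simplifying assumption)."""
    rng = random.Random(seed)
    return [{m: rng.gauss(0.0, 1.0) for m in measures} for _ in range(n)]


# Hypothetical setup: two measures, two alternative weighting schemes.
measures = ["observation", "value_added"]
teachers = simulate_teachers(1000, measures, seed=42)
equal_weights = {"observation": 0.5, "value_added": 0.5}
observation_heavy = {"observation": 0.9, "value_added": 0.1}
cutoffs = [-0.5, 0.5]  # hypothetical cut points on the composite scale

rate_equal = proficiency_rate(teachers, equal_weights, cutoffs, 1)
rate_obs = proficiency_rate(teachers, observation_heavy, cutoffs, 1)
```

Holding the underlying scores fixed, changing either the weights or the cut points moves individual teachers across rating categories and shifts the overall proficiency rate, which is the sensitivity the abstract describes.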
Research has focused predominantly on how teachers affect students’ achievement on tests despite evidence that a broad range of attitudes and behaviors are equally important to their long-term success. We find that upper-elementary teachers have large effects on self-reported measures of students’ self-efficacy in math, and happiness and behavior in class. Students’ attitudes and behaviors are predicted by teaching practices most proximal to these measures, including teachers’ emotional support and classroom organization. However, teachers who are effective at improving test scores often are not equally effective at improving students’ attitudes and behaviors. These findings lend empirical evidence to well-established theory on the multidimensional nature of teaching and the need to identify strategies for improving the full range of teachers’ skills.
Teacher teams are increasingly common in urban schools. Here we analyze teachers' responses to teams in six high-poverty schools. Teachers used two criteria to assess teams' "goodness of fit" in meeting the demands of their work—whether their team helped them teach better and whether it contributed to a better school. Their responses differed notably by school, depending largely on the principal's approach to implementation. In the three schools where teachers assessed teams favorably, principals set a meaningful purpose for teachers' collaborative work, contributed structural and professional expertise for their deliberations, and established a safe environment for teachers' on-the-job growth.
In 2009, The New Teacher Project (TNTP)’s The Widget Effect documented the failure to recognize and act on differences in teacher effectiveness. We revisit these findings by compiling teacher performance ratings across 24 states that adopted major reforms to their teacher evaluation systems. In the vast majority of these states, the percentage of teachers rated Unsatisfactory remains less than 1%. However, the full distributions of ratings vary widely across states, with 0.7% to 28.7% rated below Proficient and 6% to 62% rated above Proficient. We present original survey data from an urban district illustrating that evaluators perceive more than three times as many teachers in their schools to be below Proficient than they rate as such. Interviews with principals reveal several potential explanations for these patterns.
This paper analyzes a coaching model focused on classroom management skills and instructional practices across grade levels and subject areas. We describe the design and implementation of MATCH Teacher Coaching among an initial cohort of fifty-nine teachers working in New Orleans charter schools. We evaluate the effect of the program on teachers’ instructional practices using a block randomized trial and find that coached teachers scored 0.59 standard deviations higher on an index of effective teaching practices composed of observation scores, principal evaluations, and student surveys. We discuss implementation challenges and make recommendations for researcher-practitioner partnerships to address key remaining questions.
We used self-report surveys to gather information on a broad set of non-cognitive skills from 1,368 8th-graders. At the student level, scales measuring conscientiousness, self-control, grit, and growth mindset are positively correlated with attendance, behavior, and test-score gains between 4th and 8th grade. Conscientiousness, self-control, and grit are unrelated to test-score gains at the school level, however, and students attending over-subscribed charter schools score lower on these scales than do students attending district schools. Exploiting admissions lotteries, we find positive impacts of charter school attendance on achievement and attendance but negative impacts on these non-cognitive skills. We provide suggestive evidence that these paradoxical results are driven by reference bias, or the tendency for survey responses to be influenced by social context.
We use matched employee-employer records from the teacher labor market to explore the trade-offs between the timing of hiring and match quality. Hiring teachers after the school year starts reduces student achievement by 0.042 SD in mathematics and 0.026 SD in reading. This reflects, in part, a temporary disruption effect in the first year. In mathematics, but not in reading, late-hired teachers remain persistently less effective, evidence of negative selection in the teacher labor market. Late hiring concentrates in schools that disproportionately serve disadvantaged student populations, contributing to challenges in ensuring an equitable distribution of educational resources across students.
Purpose: New teacher evaluation systems have expanded the role of principals as instructional leaders, but little is known about principals’ ability to promote teacher development through the evaluation process. We conducted a case study of principals’ perspectives on evaluation and their experiences implementing observation and feedback cycles to better understand whether principals feel able to promote teacher development as evaluators.
Research Methods: We conducted interviews with a stratified random sample of 24 principals in an urban district that recently implemented major reforms to its teacher evaluation system. We analyzed these interviews by drafting thematic summaries, coding interview transcripts, creating data-analytic matrices, and writing analytic memos.
Findings: We found that the evaluation reforms provided a common framework and language that helped facilitate principals’ feedback conversations with teachers. However, we also found that tasking principals with primary responsibility for conducting evaluations resulted in a variety of unintended consequences which undercut the quality of evaluation feedback they provided. We analyze five broad solutions to these challenges: strategically targeting evaluations, reducing operational responsibilities, providing principal training, hiring instructional coaches, and developing peer evaluation systems.
Implications: The quality of feedback teachers receive through the evaluation process depends critically on the time and training evaluators have to provide individualized and actionable feedback. Districts that task principals with primary responsibility for conducting observation and feedback cycles must attend to the many implementation challenges associated with this approach in order for next-generation evaluation systems to successfully promote teacher development.
We study the relationship between school organizational contexts, teacher turnover, and student achievement in New York City (NYC) middle schools. Using factor analysis, we construct measures of four distinct dimensions of school contexts captured on the annual NYC School Survey. We identify credible estimates by isolating variation in organizational contexts within schools over time. We find that improvements in school leadership, academic expectations, teacher relationships, and school safety are all independently associated with corresponding reductions in teacher turnover. Increases in school safety and academic expectations for students also correspond to increases in student achievement. These results are robust to a range of potential threats to validity, suggesting that our findings are likely driven by an underlying causal relationship.
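One common way to isolate within-school variation over time is a school fixed-effects (within) estimator, sketched below for a single predictor. This is an assumption about the general technique, not the authors' actual specification, which the abstract does not detail. The function name `within_slope` and the numbers are hypothetical.

```python
from collections import defaultdict


def within_slope(school_ids, x, y):
    """OLS slope of y on x after demeaning both within each school.

    Demeaning removes any time-invariant school characteristic, so the
    slope is identified only from changes within a school over time.
    """
    groups = defaultdict(list)
    for s, xi, yi in zip(school_ids, x, y):
        groups[s].append((xi, yi))
    sxy = 0.0
    sxx = 0.0
    for obs in groups.values():
        mean_x = sum(xi for xi, _ in obs) / len(obs)
        mean_y = sum(yi for _, yi in obs) / len(obs)
        for xi, yi in obs:
            sxy += (xi - mean_x) * (yi - mean_y)
            sxx += (xi - mean_x) ** 2
    if sxx == 0:
        raise ValueError("no within-school variation in x")
    return sxy / sxx


# Hypothetical panel: two schools observed for three years each.
schools = ["A", "A", "A", "B", "B", "B"]
context = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]      # e.g., a school-context index
turnover = [9.5, 9.0, 8.5, 18.0, 17.5, 17.0]  # e.g., a turnover rate
slope = within_slope(schools, context, turnover)
```

In this toy panel the school with the higher context index also has higher turnover, so a pooled regression would suggest a positive relationship. The within-school slope is negative because it compares each school only with itself over time.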
Although previous research has shown that teacher coaching can improve teaching practices and student achievement, little is known about specific features of effective coaching programs. We estimate the impact of MATCH Teacher Coaching (MTC) on a range of teacher practices using a blocked randomized trial and explore how changes in the coaching model across two cohorts are related to program effects. Findings indicate large positive effects on teachers’ practices in cohort 1 but no effects in cohort 2. After ruling out explanations related to the research design, a set of exploratory analyses suggests that differential treatment effects may be attributable to differences in coach effectiveness and the focus of coaching across cohorts.
Most teacher layoffs during the Great Recession were implemented following inverse-seniority policies. In this paper, I examine the implementation of a discretionary layoff policy in Charlotte-Mecklenburg Schools. Administrators did not uniformly lay off the most or least senior teachers but instead selected teachers who were previously retired, late-hired, unlicensed, low-performing, or nontenured. Using quasi-experimental variation within schools across grades, I then estimate the differential effects of teacher layoffs on student achievement based on teacher seniority and effectiveness. Mathematics achievement in grades that lost an effective teacher, as measured by principal evaluations or value-added scores, decreased 0.05 to 0.11 standard deviations more than in grades that lost an ineffective teacher. In contrast, teacher seniority has little predictive power on the effects of layoffs. Simulation analyses show that the district selected teachers who were, on average, less effective than those who would have been identified under an inverse-seniority policy, and that its approach also reduced job losses.