Accountability

  • Thinking About Tests While Rethinking Test-Based Accountability

    Written on August 5, 2016

    Earlier this week, per the late summer ritual, New York State released its testing results for the 2015-2016 school year. New York City (NYC), whose results are always the most closely watched in the state, showed a 7.6 percentage point increase in its ELA proficiency rate, along with a 1.2 percentage point increase in its math rate. These increases were roughly equivalent to the statewide changes.

    City officials were quick to pounce on the results, which they called “historic” and “pure hard evidence” that the city’s new education policies are working. This interpretation, while standard in the U.S. education debate, is, of course, inappropriate for many reasons, all of which we’ve discussed here countless times and will not detail again (see here). Suffice it to say that even under the best of circumstances these changes in proficiency rates are only very tentative evidence that students improved their performance over time, to say nothing of whether that improvement was due to a specific policy or set of policies.

    Still, the results represent good news. A larger proportion of NYC students are scoring proficient in math and ELA than was the case last year. Real improvement is slow and sustained, and this is improvement. In addition, the proficiency rate in NYC is now on par with the statewide rate, which is unprecedented. There are, however, a couple of additional issues with these results that are worth discussing quickly.

    READ MORE
  • A Small But Meaningful Change In Florida's School Grades System

    Written on July 28, 2016

    Beginning in the late 1990s, Florida became one of the first states to assign performance ratings to public schools. The purpose of these ratings, which are in the form of A-F grades, is to communicate to the public “how schools are performing relative to state standards.” For elementary and middle schools, the grades are based entirely on standardized testing results.

    We have written extensively here about Florida’s school grading system (see here for just one example), and have used it to illustrate features that can be found in most other states’ school ratings. The primary issue is the heavy reliance that states place on how highly students score on tests, which tells you more about the students the schools serve than about how well they serve those students – i.e., it conflates school and student performance. Put simply, some schools exhibit lower absolute testing performance levels than do other schools, largely because their students enter performing at lower levels. As a result, schools in poorer neighborhoods tend to receive lower grades, even though many of these schools are very successful in helping their students make fast progress during their few short years of attendance.
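
    To make the conflation concrete, consider a quick toy simulation (a sketch of ours, not Florida’s actual grading formula; every number in it is invented for illustration). Each simulated school’s end-of-year score level is the sum of its students’ incoming achievement and a share of the school’s own effectiveness. Grading on score levels then mostly tracks who enrolls, while even a crude growth measure tracks what the school adds:

        # Toy simulation (illustration only -- not Florida's actual formula;
        # all numbers are invented). Level-based grades track incoming student
        # achievement; a simple growth measure tracks school effectiveness.
        import random

        random.seed(1)

        incoming, effectiveness, outcome = [], [], []
        for _ in range(10000):
            inc = random.gauss(0, 1)   # avg. achievement of entering students
            eff = random.gauss(0, 1)   # what the school adds, independent of intake
            incoming.append(inc)
            effectiveness.append(eff)
            outcome.append(inc + 0.5 * eff)  # end-of-year score level

        growth = [o - i for o, i in zip(outcome, incoming)]  # crude growth measure

        def corr(xs, ys):
            """Pearson correlation, standard library only."""
            n = len(xs)
            mx, my = sum(xs) / n, sum(ys) / n
            cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            vx = sum((x - mx) ** 2 for x in xs)
            vy = sum((y - my) ** 2 for y in ys)
            return cov / (vx * vy) ** 0.5

        # Score levels mostly reflect who enrolls (~0.89) and only weakly
        # reflect effectiveness (~0.45); growth isolates effectiveness (1.0
        # in this stylized setup).
        print(round(corr(outcome, incoming), 2))
        print(round(corr(outcome, effectiveness), 2))
        print(round(corr(growth, effectiveness), 2))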

    Although virtually every state’s school rating system has this same basic structure to varying degrees, Florida’s system warrants special attention, as it was one of the first in the nation and has been widely touted and copied (as well as researched -- see our policy brief for a review of this evidence). It is also noteworthy because it contains a couple of interesting features, one of which exacerbates the aforementioned conflation of student and school performance in a largely unnoticed manner. But this feature, discussed below, has just been changed by the Florida Department of Education (FLDOE). This correction merits discussion, as it may be a sign of improvement in how policymakers think about these systems.

    READ MORE
  • Teachers' Opinions Of Teacher Evaluation Systems

    Written on June 17, 2016

    The primary test of the new teacher evaluation systems implemented throughout the nation over the past 5-10 years is whether they improve teacher and ultimately student performance. Although the kinds of policy evaluations that will address these critical questions are just beginning to surface (e.g., Dee and Wyckoff 2015), among the most important early indicators of how well the new systems are working is their credibility among educators. Put simply, if teachers and administrators don’t believe in the systems, they are unlikely to respond productively to them.

    A new report from the Institute of Education Sciences (IES) provides a useful little snapshot of teachers’ opinions of their evaluation systems using a nationally representative survey. It is important to bear in mind that the data are from the 2011-12 Schools and Staffing Survey (SASS) and the 2012-13 Teacher Follow-up Survey, a time when most of the new evaluations in force today were either still on the drawing board or in their first year or two of implementation. But the results reported by IES might still serve as a useful baseline going forward.

    The primary outcome in this particular analysis is a survey item querying whether teachers were “satisfied” with their evaluation process. And almost four in five respondents either strongly or somewhat agreed that they were satisfied with their evaluation. Of course, satisfaction with an evaluation system does not necessarily signal anything about its potential to improve or capture teacher performance, but it certainly tells us something about teachers’ overall views of how they are evaluated.

    READ MORE
  • Getting Serious About Measuring Collaborative Teacher Practice

    Written on April 8, 2016

    Our guest author today is Nathan D. Jones, an assistant professor of special education at Boston University. His research focuses on teacher quality, teacher development, and school improvement. Dr. Jones previously worked as a middle school special education teacher in the Mississippi Delta. In this column, he introduces a new Albert Shanker Institute publication, which was written with colleagues Elizabeth Bettini and Mary Brownell.

    The current policy landscape presents a dilemma. Teacher evaluation has dominated recent state and local reform efforts, resulting in broad changes in teacher evaluation systems nationwide. The reforms have spawned countless research studies on whether emerging evaluation systems use measures that are reliable and valid, whether they result in changes in how teachers are rated, what happens to teachers who receive particularly high or low ratings, and whether the net results of these changes have had an effect on student learning.

    At the same time, there has been increasing enthusiasm about the promise of teacher collaboration (see here and here), spurred in part by new empirical evidence linking teacher collaboration to student outcomes (see Goddard et al., 2007; Ronfeldt, 2015; Sun, Grissom, & Loeb, 2016). When teachers work together, such as when they jointly analyze student achievement data (Gallimore et al., 2009; Saunders, Goldenberg, & Gallimore, 2009) or when high-performing teachers are matched with low-performing peers (Papay, Taylor, Tyler, & Laski, 2016), students have shown substantially better growth on standardized tests.

    This new work adds to a long line of descriptive research on the importance of colleagues and other social aspects of the school organization. Research has documented that informal relationships with colleagues play an important role in promoting positive teacher outcomes, such as planned and actual retention decisions (e.g., Bryk & Schneider, 2002; Pogodzinski, Youngs, & Frank, 2013; Youngs, Pogodzinski, Grogan, & Perrone, 2015). Further, a number of initiatives aimed at improving teacher learning – e.g., professional learning communities (Giles & Hargreaves, 2006) and lesson study (Lewis, Perry, & Murata, 2006) – rely on teachers planning instruction collaboratively.

    READ MORE
  • Evaluating The Results Of New Teacher Evaluation Systems

    Written on March 24, 2016

    A new working paper by researchers Matthew Kraft and Allison Gilmour presents a useful summary of teacher evaluation results in 19 states, all of which designed and implemented new evaluation systems at some point over the past five years. As with previous rounds of results, the headline finding of this paper is that only a small proportion of teachers (2-5 percent) were given the low, “below proficiency” ratings under the new systems, and the vast majority of teachers continue to be rated as satisfactory or better.

    Kraft and Gilmour present their results in the context of the “Widget Effect,” a well-known 2009 report by the New Teacher Project showing that the overwhelming majority of teachers in the 12 districts for which they had data received “satisfactory” ratings. The more recent results from Kraft and Gilmour indicate that this pattern hasn’t changed much with the adoption of new evaluation systems, or, at least, not enough to satisfy some policymakers and commentators who read the paper.

    The paper also presents a set of findings from surveys of and interviews with observers (e.g., principals). These are in many respects the more interesting and important results from a research and policy perspective, but let’s nevertheless focus a bit on the findings on the distribution of teachers across rating categories, as they caused a bit of a stir. I have several comments to make about them, but will concentrate on three in particular (all of which, by the way, pertain not to the paper’s discussion, which is cautious and thorough, but rather to some of the reaction to it in our education policy discourse).

    READ MORE
  • Student Sorting And Teacher Classroom Observations

    Written on February 25, 2016

    Although value added and other growth models tend to be the focus of debates surrounding new teacher evaluation systems, the widely known but frequently unacknowledged reality is that most teachers don’t teach in the tested grades and subjects, and won’t even receive these test-based scores. The quality and impact of the new systems therefore will depend heavily upon the quality and impact of other measures, primarily classroom observations.

    These systems have been in use for decades, and yet, until recently, relatively little was known about their properties, such as their association with student and teacher characteristics, and there are, as yet, only a handful of studies of their impact on teachers’ performance (e.g., Taylor and Tyler 2012). The Measures of Effective Teaching (MET) Project, conducted a few years ago, was a huge step forward in this area, though at the time it was perhaps underappreciated that MET’s contribution lay not just in the (very important) reports it produced, but also in the extensive dataset it collected for researchers to use going forward. A new paper, just published in Educational Evaluation and Policy Analysis, is among the many analyses that have used and will use MET data to address important questions surrounding teacher evaluation.

    The authors, Rachel Garrett and Matthew Steinberg, look at classroom observation scores, specifically those from Charlotte Danielson’s widely employed Framework for Teaching (FFT) protocol. Their results are yet another example of how observation scores are subject to many of the same widely cited (statistical) criticisms as value added scores, most notably sensitivity to which students are assigned to teachers.

    READ MORE
  • Beyond Teacher Quality

    Written on February 23, 2016

    Beyond PD: Teacher Professional Learning in High-Performing Systems is a recent report from Learning First and the Center on International Education Benchmarking at the National Center on Education and the Economy. The paper describes practices and policies from four high-performing school systems – British Columbia, Hong Kong, Shanghai, and Singapore – where professional learning is believed to be the primary vehicle for school improvement.

    My first reaction was: This sounds great, but where is the ubiquitous discussion of “teacher quality?” Frankly, I was somewhat baffled that a report on school improvement never even mentioned the phrase.* Upon closer reading, I found the report to be full of radical (and very good) ideas. It’s not that the report proposed anything that would require an overhaul of the U.S. education system; rather, the ideas were groundbreaking because they did not rely on the typical assumptions about how the youth or the adults in these systems learn and achieve mastery. Because, while things are changing a bit in the U.S. with regard to our understanding of student learning – e.g., we now talk about “deep learning” – we have still not made this transition when it comes to teachers.

    In the U.S., a number of unstated but common assumptions about “teacher quality” suffuse the entire school improvement conversation. As researchers have noted (see here and here), instructional effectiveness is implicitly viewed as an attribute of individuals, a quality that exists in a sort of vacuum (or independent of the context of teachers’ work), and which, as a result, teachers can carry with them, across and between schools. Effectiveness also is often perceived as fairly stable: teachers learn their craft within the first few years in the classroom and then plateau,** but, at the end of the day, some teachers have what it takes and others just don’t. So, the general assumption is that a “good teacher” will be effective under any conditions, and the quality of a given school is determined by how many individual “good teachers” it has acquired.

    READ MORE
  • Evidence From A Teacher Evaluation Pilot Program In Chicago

    Written on December 4, 2015

    The majority of U.S. states have adopted new teacher evaluation systems over the past 5-10 years. Although these new systems remain among the most contentious issues in education policy today, there is still only minimal evidence on their impact on student performance or other outcomes. This is largely because good research takes time.

    A new article, published in the journal Education Finance and Policy, is among the handful of analyses examining the preliminary impact of teacher evaluation systems. The researchers, Matthew Steinberg and Lauren Sartain, take a look at the Excellence in Teaching Project (EITP), a pilot program carried out in Chicago Public Schools starting in the 2008-09 school year. A total of 44 elementary schools participated in EITP in the first year (cohort 1), while an additional 49 schools (cohort 2) implemented the new evaluation systems the following year (2009-10). Participating schools were randomly selected, which permits researchers to gauge the impact of the evaluations experimentally.

    The results of this study are important in themselves, and they also suggest some more general points about new teacher evaluations and the growing body of evidence surrounding them.

    READ MORE
  • Where Al Shanker Stood: The Importance And Meaning Of NAEP Results

    Written on October 30, 2015

    In this New York Times piece, published on July 29, 1990, Al Shanker discusses the results of the National Assessment of Educational Progress (NAEP), and what they suggested about the U.S. education system at the time.

    One of the things that has influenced me most strongly to call for radical school reform has been the results of the National Assessment of Educational Progress (NAEP) examinations. These exams have been testing the achievement of our 9-, 13- and 17-year-olds in a number of basic areas over the past 20 years, and the results have been almost uniformly dismal.

    According to NAEP results, no 17-year-olds who are still in school are illiterate and innumerate – that is, all of them can read the words you would find on a cereal box or a billboard, and they can do simple arithmetic. But very few achieve what a reasonable person would call competence in reading, writing or computing.

    For example, NAEP's 20-year overview, Crossroads in American Education, indicated that only 2.6 percent of 17-year-olds taking the test could write a good letter to a high school principal about why a rule should be changed. And when I say good, I'm talking about a straightforward presentation of a couple of simple points. Only 5 percent could grasp a paragraph as complicated as the kind you would find in a first-year college textbook. And only 6 percent could solve a multi-step math problem like this one: "Christine borrowed $850 for one year from Friendly Finance Company. If she paid 12% simple interest on the loan, what was the total amount she repaid?"
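
    (For the record, the arithmetic: 12 percent simple interest on $850 for one year is $850 × 0.12 = $102, so the total Christine repaid was $850 + $102 = $952.)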

    READ MORE
  • The Magic Of Multiple Measures

    Written on August 6, 2015

    Our guest author today is Cara Jackson, Assistant Director of Research and Evaluation at the Urban Teacher Center.

    Teacher evaluation has become a contentious issue in the U.S. Some observers see the primary purpose of recent evaluation reforms as the identification and removal of ineffective teachers; the popular media as well as politicians and education reform advocates have all played a role in framing teacher evaluation as such. But, while removal of ineffective teachers was a criterion under Race to the Top, so too was the creation of evaluation systems to be used for teacher development and support.

    I think most people would agree that teacher development and improvement should be the primary purpose, as argued here. Some empirical evidence supports the efficacy of evaluation for this purpose (see here). And given the sheer number of teachers we need, declining enrollment in teacher preparation programs, and the difficulty disadvantaged schools have retaining teachers, school principals are probably none too enthusiastic about dismissing teachers, as discussed here.

    Of course, to achieve the ambitious goal of improving teaching practice, an evaluation system must be implemented well. Fans of Harry Potter might remember when Dolores Umbridge from the Ministry of Magic takes over as High Inquisitor at Hogwarts and conducts “inspections” of the school’s teachers in Book 5 of J.K. Rowling’s series. These inspections pretty much demonstrate how not to approach classroom observations: she dictates the timing, fails to provide any indication of what aspects of teaching practice she will be evaluating, interrupts lessons with pointed questions and comments, and evidently does no pre- or post-conferencing with the teachers.

    READ MORE
