Accountability

  • DC School Growth Scores And Poverty

    Written on July 15, 2013

    As noted in a nice little post over at Greater Greater Washington's education blog, the District of Columbia Office of the State Superintendent of Education (OSSE) recently started releasing growth model scores for DC’s charter and regular public schools. These models, in a nutshell, assess schools by following their students over time and gauging their testing progress relative to similar students (they can also be used for individual teachers, but DCPS uses a different model in its teacher evaluations).

    In my opinion, producing these estimates and making them available publicly is a good idea, and definitely preferable to the district’s previous reliance on changes in proficiency, which are truly awful measures (see here for more on this). It’s also, however, important to note that the model chosen by OSSE – a “median growth percentile," or MGP, model – produces estimates that have been shown to be at least somewhat more heavily associated with student characteristics than other types of models, such as value-added models proper. This does not necessarily mean the growth percentile models are “inaccurate” – there are good reasons, such as fewer resources and greater difficulty with teacher recruitment and retention, to believe that schools serving poorer students might be less effective, on average, and it’s tough to separate “real” effects from bias in the models.

    That said, let’s take a quick look at this relationship using the DC MGP scores from 2011, with poverty data from the National Center for Education Statistics.
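
    For readers who want to run this kind of check themselves, the sketch below shows one way to relate school-level MGP scores to a poverty proxy. The file names and column names are hypothetical placeholders, not OSSE's or NCES's actual formats, and free/reduced-price lunch eligibility stands in for poverty.

        # Hedged sketch: correlate school-level MGP scores with a poverty proxy.
        # File and column names are hypothetical, not OSSE's or NCES's actual layouts.
        import pandas as pd

        mgp = pd.read_csv("dc_mgp_2011.csv")    # columns: school_id, mgp_math, mgp_read
        pov = pd.read_csv("nces_frl_2011.csv")  # columns: school_id, pct_frl (free/reduced-price lunch)

        df = mgp.merge(pov, on="school_id", how="inner")

        # Simple association between growth scores and the poverty proxy
        for col in ("mgp_math", "mgp_read"):
            r = df[col].corr(df["pct_frl"])                       # Pearson correlation
            rho = df[col].corr(df["pct_frl"], method="spearman")  # rank-based correlation
            print(f"{col}: r = {r:.2f}, rho = {rho:.2f}")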

    READ MORE
  • The FCAT Writing, On The Wall

    Written on May 28, 2013

    The annual release of state testing data makes the news in every state, but Florida is one of those places where it is to some degree a national story.*

    Well, it’s getting to be that time of year again. Last week, the state released its writing exam (FCAT 2.0 Writing) results for 2013 (as well as the math and reading results for third graders only).  The Florida Department of Education (FLDOE) press release noted: “With significant gains in writing scores, Florida’s teachers and students continue to show that higher expectations and support at home and in the classroom enable every child to succeed.” This interpretation of the data was generally repeated without scrutiny in the press coverage of the results.

    Putting aside the fact that the press release incorrectly calls the year-to-year changes “gains” (they are actually comparisons of two different groups of students; see here), the FLDOE's presentation of the FCAT Writing results, though common, is, at best, incomplete and, at worst, misleading. Moreover, the important issues in this case are applicable in all states, and unusually easy to illustrate using the simple data released to the public.

    READ MORE
  • School Choice And Segregation In Charter And Regular Public Schools

    Written on February 25, 2013

    A recent article in Reuters, one that received a great deal of attention, sheds light on practices that some charter schools are using essentially to screen students who apply for admission. These policies include requiring long and difficult applications, family interviews, parental contracts, and even demonstrations of past academic performance.

    It remains unclear how common these practices might be in the grand scheme of things, but regardless of how frequently they occur, most of these tactics are terrible, perhaps even illegal, and should be stopped. At the same time, there are two side points to keep in mind when you hear about charges such as these, as well as the accusations (and denials) of charter exclusion and segregation that tend to follow.

    The first is that some degree of (self-)sorting and segregation of students by abilities, interests and other characteristics is part of the deal in a choice-based system. The second point is that screening and segregation are most certainly not unique to charter/private schools, and one primary reason is that there is, in a sense, already a lot of choice among regular public schools.

    READ MORE
  • Why Did Florida Schools' Grades Improve Dramatically Between 1999 and 2005?

    Written on February 11, 2013

    ** Reprinted here in the Washington Post

    Former Florida Governor Jeb Bush was in Virginia last week, helping push for a new law that would install an “A-F” grading system for all public schools in the commonwealth, similar to a system that has existed in Florida for well over a decade.

    In making his case, Governor Bush put forth an argument about the Florida system that he and his supporters use frequently. He said that, right after the grades went into place in his state, there was a drop in the proportion of D and F schools, along with a huge concurrent increase in the proportion of A schools. For example, as Governor Bush notes, in 1999, only 12 percent of schools got A's. In 2005, when he left office, the figure was 53 percent. The clear implication: It was the grading of schools (and the incentives attached to the grades) that caused the improvements.

    There is some pretty good evidence (also here) that the accountability pressure of Florida’s grading system generated modest increases in testing performance among students in schools receiving F's (i.e., an outcome to which consequences were attached), and perhaps higher-rated schools as well. However, putting aside the serious confusion about what Florida’s grades actually measure, as well as the incorrect premise that we can evaluate a grading policy's effect by looking at the simple distribution of those grades over time, there’s a much deeper problem here: The grades changed in part because the criteria changed.

    READ MORE
  • A Few Quick Fixes For School Accountability Systems

    Written on February 5, 2013

    Our guest authors today are Morgan Polikoff and Andrew McEachin. Morgan is Assistant Professor in the Rossier School of Education at the University of Southern California. Andrew is an Institute of Education Sciences postdoctoral fellow at the University of Virginia.

    In a previous post, we described some of the problems with the Senate's Harkin-Enzi plan for reauthorizing the No Child Left Behind Act, based on our own analyses, which yielded three main findings. First, selecting the bottom 5% of schools for intervention based on changes in California’s composite achievement index resulted in remarkably unstable rankings. Second, identifying the bottom 5% based on schools' lowest performing subgroup overwhelmingly targeted those serving larger numbers of special education students. Third and finally, we found evidence that middle and high schools were more likely to be identified than elementary schools, and smaller schools more likely than larger schools.

    None of these findings was especially surprising (see here and here, for instance), and could easily have been anticipated. Thus, we argued that policymakers need to pay more attention to the vast (and rapidly expanding) literature on accountability system design.
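
    To make the first finding concrete, here is a rough sketch of how one might check the year-to-year stability of a "bottom 5 percent" list. The data file and column names are made up for illustration; the actual analyses used California's composite achievement index, not this toy setup.

        # Hedged sketch: how stable is "bottom 5 percent" identification across years?
        # The file and columns are hypothetical stand-ins, not the actual California data.
        import pandas as pd

        scores = pd.read_csv("school_index_by_year.csv")  # columns: school_id, year, index

        def bottom_five_percent(df, year):
            """Schools in the lowest 5 percent of the composite index for a given year."""
            one_year = df[df["year"] == year]
            cutoff = one_year["index"].quantile(0.05)
            return set(one_year.loc[one_year["index"] <= cutoff, "school_id"])

        first = bottom_five_percent(scores, 2011)
        second = bottom_five_percent(scores, 2012)

        # Share of schools identified in the first year that are identified again the next year
        if first:
            print(f"Re-identified in year two: {len(first & second) / len(first):.0%}")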

    READ MORE
  • A Case Against Assigning Single Ratings To Schools

    Written on November 26, 2012

    The new breed of school rating systems, some of which are still getting off the ground, will co-exist with federal proficiency targets in many states, and they are (or will be) used for a variety of purposes, including closure, resource allocation and informing parents and the public (see our posts on the systems in IN, FL, OH, CO and NYC).*

    The approach that most states are using, in part due to the "ESEA flexibility" guidelines set by the U.S. Department of Education, is to combine different types of measures, often very crudely, into a single grade or categorical rating for each school. Administrators and media coverage usually characterize these ratings as measures of school performance - low-rated schools are called "low performing," while those receiving top ratings are characterized as "high performing." That's not accurate - or, at best, it's only partially true.

    Some of the indicators that comprise the ratings, such as proficiency rates, are best interpreted as (imperfectly) describing student performance on tests, whereas other measures, such as growth model estimates, make some attempt to isolate schools’ contribution to that performance. Both might have a role to play in accountability systems, but they're more or less appropriate depending on how you’re trying to use them.

    So, here’s my question: Why do we insist on throwing them all together into a single rating for each school? To illustrate why I think this question needs to be addressed, let’s take a quick look at four highly-simplified situations in which one might use ratings.

    READ MORE
  • When You Hear Claims That Policies Are Working, Read The Fine Print

    Written on November 19, 2012

    When I point out that raw changes in state proficiency rates or NAEP scores are not valid evidence that a policy or set of policies is “working," I often get the following response: “Oh Matt, we can’t have a randomized trial or peer-reviewed article for everything. We have to make decisions and conclusions based on imperfect information sometimes."

    This statement is obviously true. In this case, however, it's also a straw man. There’s a huge middle ground between the highest-quality research and the kind of speculation that often drives our education debate. I’m not saying we always need experiments or highly complex analyses to guide policy decisions (though, in general, these are always preferred and sometimes required). The point, rather, is that we shouldn’t draw conclusions based on evidence that doesn't support those conclusions.

    This, unfortunately, happens all the time. In fact, many of the more prominent advocates in education today make their cases based largely on raw changes in outcomes immediately after (or sometimes even before) their preferred policies were implemented (also see here, here, here, here, here, and here). In order to illustrate the monumental assumptions upon which these and similar claims ride, I thought it might be fun to break them down quickly, in a highly simplified fashion. So, here are the four “requirements” that must be met in order to attribute raw test score changes to a specific policy (note that most of this can be applied not only to claims that policies are working, but also to claims that they're not working because scores or rates are flat):

    READ MORE
  • The Structural Curve In Indiana's New School Grading System

    Written on November 1, 2012

    The State of Indiana has received a great deal of attention for its education reform efforts, and it recently announced the details, as well as the first round of results, of its new "A-F" school grading system. As in many other states, for elementary and middle schools, the grades are based entirely on math and reading test scores.

    It is probably the most rudimentary scoring system I've seen yet - almost painfully so. Such simplicity carries both potential advantages (easier for stakeholders to understand) and disadvantages (school performance is complex and not always amenable to rudimentary calculation).

    In addition, unlike the other systems that I have reviewed here, this one does not rely on explicit “weights" (i.e., specific percentages are not assigned to each component). Rather, there’s a rubric that combines absolute performance (passage rates) and proportions drawn from growth models (a few other states use similar schemes, but I haven't reviewed any of them).

    On the whole, though, it's a somewhat simplistic variation on the general approach most other states are taking -- but with a few twists.
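
    To illustrate the rubric idea in general terms, here is a minimal sketch of a scheme that starts from passage rates and then adjusts the grade using growth-model results. The cut points and adjustments are invented for illustration; they are not Indiana's actual rules.

        # Hedged sketch of a rubric-style (rather than explicitly weighted) grading scheme.
        # All thresholds below are illustrative placeholders, not Indiana's actual rubric.
        def letter_grade(passage_rate, pct_high_growth, pct_low_growth):
            """Base grade from the passage rate, bumped up or down by growth results."""
            if passage_rate >= 90:
                points = 4
            elif passage_rate >= 80:
                points = 3
            elif passage_rate >= 70:
                points = 2
            elif passage_rate >= 60:
                points = 1
            else:
                points = 0

            if pct_high_growth >= 40:   # large share of students with high growth
                points += 1
            if pct_low_growth >= 40:    # large share of students with low growth
                points -= 1

            return "FDCBA"[max(0, min(4, points))]

        print(letter_grade(passage_rate=72, pct_high_growth=45, pct_low_growth=20))  # prints "B"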

    READ MORE
  • The Stability And Fairness Of New York City's School Ratings

    Written on October 8, 2012

    New York City has just released the new round of results from its school rating system (they're called “progress reports"). It relies considerably more on student growth (60 out of 100 points) than on absolute performance (25 points), and there are efforts to partially adjust most of the measures via peer group comparisons.*

    All of this indicates that, compared with many other systems around the U.S., the city's system is focused more on schools' contributions to test-based performance than on students' absolute performance levels.
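
    As a rough illustration of how a point-based composite like this works, here is a sketch using the categories mentioned above. Only the 60-point growth and 25-point performance components are described in the post; the remaining 15-point category and the absence of any peer-group adjustment are simplifications, not the city's actual methodology.

        # Hedged sketch of a point-based composite in the spirit of NYC's progress reports.
        # The "other" 15-point category and the lack of a peer adjustment are simplifications.
        def composite_score(growth_share, performance_share, other_share):
            """Each argument is the share (0 to 1) of available points a school earns in that category."""
            return 60 * growth_share + 25 * performance_share + 15 * other_share

        # Example: 70% of growth points, 40% of performance points, 50% of the rest
        print(f"Composite: {composite_score(0.70, 0.40, 0.50):.1f} / 100")  # 59.5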

    The ratings are high-stakes. Schools receiving low grades – a D or F in any given year, or a C for three consecutive years – enter a review process by which they might be closed. The number of schools meeting these criteria jumped considerably this year.

    There is plenty of controversy to go around about the NYC ratings, much of it pertaining to two important features of the system. They’re worth discussing briefly, as they are also applicable to systems in other states.

    READ MORE
  • Does It Matter How We Measure Schools' Test-Based Performance?

    Written on September 19, 2012

    In education policy debates, we like the "big picture." We love to say things like “hold schools accountable” and “set high expectations." Much less frequent are substantive discussions about the details of accountability systems, but it’s these details that make or break policy. The technical specs just aren’t that sexy. But even the best ideas with the sexiest catchphrases won’t improve things a bit unless they’re designed and executed well.

    In this vein, I want to recommend a very interesting CALDER working paper by Mark Ehlert, Cory Koedel, Eric Parsons and Michael Podgursky. The paper takes a quick look at one of these extremely important, yet frequently under-discussed details in school (and teacher) accountability systems: The choice of growth model.

    When value-added or other growth models come up in our debates, they’re usually discussed en masse, as if they’re all the same. They’re not. It's well-known (though perhaps overstated) that different models can, in many cases, lead to different conclusions for the same school or teacher. This paper, which focuses on school-level models but might easily be extended to teacher evaluations as well, helps illustrate this point in a policy-relevant manner.
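
    For a sense of how such comparisons work, the sketch below ranks schools under two very simple specifications – average raw gains versus average residuals from a covariate-adjusted regression – and checks how well the rankings agree. The data file and variables are hypothetical, and these toy models are far simpler than those estimated in the paper.

        # Hedged sketch: do two simple growth specifications rank schools similarly?
        # Hypothetical student-level data; not the models estimated in the CALDER paper.
        import numpy as np
        import pandas as pd

        df = pd.read_csv("student_scores.csv")  # columns: school_id, prior_score, current_score, pct_frl

        # Model A: average raw gain by school
        df["gain"] = df["current_score"] - df["prior_score"]
        model_a = df.groupby("school_id")["gain"].mean()

        # Model B: average residual from a regression adjusting for prior score and poverty
        X = np.column_stack([np.ones(len(df)), df["prior_score"], df["pct_frl"]])
        beta, *_ = np.linalg.lstsq(X, df["current_score"].to_numpy(), rcond=None)
        df["residual"] = df["current_score"] - X @ beta
        model_b = df.groupby("school_id")["residual"].mean()

        # Rank correlation across schools: how often do the two models agree?
        print(pd.concat([model_a, model_b], axis=1, keys=["A", "B"]).corr(method="spearman"))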

    READ MORE

