The DeHavilland Blog

Wednesday, May 09, 2007

Dr. Wheatley takes the stage

In my last post, I offered a rebuttal to comments submitted by Dr. Karl Wheatley, and offered to post a response from him if he cared to provide one. He followed up with a second thoughtful note, and I'm sharing it verbatim below.

I'm catching up from a trip at the moment and haven't had time to craft a response, but I didn't want to withhold his comments any longer. If I decide to respond later, I'll do so in a follow-up post. I'd like to, and his comments certainly deserve a serious response, but I don't know if I can commit the time to do it justice, and I feel that I've already laid out my thinking in several previous posts on the blog.

Dr. Wheatley's note:

Thanks so much for such a thoughtful response. A few points.

1) I couldn’t agree with you more that schools are often very unresponsive to parents and the local community. For example, if my child is in your school, I expect to be welcome to come in and see what you’re doing. Let’s have dramatically more local control—but that means bean counters in D.C. or Columbus can’t be calling all the shots. (Nor can the Business Roundtable.)

2) Many groups are upset about NCLB, including parents who overwhelmingly think too much testing is going on. Mental health professionals are writing about mental health problems related to increased school stress, pediatricians have been critical—which is part of why they came out with a statement supporting unstructured play. Independent researchers (i.e., not part of DOE, Achieve, Ed Trust, etc.) have been very critical of NCLB—many from the very beginning.

3) I agree—disaggregated data is great. However, I see no evidence that poor kids are getting any better education now. I teach in America’s poorest major city, and in the panic to somehow pass those tests, many schools are doing some of the most mind-numbing parroting instruction imaginable. Some often just practice old tests. This turns education into Trivial Pursuit—cram answers you don’t understand for the test, forget, cram again. It’s a brilliant way to prepare kids for the 1800s, and ensure that poor kids can’t think well enough to compete for good jobs.

4) I teach a lot about competition and competitiveness, especially in my master’s and doctoral courses on motivation. We like to pay competitive prices, want our kids to go to competitive colleges, and “compete” with ourselves to improve ourselves. The first problem with “competition” and “competitive” is that those words mean a lot of different things, and we tend to lump them all together and assume it’s all beneficial. In the real world, the win-lose type of competition has various benefits and costs—sometimes the costs are greater than the benefits.

Competitive markets bring us great, inexpensive cars—competitive classrooms create all sorts of problems for motivation, learning, and behavior—especially if we’re serious about educating all kids. Education is about learning; business is primarily about performance, products, and profits. Different dynamics apply in education vs. business.

There’s no evidence I know of that competition between schools is beneficial overall. Private schools serve a legitimate function, but don’t do any better than public schools on apples-to-apples comparisons. Where charter schools are very loosely regulated, as here in Ohio, we’ve had disasters. Many of my students teach in charter schools, and their stories are just as disturbing as what I hear from Cleveland public schools. Once you distinguish between the various meanings of compete/competitive, the research on true competition in education is pretty depressing.

5) It has been a long-accepted professional and ethical standard that no single assessment be used for high-stakes decisions. That’s one reason dozens of professional, civil rights and religious organizations have signed FairTest’s petition regarding changing NCLB. I believe you could get a disclaimer from every major testmaker—either on the web or on their materials—that tests are not designed to be used this way. (This is a big reason academics are seething—this is educational malpractice.)

A sampling of quotes:

  • "When these tests are used exclusively for graduation, I think that's wrong."—Eugene Paslov, President, Harcourt Brace
  • ". . .Even if the SOL tests were beyond reproach, the use of test scores as the ultimate criterion for graduation decision violates professional standards for test use. Test scores should inform professional opinion, not override it."—Dr. Laurence H. Cross, Professor of Educational Research, Evaluation & Policy Studies
  • "High-stakes decisions based on school-mean proficiency are scientifically indefensible. We cannot regard differences in school-mean proficiency as reflecting differences in school effectiveness. . . . To reward schools for high mean achievement is tantamount to rewarding those schools for serving students who were doing well prior to school entry." —Stephen Raudenbush, Schooling, Statistics, & Poverty (he’s a top expert on HLM)
  • "There is no date by which all (or even nearly all) students in any subgroup, even middle-class white students, can achieve proficiency. Proficiency for all is an oxymoron, as the term 'proficiency' is commonly understood and properly used." —R. Rothstein, R. Jacobsen, & T. Wilder, 11/06
  • "School leaders have a duty to undo the harm being perpetrated on schools today." —W. James Popham (The Mismeasurement of Educational Quality) (Perhaps America’s best-known statistician. He has another book—The Truth About Testing—heavily critical of how we’re using tests. He’s written tens of thousands of test items, even helped Texas design their tests, but he knows what tests are and aren’t for.)
  • "NCLB is martial law. States that were doing innovative stuff took a step backwards when NCLB came along." —John Katzman, founder, Princeton Review

6) I agree that assessment should be used for something—especially to inform altered or improved teaching. Most people will interpret many of the consequences of NCLB as punishments, and motivationally, the effects of something depend a great deal on how it is interpreted. Unfortunately, while punishments can effectively bring short-term compliance, substantial reliance on them is an inferior approach to motivation, what Stephen Covey calls the primitive carrot-and-stick motivational paradigm.

7) High stakes are well known to affect validity. Think of two people—one who knows all subjects well, and one who is not very knowledgeable, but memorizes a stack of Trivial Pursuit cards. Both can answer many of the questions, but we get fooled into believing that both people are very knowledgeable, when only one is. Under high-stakes conditions, we see something akin to memorizing Trivial Pursuit cards. To really understand something well, you have to understand a million pieces, and how the pieces fit together. Tests can only target so many pieces and a few relationships, and teachers with their feet to the fire start teaching only to those things most likely to be on the tests. The easiest way is to just teach the low-level standards in your state—that’s what’s most likely to be on the test. Unfortunately, the teacher is now targeting a lot of dots, but not enough to make a meaningful picture of history or science, or whatever. In reading, they actually sometimes assess ability to decode nonsense syllables. No one needs this skill in real life, but it’s easy to test, and it leads to people drilling kids on nonsense syllables. This inflates our judgment of whether kids can really read. Also, students who learn something in order to pass a test are more likely to forget material quickly after a test when compared to students who learn something just to learn it.

8) That’s a fair question: if the states can’t get it right, why trust them? Unfortunately, DOE has bullied states into doing assessment their way—by not accepting states’ plans. I believe only Maine and Nebraska have stuck it out and kept something interesting. Maybe Rhode Island. Many states doing interesting assessments returned to bubble sheets when pressured. If we treat people like sheep, we won’t get creative solutions.

9) Fair enough—what’s the alternative assessment plan? Linda Darling-Hammond provides some examples in Right to Learn. You need multiple types of assessments, and multiple folks looking at the data—definitely including people outside the school.

10) What do you mean by objective? Standardized? We say “objective test” a lot in education, but there is no such thing as objectivity in testing. In constructing tests, human subjectivity creeps in at every single step. Sacks’s book Standardized Minds is a good source on this.

11) I believe they re-evaluated Follow Through a while ago, using newer statistical techniques, and there was no longer any clear advantage for Direct Instruction. However, cooperative learning has a lot of research support, as the NRP report notes.

12) People have been saying the sky is falling on American education for a century. Since A Nation at Risk, we’ve had our two biggest economic expansions in history. When do we put the Chicken Little reports in some perspective? It’s raining in some places, even damaging hail in some places, but U.S. kids have been middling on tests for decades, yet those same kids keep going out into the world and outperforming kids with higher test scores from other countries. The tests have a place, but we’re trusting them way too much. I saw somewhere recently that among the most competitive countries economically, test scores are negatively correlated with economic competitiveness.

We were cleverly sold on the idea schools were failing terribly and schools and teachers were behaving irresponsibly. That’s convenient spin, but it misleads the public about the complex story. We have “failing police stations” in most of the same neighborhoods, but we haven’t framed it that way, or gone after cops the way we went after teachers, because we know that’s a misleading way to frame it.

13) More assessment data can be very helpful, but do we want meaningful evidence of real-world competence, or high-stakes testing? You can’t have both. With all this testing, no one in DC or Columbus really knows how well your local teacher is teaching. When I say fraud, I mean the gov’t is taking our tax dollars to pay for all this testing, and there are so many holes in the system, it’s essentially useless. And value added will just add another layer to the illusion that we have meaningful accountability—and watch your wallet on that.

14) There are all sorts of important outcomes that cannot be bubbled in, including ability to carry out scientific experiments, and all sorts of technology applications. Paper-and-pencil testing is extraordinarily limiting, and even if you do essays, that has been reduced to pat formulas for writing a good 5-paragraph essay. Paper-and-pencil testing works best for math—which is why they always drag out math tests—but is much weaker for science, technology, etc. If you look at the outcomes most valued by parents and the business community, including items from the SCANS report, many of them are not assessed on these tests, nor are the more complex content standards states have identified. The same thing is true in teacher ed.—what students do on some paper-and-pencil tests is both limited and misleading. I have them role play all sorts of things and we watch them in the classroom. What they can do on paper is just the beginning.

Anyone else care to share their thoughts, pro or con?
