Monday, April 25, 2005

Column Five

I was lucky to attend an elementary school where the teachers served us trail mix during standardized tests. I came to think of the tests as an opportunity for snacking. But years later, when I took my GRE on a computer, the administrators forced me to stash my trail mix in a locker. They were worried either that I had written vocabulary words on the peanuts or that I would spill on the keyboard (given my history, the latter would be a perfectly reasonable fear).

At any rate, computers permit more than just taking traditional tests on a screen; they enable entirely new testing methodologies. The GRE and the GMAT, for instance, are now computer adaptive, which means they get harder when you answer questions correctly and easier when you make mistakes. This allows them to pinpoint your score by tracking how far above or below an “average” test-taker you settle.
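For the curious, here’s a toy sketch of the adaptive idea in Python. It’s a simple up-and-down staircase, not the actual GRE algorithm (which rests on fancier statistics), so treat the details as hypothetical:

```python
# A toy sketch of adaptive testing: step the difficulty up after a right
# answer and down after a wrong one. Hypothetical; real exams use item
# response theory rather than this simple staircase.

def adaptive_test(questions, answered_correctly):
    """Serve questions whose difficulty tracks the test-taker's performance.

    `questions` maps a difficulty level (0-9) to a list of questions;
    `answered_correctly(q)` administers one question and reports the result.
    """
    difficulty = 5          # start near the "average" test-taker
    history = []
    for _ in range(20):     # a fixed-length section
        question = questions[difficulty].pop()
        correct = answered_correctly(question)
        history.append((difficulty, correct))
        # harder after a right answer, easier after a wrong one
        difficulty = min(9, difficulty + 1) if correct else max(0, difficulty - 1)
    # the level where the test-taker hovers approximates their score
    return sum(d for d, _ in history) / len(history)
```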

Such a Choose-Your-Own-Adventure approach to test administration may be more efficient, but it also introduces new sources of test-taking anxiety. For instance, you can’t go back and change your previous answers, or skip tough questions and return to them later. In fact, you can’t skip questions at all. Worse, if an easy one comes up, it’s cause for panic—since you must have done something wrong to deserve it. And if you get an incredibly hard one, that’s great!—except you may not be able to answer it.

This new emphasis on computerized testing may explain why, after 27 years, the nation’s leading producer of mechanically scored paper testing forms, Scantron Corporation, spun off a new business unit to explore opportunities on the Internet. This unit’s products have included web-based software that allows students to “take their tests online or on paper”—a clear concession to the growing popularity of Internet testing, and one sure to strike fear into the hearts of #2 pencil manufacturers.

It’s one thing to have computers decide whether you answered “b” correctly and then adjust the next question accordingly. It’s another for them to score the quality of your prose. In Michigan, computers are doing just that, grading sixth graders’ essays. One company, Pearson Knowledge Technologies, offers products employing “Latent Semantic Analysis” to perform “automatic writing assessment.” Pearson claims that its software, unlike human graders, never gets tired or bored. It even rates essays for style.

I do agree that essay test responses should be typed whenever possible. People now grow up using word processors that let them cut and paste; it seems awkward to test them on a skill (straight-through writing, for lack of a better term) that they rarely practice. Typed responses also eliminate potential bias against students with bad handwriting.

But in life, we don’t write for computers (unless we’re in the CS department). Computers don’t shop at bookstores. They don’t really understand the difference between the verse of Walt Whitman and that of a Stanford Daily columnist who dabbles in rhyming ditties. Do we really want our sixth graders conditioned to write in a way that satisfies a software algorithm?

I’d rather consider existing problems in multiple choice testing and look for new ways to resolve them. For instance, some current standardized exams aim to penalize guessing by deducting a portion of a point for a wrong answer. However, most test-takers are still advised to guess as much as possible, particularly if they can eliminate at least one answer choice.
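A quick expected-value calculation shows why. Here’s a sketch assuming a five-choice question and a quarter-point wrong-answer penalty (illustrative numbers, not any particular exam’s):

```python
# Back-of-the-envelope expected value of guessing, assuming a five-choice
# question worth 1 point with a 0.25-point wrong-answer penalty.

def guess_ev(num_choices_left, penalty=0.25):
    """Expected score from guessing among the remaining choices."""
    p_right = 1 / num_choices_left
    return p_right * 1.0 - (1 - p_right) * penalty

print(guess_ev(5))  # blind guess:           0.0    (break-even)
print(guess_ev(4))  # one choice eliminated: 0.0625 (worth guessing)
```

Blind guessing breaks even; eliminating even a single choice tips the odds in the guesser’s favor.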

Instead of a straight-up guessing penalty, I propose introducing a new aspect to multiple-choice testing: a “certainty factor.” If it were implemented, you would be able to choose not only your preferred answer—a, b, c, d or e—but also how certain you are of it—20, 40, 60, 80 or 100%. If you were 100% certain and you got it right, you would receive full credit—say, 1 point. If you were only 60% certain and you got it right, you would only receive 0.6 points.

Conversely, if you were 100% certain and got it wrong, you would incur the full wrong-answer penalty—perhaps 0.5 points. But if you were only 40% certain and got it wrong, your penalty would shrink to 40% of 0.5, or 0.2 points.
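For the programmatically inclined, the whole scheme fits in a few lines. Here’s a sketch using the example values above (one point of credit, a half-point penalty, certainty declared in steps of 20%):

```python
# A minimal sketch of the proposed "certainty factor" scoring, using the
# column's example values: 1 point of credit, a 0.5-point penalty, and
# certainty declared as 20, 40, 60, 80 or 100%.

def certainty_score(correct, certainty, credit=1.0, penalty=0.5):
    """Scale the usual credit or penalty by the declared certainty."""
    assert certainty in (0.2, 0.4, 0.6, 0.8, 1.0)
    return certainty * credit if correct else -certainty * penalty

print(certainty_score(True, 1.0))   #  1.0  (sure and right)
print(certainty_score(True, 0.6))   #  0.6  (hesitant and right)
print(certainty_score(False, 1.0))  # -0.5  (sure and wrong)
print(certainty_score(False, 0.4))  # -0.2  (hesitant and wrong)
```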

This kind of testing would require students not only to know something, but to consider how well they know it. Critics might complain that this makes test-taking too much like gambling. I disagree—it’s simply a formalized extension of what is already taking place every time someone decides to make a guess. Furthermore, the testing process would teach important lessons in risk management and decision-making—and significantly reduce the possibility of people who are naturally “good test-takers” guessing their way to a high score.

Are certainty factors the future of testing? I’d love to say yes, but I know it’s not too likely. I’ll give it… 20%.

Daniel Berdichevsky hopes his editor doesn’t run this column through “Latent Semantic Analysis.” He welcomes e-mails at (a) dan@demidec.com, (b) demidec@gmail.com and (c) dberdich@stanfordalumni.org.
