Artificial Intelligence Grades Essays

Lately, the Twitterverse has been abuzz with links to content about edX, a gigantic online course venture by Harvard and MIT that will release free artificial intelligence essay-grading software.

In an April 4 New York Times article, “Essay-Grading Software Offers Professors a Break,” journalist John Markoff writes a fantastic lead:

Imagine taking a college exam, and instead of handing in a blue book and getting a grade from a professor a few weeks later, clicking the “send” button when you are done and receiving a grade back instantly, your essay scored by a software program.

The media needs more journalists like Markoff to report objective, well-balanced pieces that inform readers with varying viewpoints. But I’m unsure how I feel about a computer, however intelligent, grading written work. There is something unsettling about allowing artificial intelligence to judge creativity, critical thinking, elegance, wit, or a complete lack thereof.

I see the allure of automated essay assessment, especially for managing a Massive Open Online Course (MOOC). I recently spoke with Prof. Curtis J. Bonk, a leading distance-learning expert, who tells me of MOOC instructors with 40,000 students. Not even the most dedicated professors, with an army of equally dedicated teaching assistants, could ever hope to read and provide feedback on that many essays.

Still, I can’t help but feel that any type of automated grading somehow cheapens the learning process—and the teacher-student relationship. This just seems dishonest, almost as if I’m cutting a corner during a 5K race. But I’m absolutely elated by how students could use this software to accurately assess their work before submitting it to a teacher.

I’m curious to learn how this artificial intelligence works, and I try several times (to no avail) to reach Prof. Anant Agarwal, president of edX. I’m sure he’s busy getting back to the mainstream media, but as somebody who has no idea about algorithms and complex code, perhaps this is a blessing in disguise. I know my own limitations, and I wouldn’t have any idea about how to break down techno-babble for the layperson to understand.

Fortunately, I stumble upon an excellent April 8 podcast from WBUR, Boston’s NPR news station.

I listen to Mark Shermis, Professor of Education and Psychology at the University of Akron. I’m intrigued by his support of machine-based essay grading:

“With the technology, if teachers elect to use it, and they’re not constrained to use, but if they elect to use it, they’ll be able to grade more essays and give their students more opportunities for writing,” Shermis says. “And there’s no way that the human grading operations is going to be able to handle the amount of assessment that’s going to be under way.”

Feel free to criticize Shermis, but it’s impossible to argue that increasing numbers of students, along with the explosion of MOOCs, aren’t placing unrealistic expectations on teachers to provide personalized, quality feedback on written work.

Shermis strikes me as an exceptionally rational individual. Now, he doesn’t believe that artificial intelligence can ascertain how effectively an argument is written. But, he says, “the technology can tell you whether you’re on topic.”

I would welcome effective software that helps my students stay on topic, but it is self-reflection, not assessment, that I see as the greatest use of up-and-coming essay-grading software.

I’m also interested to hear from Les Perelman, former director of writing at MIT and President of the Consortium for the Research and Evaluation of Writing. He has launched an online petition protesting machine-based grading. As I search the Web, he seems to be the biggest, most credible critic:

I can say it in one sentence: it doesn’t work. What computers do very well and what all computer-grading programs do is count. They count the number of words, they count the average number of letters per word for length, they count the number of sentences per paragraph, they count the number of connecting words, but they can’t understand meaning at all… All you have to do is give students lists of big words and all they have to do is write the first sentence of a paragraph on topic and then fill the rest of the paragraph with lots of sentences with big words and they’ll do very, very well.
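To make Perelman's description concrete, here is a minimal Python sketch of the kind of surface counting he is talking about. It is only an illustration, not edX's software or any real grading engine: the function name, the short connecting-word list, and the rule for splitting sentences are all my own assumptions.

```python
import re

# Hypothetical connecting-word list, chosen only for illustration.
CONNECTING_WORDS = {"however", "therefore", "moreover", "furthermore", "thus"}


def surface_features(essay: str) -> dict:
    """Count the surface features Perelman lists; nothing here reads meaning."""
    # Assumption: paragraphs are separated by a blank line.
    paragraphs = [p for p in essay.split("\n\n") if p.strip()]
    # Words are runs of letters, optionally containing apostrophes.
    words = re.findall(r"[A-Za-z']+", essay)
    # Assumption: a sentence ends at '.', '!' or '?'.
    sentences_per_paragraph = [
        len([s for s in re.split(r"[.!?]+", p) if s.strip()])
        for p in paragraphs
    ]
    letters = sum(sum(ch.isalpha() for ch in w) for w in words)
    return {
        "word_count": len(words),
        "avg_letters_per_word": letters / len(words) if words else 0.0,
        "avg_sentences_per_paragraph": (
            sum(sentences_per_paragraph) / len(paragraphs) if paragraphs else 0.0
        ),
        "connecting_words": sum(w.lower() in CONNECTING_WORDS for w in words),
    }


sample = "Cats sleep. However, dogs run.\n\nThus birds fly."
features = surface_features(sample)
print(features)
```

Every number in that dictionary would go up if a student padded the essay with long words and extra sentences, which is exactly the weakness Perelman describes.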

I speak with Bonk about whether some are putting too much hope in machine-based grading, expecting it to serve as a panacea for stressed-out teachers.

“What we’re reading in the news are announcements from computer science people and engineers,” Bonk says. “We’re not hearing from art history or English literature, or sociology professors all that much, or high school teachers.”

Bonk’s words impact me on a deep level. Should outsiders prosper from infringing upon how others work—and should they do so without asking or involving those directly affected?

I think of school IT departments, unilaterally deciding what computers to purchase for one-to-one programs, without asking teachers or students what they would find most useful. I’m angry.

As a coach and a history and journalism teacher at Brimmer and May, a wonderful independent school in Chestnut Hill, Massachusetts, I have the absolute best job in the world. I am thrilled to get up every morning to engage with interesting young people, and I'm equally fortunate to have such amazing colleagues and mentors. As the founder of Spin Education, I encourage you to check in frequently and submit posts and lessons, all in an effort to better our practice as teachers.
