Instructor Resources
January 10, 2024

What I Learned From Instructors Using ChatGPT


I recently embarked on a journey to interview instructors who used ChatGPT in their past courses to generate 1) grading suggestions and 2) feedback for their students. Before you raise your eyebrows and assume these instructors are lazy or that students deserve better (both have actually been said to me), remember that not all instructors have the luxury of time.

More importantly, I wanted to understand the why. I posed a simple question: why did you want to use ChatGPT? Two main reasons came up:

1) Minimize time spent on grading - more on this later

2) Provide more feedback for student drafts

Saving time on grading isn’t as simple as it sounds. All the instructors I interviewed knew that generative AI isn’t perfect and that every response requires human review. You can’t just throw a prompt at the AI and expect a human-level response.

Second, these instructors treated a capable AI not as a replacement for their work but as an enabler of more student interaction. Instructors know students benefit from more feedback, but they don’t have the capacity to provide it.

You can probably guess that their experiments with ChatGPT didn’t go so well. I summarized their issues into two major categories.

The problem with scalability: To many in the industry, ChatGPT is a tech demo. Using it as-is introduces significant scalability issues.

  1. Context and volume limitations: When instructors grade, they grade in batches. OpenAI caps both how many requests you can make and how much context each one can include, and many of the instructors hit those limits.
  2. Prompt engineering: Crafting an elaborate prompt with the assignment details, grading rubric, and specialized instructions took considerable time and effort.
  3. Copy and paste: There was also a lot of copying and pasting. First, instructors had to paste the prompt and the student submission into ChatGPT. Then they had to review the response and copy it back into a CSV or directly into their LMS. Ergh!

Consistency concerns: Inconsistency was also a big problem. Here is what one interviewee said:

“Some feedback was good, but others I had to rewrite. So I asked myself: what was the point?”

  1. Inconsistent responses: ChatGPT sometimes hallucinated, which caused instructors to lose trust. Some inconsistency is unavoidable, even in human grading, but when it gets too high the feedback becomes unreliable.
  2. More trouble than it’s worth: This inconsistency meant instructors had to spend additional time reviewing and correcting AI-generated feedback.
  3. Lingering chat context: ChatGPT carries forward everything entered earlier in the same chat. Instructors found that earlier requests bled into new ones, making responses even more inconsistent as a chat grew.

To wrap up the interviews: the technology behind ChatGPT is exciting, but its interface is not scalable for grading and was never built to be. These findings mirror the two fundamental issues TimelyGrader is trying to overcome: scalability and reliability.

If you want to see what we are working on, join our weekly webinar this Friday to learn more and get a free grading account for this year.

Chris Du

CEO, TimelyGrader.ai

Check out our other blog articles!


Enhancing education with AI-powered grading and feedback.