Most, if not all, are using the free version of ChatGPT, meaning that they are on GPT-3.5. They do a few tests with a simple prompt and conclude that AI isn’t as good as people say.
I’ve got an assignment where students need to identify an innovative company and research the impact of its innovation strategies. As a bonus, students should tie their analysis to concepts learned in class, such as Porter’s Competitive Strategy. I uploaded a student submission to ChatGPT running GPT-3.5, and this is what it gave me.
Overall, it’s not bad. The feedback is generic and somewhat surface-level, yet I would argue it still does a better job than many instructors can, simply because they don’t have the time to give feedback at this level of detail to every student.
Now, with ChatGPT Plus, which runs GPT-4, this is part of the feedback it gave me. It dug deep into each recommendation and tied elements of Porter’s book into the feedback. Each recommendation is actionable without giving away the answer, which fits the AI’s role of providing feedback rather than solutions.
On paper, 3.5 to 4 might not look like a big change, but the leap from GPT-3.5 to GPT-4 is massive. Besides deeper analysis and a better understanding of prompts, it is far more likely to execute every task outlined in the prompt instead of quietly ignoring some of them.
As you can see from our little experiment, there is significantly more depth to the feedback and each recommendation is personalized.
Another improvement is the size of the context window, which depends on the model you are using. For example, GPT-3.5 models are limited to roughly 4K–16K tokens of context (a token is about three-quarters of an English word), while newer models like GPT-4 Turbo can handle up to 128K tokens. That makes it feasible to grade research papers, reports, and other long-form writing.
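If you want to know how close a document comes to those limits, token counts are easy to check programmatically. Here is a minimal sketch using OpenAI’s tiktoken library; the context limits, reserve size, and file name are illustrative assumptions rather than exact product figures.

```python
# Sketch: check whether a submission fits a model's context window before
# sending it for feedback. Limits and file path are illustrative assumptions.
import tiktoken

# Assumed context limits (in tokens) for two OpenAI models.
CONTEXT_LIMITS = {
    "gpt-3.5-turbo": 16_385,
    "gpt-4-turbo": 128_000,
}

def fits_in_context(text: str, model: str, reserve_for_output: int = 1_500) -> bool:
    """Return True if the text, plus room for the model's reply, fits the window."""
    encoding = tiktoken.encoding_for_model(model)
    n_tokens = len(encoding.encode(text))
    return n_tokens + reserve_for_output <= CONTEXT_LIMITS[model]

# Hypothetical submission file used for illustration.
with open("student_submission.txt") as f:
    submission = f.read()

print(fits_in_context(submission, "gpt-3.5-turbo"))
print(fits_in_context(submission, "gpt-4-turbo"))
```

A long report that fails the GPT-3.5 check will often pass the GPT-4 Turbo check, which is exactly why the larger window matters for grading.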
With TimelyGrader, you can upload 600 student submissions at a time, and after 5 to 10 minutes of processing you can review the suggested grades and feedback. This is possible thanks to the power of GPT-4 Turbo.
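As a rough illustration (not TimelyGrader’s actual code), generating feedback for a batch of submissions with GPT-4 Turbo could look something like the sketch below. The prompt, model name, and concurrency limit are assumptions made for the example.

```python
# Conceptual sketch of batch feedback generation with the OpenAI API.
# The prompt, model name, and concurrency settings are illustrative only.
import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment

FEEDBACK_PROMPT = (
    "You are a teaching assistant. Give actionable feedback on the student "
    "submission below, tied to Porter's Competitive Strategy, without "
    "revealing the answer."
)

async def give_feedback(submission: str) -> str:
    response = await client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[
            {"role": "system", "content": FEEDBACK_PROMPT},
            {"role": "user", "content": submission},
        ],
    )
    return response.choices[0].message.content

async def grade_batch(submissions: list[str]) -> list[str]:
    # Cap concurrency so hundreds of submissions don't hit rate limits at once.
    semaphore = asyncio.Semaphore(20)

    async def bounded(sub: str) -> str:
        async with semaphore:
            return await give_feedback(sub)

    return await asyncio.gather(*(bounded(s) for s in submissions))
```

Running the submissions concurrently rather than one by one is what keeps a 600-paper batch in the range of minutes rather than hours.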