Statistical analysis can detect when ChatGPT is used to cheat on multiple-choice exams
As the use of generative artificial intelligence continues to extend into all reaches of education, much of the concern about its impact on cheating has focused on essays, essay exam questions, and other narrative assignments. The use of AI tools such as ChatGPT to cheat on multiple-choice exams has largely been overlooked.
A Florida State University chemist is half of a research partnership whose latest work is changing what we know about this type of cheating. Their findings show that the use of ChatGPT to cheat on general chemistry multiple-choice exams can be detected through specific statistical methods. The work was published in the Journal of Chemical Education.
"While many educators and researchers try to detect AI-assisted cheating in essays and open-ended responses, such as Turnitin AI detection, as far as we know, this is the first time anyone has proposed detecting its use on multiple-choice exams," said Ken Hanson, an associate professor in the FSU Department of Chemistry and Biochemistry.
Detecting Cheating Behavior
Researchers collected previous FSU student responses from five semesters' worth of exams, entered nearly 1,000 questions into ChatGPT and compared the outcomes. By evaluating differences in performance between students and ChatGPT on multiple-choice chemistry exams, they were able to identify ChatGPT-generated responses across all exams with a false positive rate of almost zero.
Using fit statistics, the researchers fixed the ability parameters and refit the outcomes, finding that ChatGPT's response pattern was clearly different from that of the students.
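The article does not say which fit statistics or thresholds the researchers used. As a rough illustration only, the sketch below computes two person-fit statistics that are common in Rasch analysis, the outfit and infit mean squares, for a single response pattern while holding the ability estimate fixed. The item difficulties, ability value, and response patterns are invented for the example and are not taken from the study.

```python
import math

def rasch_prob(ability, difficulty):
    """Probability of a correct answer under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def person_fit(responses, ability, difficulties):
    """Return (outfit, infit) mean-square statistics for one response pattern.

    responses:    list of 0/1 scores, one per item
    ability:      ability estimate in logits, held fixed
    difficulties: item difficulties in logits
    """
    sq_std_resid = []    # squared standardized residuals, one per item
    sq_resid_sum = 0.0   # sum of squared raw residuals
    var_sum = 0.0        # sum of model variances p * (1 - p)
    for x, b in zip(responses, difficulties):
        p = rasch_prob(ability, b)
        var = p * (1.0 - p)
        sq_std_resid.append((x - p) ** 2 / var)
        sq_resid_sum += (x - p) ** 2
        var_sum += var
    outfit = sum(sq_std_resid) / len(sq_std_resid)  # unweighted mean square
    infit = sq_resid_sum / var_sum                  # information-weighted mean square
    return outfit, infit

# Hypothetical item difficulties (logits), ordered easiest to hardest.
difficulties = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]

# A pattern the model expects: easy items right, hard items wrong.
typical = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
# Same raw score, but unexpected: easy items wrong, hard items right.
atypical = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

ability = 0.25  # hypothetical fixed ability estimate

typical_outfit, typical_infit = person_fit(typical, ability, difficulties)
atypical_outfit, atypical_infit = person_fit(atypical, ability, difficulties)

print(f"typical:  outfit={typical_outfit:.2f} infit={typical_infit:.2f}")
print(f"atypical: outfit={atypical_outfit:.2f} infit={atypical_infit:.2f}")
```

Both patterns have the same raw score of five out of ten, yet the second produces mean squares well above 1 because its right and wrong answers fall where the model least expects them. A mismatch of this kind between a response pattern and the pattern typical of students is what fit statistics are designed to expose.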
Rasch Modeling and Fit Statistics
The strategy developed by Hanson and co-author Benjamin Sorenson, which employs a technique known as Rasch modeling together with fit statistics, can be readily applied to any generative AI chatbot. Each chatbot will exhibit its own distinctive response pattern, helping educators identify its use in completing multiple-choice exams.
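For readers unfamiliar with the technique, the standard dichotomous Rasch model expresses the probability that test taker n answers item i correctly in terms of only two quantities: the test taker's ability and the item's difficulty. The symbols below follow common textbook notation and are not taken from the paper.

```latex
P(X_{ni} = 1 \mid \theta_n, b_i) = \frac{e^{\theta_n - b_i}}{1 + e^{\theta_n - b_i}}
```

Here \theta_n is the ability of test taker n and b_i is the difficulty of item i, both measured in logits. Because the model refers only to test takers and items, nothing in it is specific to ChatGPT, which is consistent with the claim that the same approach can be applied to other chatbots.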
"The collaboration between Ken and I, though remote, has been a really seamless, smooth process," Sorenson said. "Our work is a great way to provide supporting evidence when educators might already suspect that cheating may be happening. What we didn't expect was that the patterns of artificial intelligence would be so easy to identify."
Conclusion
This research sheds light on the evolving methods of academic dishonesty facilitated by AI tools like ChatGPT and emphasizes the importance of staying vigilant in maintaining the integrity of assessments in educational settings.
More information:
Benjamin Sorenson et al, Identifying Generative Artificial Intelligence Chatbot Use on Multiple-Choice, General Chemistry Exams Using Rasch Analysis, Journal of Chemical Education (2024). DOI: 10.1021/acs.jchemed.4c00165
Provided by Florida State University