Kids who use ChatGPT as a study assistant do worse on tests
Does AI actually help students learn? A recent experiment in a high school provides a cautionary tale. Researchers at the University of Pennsylvania found that Turkish high school students who had access to ChatGPT while doing practice math problems did worse on a math test compared with students who didn’t have access to ChatGPT. Those with ChatGPT solved 48 percent more of the practice problems correctly, but they ultimately scored 17 percent worse on a test of the topic that the students were learning.
AI as a Study Assistant
A third group of students had access to a revised version of ChatGPT that functioned more like a tutor. This chatbot was programmed to provide hints without directly divulging the answer. The students who used it did spectacularly better on the practice problems, solving 127 percent more of them correctly compared with students who did their practice work without any high-tech aids. But on a test afterwards, these AI-tutored students did no better. Students who did their practice problems the old-fashioned way, on their own, matched their test scores.

Generative AI Can Harm Learning
The researchers titled their paper, “Generative AI Can Harm Learning,” to make clear to parents and educators that the current crop of freely available AI chatbots can “substantially inhibit learning.” Even a fine-tuned version of ChatGPT designed to mimic a tutor doesn’t necessarily help. The researchers believe the problem is that students are using the chatbot as a “crutch.”

ChatGPT’s errors also may have been a contributing factor. The chatbot answered the math problems correctly only half of the time. Its arithmetic computations were wrong 8 percent of the time, but the bigger problem was that its step-by-step approach for how to solve a problem was wrong 42 percent of the time. The tutoring version of ChatGPT was directly fed the correct solutions, so these errors were minimized.
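As a rough check on how these figures fit together, the short Python sketch below adds the two reported error rates and compares the result with the "half of the time" figure. It assumes that each wrong answer falls into exactly one of the two error categories, which the article does not state. The function and constant names are illustrative only.

```python
# Error rates reported in the article, as percentages of ChatGPT's answers.
ARITHMETIC_ERROR_PCT = 8   # wrong arithmetic computations
APPROACH_ERROR_PCT = 42    # wrong step-by-step approach


def overall_correct_pct(arithmetic_error_pct, approach_error_pct):
    """Return the percentage of answers that are correct.

    ASSUMPTION (not stated in the article): the two error categories
    do not overlap, so their rates can simply be added.
    """
    return 100 - (arithmetic_error_pct + approach_error_pct)


print(overall_correct_pct(ARITHMETIC_ERROR_PCT, APPROACH_ERROR_PCT))  # 50
```

Under that assumption, the two error rates account for the full 50 percent of problems that ChatGPT got wrong.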
Experiment Details
A draft paper about the experiment was posted on the website of SSRN, formerly known as the Social Science Research Network, in July 2024. The paper has not yet been published in a peer-reviewed journal and could still be revised. This is just one experiment in another country, and more studies will be needed to confirm its findings. But this experiment was a large one, involving nearly a thousand students in grades nine through 11 during the fall of 2023.
ChatGPT seems to produce overconfidence among students, leading to a false sense of achievement. The authors likened the problem of learning with ChatGPT to autopilot, emphasizing the importance of ensuring that students develop essential problem-solving skills rather than relying solely on technology.

ChatGPT is not the first technology to present a tradeoff in education. While technological aids may improve performance in the short term, the long-term impact on learning outcomes can be detrimental. Students need to build foundational skills through independent practice and critical thinking.
Overall, the study raises concerns about the overreliance on AI chatbots like ChatGPT in educational settings and highlights the importance of promoting active learning strategies for students.
This story about using ChatGPT to practice math was written by Jill Barshay and produced by The Hechinger Report, a nonprofit, independent news organization focused on inequality and innovation in education.