77% vs 47%: At US university, students trump ChatGPT in accounting exam
Brigham Young University (BYU) in the US recently put ChatGPT, the AI-powered chatbot developed by OpenAI, to the test, and researchers observed that BYU students fared better than the bot on an accounting exam. Even so, the researchers described the OpenAI tool's overall performance as impressive and called it a game changer that will reshape how everyone teaches and learns.
The exercise aimed to determine how ChatGPT would fare on accounting exams, and the researchers published their findings in the journal ‘Issues in Accounting Education.’ Students scored an overall average of 76.7%, while the AI bot scored 47.4%. On 11.3% of questions, the bot outperformed the students, excelling in particular at accounting information systems (AIS) and auditing. Its performance on tax, financial, and managerial assessments, on the other hand, was extremely poor.
The researchers made some key observations:
- The chatbot did better on true/false questions (68.7% correct) and multiple-choice questions (59.5% correct).
- It struggled with short-answer questions (scoring between 28.7% and 39.1%) and with higher-order questions in general.
- It often provided explanations for its answers even when those answers were wrong; in other cases, its descriptions were accurate, but it then selected the wrong multiple-choice answer.
- It made up facts, generating authentic-looking references that turned out to be fabricated.
- It made errors on simple mathematical operations, such as adding numbers in a subtraction problem or dividing numbers incorrectly.
The study had a total of 327 co-authors from BYU and 186 other educational institutions across 14 countries, including the US. Besides sitting for the exam themselves, BYU students fed textbook questions to ChatGPT.
In conclusion, the study showed that while ChatGPT is impressive, AI still has a long way to go before it can entirely replace human intelligence.