ChatGPT scores nearly 50 per cent on board certification practice
An artificial intelligence tool, ChatGPT, has been tested on questions from a study resource commonly used by doctors preparing for board certification in ophthalmology. When the study was first conducted in January 2023, ChatGPT answered 46% of the questions correctly. One month later, its score had risen by more than 10 percentage points.
The tool was first introduced to the public in November 2022. Anyone with an internet connection can use ChatGPT for free, and it responds in a conversational way. However, because the system can produce incorrect information and could enable academic cheating, researchers are warning that AI systems like ChatGPT must be used responsibly.
In a study led by St. Michael's Hospital, a site of Unity Health Toronto, researchers tested ChatGPT on a dataset of practice multiple-choice questions from the free trial of OphthoQuestions. For the test, entries and conversations were cleared before each question to prevent prior conversation from influencing responses. Because ChatGPT does not accept image or video input, only text-based multiple-choice questions were used.
In the January test, ChatGPT answered 58 of 125 text-based multiple-choice questions correctly, a score of 46%. When the test was repeated in February 2023, the score had improved to 58%.
ChatGPT's answers matched the most popular responses given by ophthalmology trainees 44% of the time.
According to Dr. Marko Popovic, a co-author of the study and a resident physician in the Department of Ophthalmology and Vision Sciences at the University of Toronto, "ChatGPT is an artificial intelligence system that has tremendous promise in medical education. Though it provided incorrect answers to board certification questions in ophthalmology about half the time, we anticipate that ChatGPT's body of knowledge will rapidly evolve."
The system was most accurate on general medicine questions, answering 79% correctly. In ophthalmology subspecialties such as oculoplastics and retina, however, accuracy dropped sharply, to 20% and 0%, respectively.
Although ChatGPT's current performance falls short of what is needed to prepare for board certification exams, the researchers still see tremendous potential for it in medicine, and they expect its accuracy to improve over time, including in niche subspecialties.