AI Chatbot ChatGPT Scores 55% in National Exam for Doctors

Published On Fri May 12 2023

The use of artificial intelligence in medicine has long been a topic of discussion. ChatGPT, an AI chatbot developed by the U.S. startup OpenAI, has now been tested on Japan's National Examination for Medical Practitioners, a nationwide licensing test open only to candidates who are at least sixth-year university medical students. Although ChatGPT did not pass the exam, it did manage to correctly answer 55% of the questions, which is higher than the score of the student who tested it.

Putting ChatGPT to the Test

Yudai Kaneda, a fifth-year student at Hokkaido University's medical department, manually entered all 400 questions and answer choices from the February exam into ChatGPT. He was able to do this because a senior student had taken the test and brought home the question sheets. He then scored ChatGPT's answers against the sample answers published by a prep school specializing in the medical licensing examination.

Excluding the 11 questions that required looking at images to respond, ChatGPT gave correct answers to 55% of the remaining 389 questions. When the points assigned to each question are taken into account, ChatGPT scored 135 out of a possible 197 points on the compulsory questions, a score rate of 69%, against the score rate of at least 80% required to pass. On the general and clinical questions it earned 149 out of 292 points, a score rate of 51%, where around 70% is required to pass. ChatGPT therefore fell short of the passing mark in both parts of the exam.
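
The score rates above can be checked with a few lines of arithmetic. The sketch below is only an illustration using the figures reported in this article; the names `sections`, `score_rate` and `summarize` are made up for the example, and the 70% threshold is the approximate value quoted above, not an official cutoff.

```python
# Points and thresholds as reported in the article.
# The 0.70 pass rate is approximate ("around 70%"), so treat it as an assumption.
sections = {
    "compulsory": {"points": 135, "full_marks": 197, "pass_rate": 0.80},
    "general_and_clinical": {"points": 149, "full_marks": 292, "pass_rate": 0.70},
}


def score_rate(points, full_marks):
    """Return the fraction of available points that were earned."""
    return points / full_marks


def summarize(sections):
    """Compute the rounded score rate and pass/fail status for each section."""
    results = {}
    for name, section in sections.items():
        rate = score_rate(section["points"], section["full_marks"])
        results[name] = {
            "rate_percent": round(rate * 100),
            "passed": rate >= section["pass_rate"],
        }
    return results


for name, result in summarize(sections).items():
    status = "pass" if result["passed"] else "fail"
    print(f"{name}: {result['rate_percent']}% ({status})")
```

Both sections come out below their thresholds, which matches the reported result that ChatGPT did not pass either part.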

The Future of Medical Studies?

Although ChatGPT did not pass, the fact that it answered more than half of the questions correctly has raised the possibility of using it in future medical studies. Tetsuya Tanimoto, a physician at the Medical Governance Research Institute in Tokyo, compiled a paper on the results with Kaneda. He called the AI program's result significant, adding, "If a conversational AI program is developed based on medically credible literature, not dubious blogs, or something similar, it could be used for front-line medical services in the not-so-distant future."

When Kaneda takes the exam himself in two years' time, he might be able to casually ask an AI program like ChatGPT for assistance. He believes that AI will change the way medicine is studied.