AI scores high in gaokao language tests, low in math - Chinadaily ...
Artificial intelligence performed impressively in Chinese literature and English but struggled in mathematics, according to a study in which various chatbot tools answered questions from this year's national college entrance exam, the gaokao.
AI Performance in Different Subjects
The Shanghai Artificial Intelligence Laboratory tested six open-source AI models along with OpenAI's latest version, GPT-4o. The results, released recently, showed that the models averaged an accuracy rate of 67% in Chinese language and literature and 81% in English, but only 36% in mathematics.
The top-performing model was Alibaba's Qwen2-72B, achieving a 72% accuracy rate, followed by GPT-4o and a model from the Shanghai Artificial Intelligence Laboratory.
Evaluation of AI Test-takers
Graders said the AI tools understood contemporary Chinese texts better than classical Chinese. The tools excelled at memorizing formulas but had trouble applying them effectively when solving problems.
In the preliminary round of the 2024 Alibaba Global Mathematics Competition, AI teams posted mediocre results: their highest score was 34, compared with a top human score of 113.
Analysis and Feedback
Cao Sanxing of the Communication University of China noted that AI's weaker math performance does not necessarily reflect a weakness in reasoning or calculation; rather, it reflects less focus on math training in AI development than on language-related training.
Although AI is proficient in languages, experts point to flaws in AI-generated content such as contradictory statements and a lack of deep thinking. Xu Yi of Renmin University highlighted AI's strength in analyzing data to produce summaries, along with its limitations in logical thinking.
Xiong Bingqi, director of the 21st Century Education Research Institute, attributed the lower math scores to insufficient math-related programming and emphasized the need for students to develop critical thinking skills alongside knowledge memorization.