This study evaluates the performance of two large language models, GPT-3.5 and BARD (powered by the Gemini Pro model), on undergraduate admission exams set by the National Polytechnic Institute in Mexico. The exams cover three areas: Engineering/Mathematical and Physical Sciences, Biological and Medical Sciences, and Social and Administrative Sciences. Both models demonstrated proficiency, exceeding the minimum acceptance scores of the respective academic programs, by up to 75% for some programs. GPT-3.5 outperformed BARD in Mathematics and Physics, while BARD performed better in History and on questions involving factual information. Overall, GPT-3.5 marginally surpassed BARD, with scores of 60.94% and 60.42%, respectively.