K-EXAONE技术报告 (K-EXAONE Technical Report)

Eunbi Choi,Kibong Choi,Seokhee Hong,Junwon Hwang,Hyojin Jeon,Hyunjik Jo,Joonkee Kim,Seonghwan Kim,Soyeon Kim,Sunkyoung Kim,Yireun Kim,Yongil Kim,Haeju Lee,Jinsik Lee,Kyungmin Lee,Sangha Park,Heuiyeen Yeen,Hwan Chang,Stanley Jungkyu Choi,Yejin Choi,Jiwon Ham,Kijeong Jeon,Geunyeong Jeong,Gerrard Jeongwon Jo,Yonghwan Jo,Jiyeon Jung,Naeun Kang,Dohoon Kim,Euisoon Kim,Hayeon Kim,Hyosang Kim,Hyunseo Kim,Jieun Kim,Minu Kim,Myoungshin Kim,Unsol Kim,Youchul Kim,YoungJin Kim,Chaeeun Lee,Chaeyoon Lee,Changhun Lee,Dahm Lee,Edward Hwayoung Lee,Honglak Lee,Jinsang Lee,Jiyoung Lee,Sangeun Lee,Seungwon Lim,Solji Lim,Woohyung Lim,Chanwoo Moon,Jaewoo Park,Jinho Park,Yongmin Park,Hyerin Seo,Wooseok Seo,Yongwoo Song,Sejong Yang,Sihoon Yang,Chang En Yea,Sihyuk Yi,Chansik Yoon,Dongkeun Yoon,Sangyeon Yoon,Hyeongu Yun

from arxiv, 29 pages

This technical report presents K-EXAONE, a large-scale multilingual language model developed by LG AI Research. K-EXAONE is built on a Mixture-of-Experts architecture with 236B total parameters, activating 23B parameters during inference. It supports a 256K-token context window and covers six languages: Korean, English, Spanish, German, Japanese, and Vietnamese. We evaluate K-EXAONE on a comprehensive benchmark suite spanning reasoning, agentic, general, Korean, and multilingual abilities. Across these evaluations, K-EXAONE demonstrates performance comparable to open-weight models of similar size. K-EXAONE, designed to advance AI for a better life, is positioned as a powerful proprietary AI foundation model for a wide range of industrial and research applications.

翻译：本技术报告介绍了由LG AI Research开发的大规模多语言语言模型K-EXAONE。K-EXAONE基于混合专家架构构建，总参数量为2360亿，推理时激活参数量为230亿。它支持256K令牌的上下文窗口，并涵盖六种语言：韩语、英语、西班牙语、德语、日语和越南语。我们在涵盖推理、代理、通用、韩语及多语言能力的综合基准测试套件上评估了K-EXAONE。在这些评估中，K-EXAONE展现出与相似规模的开源权重模型相当的性能。K-EXAONE旨在通过人工智能推动更美好的生活，被定位为一款强大的专有AI基础模型，适用于广泛的工业和研究应用。

相关内容

关注 7093

人工智能杂志AI(Artificial Intelligence)是目前公认的发表该领域最新研究成果的主要国际论坛。该期刊欢迎有关AI广泛方面的论文，这些论文构成了整个领域的进步，也欢迎介绍人工智能应用的论文，但重点应该放在新的和新颖的人工智能方法如何提高应用领域的性能，而不是介绍传统人工智能方法的另一个应用。关于应用的论文应该描述一个原则性的解决方案，强调其新颖性，并对正在开发的人工智能技术进行深入的评估。官网地址：http://dblp.uni-trier.de/db/journals/ai/

国产大模型DeepSeek-V3一夜火爆全球，《DeepSeek-V3技术报告》，53页pdf

专知会员服务

88+阅读 · 2024年12月27日

《Llama 3大模型》技术报告中英文版，95页pdf

专知会员服务

107+阅读 · 2024年8月2日

【ChatGPT系列报告】不同大语言模型产品操作性能及进阶应用比较，17页pdf

专知会员服务

99+阅读 · 2023年6月19日

《TextCycleGAN 技术报告》

专知会员服务

33+阅读 · 2023年5月4日