Large Language Models (LLMs) have shown remarkable abilities across various tasks, yet their development has predominantly centered on high-resource languages like English and Chinese, leaving low-resource languages underserved. To address this disparity, we present SeaLLMs 3, the latest iteration of the SeaLLMs model family, tailored for Southeast Asian languages. This region, characterized by its rich linguistic diversity, has lacked adequate language technology support. SeaLLMs 3 aims to bridge this gap by covering a comprehensive range of languages spoken in the region, including English, Chinese, Indonesian, Vietnamese, Thai, Tagalog, Malay, Burmese, Khmer, Lao, Tamil, and Javanese. Leveraging efficient language enhancement techniques and a specially constructed instruction tuning dataset, SeaLLMs 3 significantly reduces training costs while maintaining high performance and versatility. Our model excels in tasks such as world knowledge, mathematical reasoning, translation, and instruction following, achieving state-of-the-art performance among similarly sized models. Additionally, we prioritize safety and reliability by addressing both general and culture-specific considerations and incorporating mechanisms to reduce hallucinations. This work underscores the importance of inclusive AI, showing that advanced LLM capabilities can benefit underserved linguistic and cultural communities.