This paper details the process of developing GPT-SW3, the first native large generative language model for the Nordic languages. We cover all parts of the development process, from data collection and processing, through training configuration and instruction finetuning, to evaluation and considerations for release strategies. We hope that this paper can serve as a guide and reference for other researchers who undertake the development of large generative models for smaller languages.