This paper details the process of developing GPT-SW3, the first native large generative language model for the Nordic languages. We cover all parts of the development process, from data collection and processing, through training configuration and instruction finetuning, to evaluation and considerations for release strategies. We hope that this paper can serve as a guide and reference for other researchers who undertake the development of large generative models for smaller languages.