Language models (LMs) have become ubiquitous in both NLP research and commercial product offerings. As their commercial importance has surged, the most powerful models have become closed off, gated behind proprietary interfaces, with important details of their training data, architectures, and development undisclosed. Given the importance of these details in scientifically studying these models, including their biases and potential risks, we believe it is essential for the research community to have access to powerful, truly open LMs. To this end, this technical report details the first release of OLMo, a state-of-the-art, truly Open Language Model, and its framework to build and study the science of language modeling. Unlike most prior efforts, which have released only model weights and inference code, we release OLMo and the whole framework, including training data as well as training and evaluation code. We hope this release will empower and strengthen the open research community and inspire a new wave of innovation.