We introduce RecurrentGemma, an open language model that uses Google's novel Griffin architecture. Griffin combines linear recurrences with local attention to achieve strong performance on language tasks. It has a fixed-size state, which reduces memory use and enables efficient inference on long sequences. We provide a pre-trained model with 2B non-embedding parameters and an instruction-tuned variant. Both models achieve performance comparable to Gemma-2B despite being trained on fewer tokens.
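To illustrate why a fixed-size state matters, the sketch below implements a simple diagonal linear recurrence in JAX. This is not RecurrentGemma's actual recurrence (Griffin uses learned, input-dependent gating and interleaved local attention); the function names and the constant decay `a` are illustrative assumptions. The point it demonstrates is that the carried state `h` has a fixed dimension `d`, so inference memory stays constant in sequence length, unlike a transformer KV cache that grows with every token.

```python
import jax
import jax.numpy as jnp

def linear_recurrence_step(h, x, a):
    # One step of a diagonal linear recurrence. The state h keeps a
    # fixed size d no matter how many tokens have been processed.
    # h: (d,) recurrent state; x: (d,) current input; a: (d,) decays in (0, 1).
    h_new = a * h + (1.0 - a) * x
    return h_new, h_new

def run_sequence(xs, a):
    # Scan over the sequence: memory for the state is O(d), not O(T * d)
    # as it would be for a growing attention KV cache over T tokens.
    h0 = jnp.zeros_like(a)
    _, ys = jax.lax.scan(lambda h, x: linear_recurrence_step(h, x, a), h0, xs)
    return ys

# Example: T = 1000 inputs with state dimension d = 8.
key = jax.random.PRNGKey(0)
xs = jax.random.normal(key, (1000, 8))
a = jnp.full((8,), 0.9)  # hypothetical fixed decay; Griffin learns input-dependent gates
print(run_sequence(xs, a).shape)  # (1000, 8)
```

Because the state never grows, per-token generation cost and memory are constant, which is the property the abstract credits for efficient inference on long sequences.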