In recent years, Large Language Models (LLMs) have advanced the state of the art on several natural language processing tasks. However, access to them is often limited to paid API services, which makes extensive investigation difficult for researchers. On the other hand, while the community has proposed some open-source models, they are typically multilingual and not specifically tailored to the Italian language. In an effort to democratize the available open resources for Italian, in this paper we introduce Camoscio: a language model specifically tuned to follow users' prompts in Italian. Specifically, we finetuned the smallest variant of LLaMA (7b) with LoRA on a corpus of instruction prompts translated to Italian via ChatGPT. Results indicate that the model's zero-shot performance on various downstream tasks in Italian competes favorably with existing models specifically finetuned for those tasks. All the artifacts (code, dataset, model) are released to the community at the following URL: https://github.com/teelinsan/camoscio
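To make the LoRA finetuning step mentioned above more concrete, the following is a minimal sketch of the low-rank update that LoRA relies on. It is not the Camoscio training code: the toy matrix sizes, the rank `r`, the scaling factor `alpha`, and all function names are illustrative assumptions. The pretrained weight `W` stays frozen, and only two small matrices `A` and `B` are trained, so the effective weight is `W + (alpha / r) * B @ A`.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha, r):
    """Return W + (alpha / r) * B @ A, the weight used in the forward pass."""
    delta = matmul(b, a)  # (d_out x r) @ (r x d_in) -> d_out x d_in
    scale = alpha / r
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def count_params(m):
    """Count the entries of a matrix stored as a list of rows."""
    return sum(len(row) for row in m)


random.seed(0)
d_out, d_in, r, alpha = 8, 8, 2, 16  # hypothetical toy sizes, not LLaMA's

# Frozen pretrained weight (random here, purely for illustration).
w = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
# Trainable low-rank factors: A starts small and random, B starts at zero,
# so the adapted layer initially behaves exactly like the frozen one.
a = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]
b = [[0.0] * r for _ in range(d_out)]

w_eff = lora_effective_weight(w, a, b, alpha, r)
trainable = count_params(a) + count_params(b)
full = count_params(w)
print(f"trainable parameters: {trainable} of {full}")
```

In this toy setting the two factors hold half as many entries as the full matrix; the saving grows with layer size because the factor count scales with `r * (d_in + d_out)` rather than `d_in * d_out`.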