Large language models have gained immense importance in recent years and have demonstrated outstanding results across a wide range of tasks. Despite these achievements, many questions about them remain open. Besides the optimal use of the models for inference and the alignment of their outputs with the desired specifications, the transfer of models to other languages is still an underdeveloped area of research. The recent publication of models such as Llama-2 and Zephyr has provided new insights into architectural improvements and the use of human feedback. However, insights into adapting these techniques to other languages remain scarce. In this paper, we build on the latest improvements and apply the Direct Preference Optimization (DPO) approach to the German language. The model is available at https://huggingface.co/DRXD1000/Phoenix.