Through additional training, we explore embedding specialized scientific knowledge into the Llama 2 large language model (LLM). Our key finding is that effective knowledge integration requires the model to read texts from multiple perspectives, especially in instruction formats. To address the scarcity of specialized texts, we apply text augmentation, including style conversion and translation. Hyperparameter optimization proves crucial, and models of different sizes (7B, 13B, and 70B) can all be additionally trained reasonably well. To validate our methods, we construct a dataset of 65,000 scientific papers. Although we succeed in partially embedding knowledge, the study highlights the complexities and limitations of incorporating specialized information into LLMs and suggests areas for further improvement.