We explore embedding specialized scientific knowledge into the Llama 2 Large Language Model (LLM) through additional training. Our key finding is that effective knowledge integration requires the model to read texts from multiple perspectives, especially in instructional formats. To address the scarcity of specialized texts, we apply text augmentation, including style conversions and translations. Hyperparameter optimization proves crucial; with it, models of different sizes (7B, 13B, and 70B) can undergo additional training reasonably well. To validate our methods, we construct a dataset of 65,000 scientific papers. Although we succeed in partially embedding knowledge, the study highlights the complexities and limitations of incorporating specialized information into LLMs and suggests areas for further improvement.
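
As a rough illustration of presenting one text from multiple perspectives, the sketch below renders a single source passage in several formats, including an instruction-style one. The template wording, the `Example` class, and the `augment` function are hypothetical names introduced here for illustration; the abstract does not specify the formats used, and style conversion or translation in practice would likely rely on a model rather than fixed templates.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Example:
    """One training example: the style it was rendered in and its text."""
    style: str
    text: str


# Hypothetical templates (assumption): the abstract does not give the
# actual formats, only that instructional formats were especially useful.
TEMPLATES = {
    "plain": "{title}\n{body}",
    "instruction": (
        "### Instruction:\n"
        'Explain the main finding of the paper titled "{title}".\n\n'
        "### Response:\n{body}"
    ),
    "qa": 'Q: What does the paper "{title}" report?\nA: {body}',
}


def augment(title: str, body: str) -> list[Example]:
    """Render the same source text once per template."""
    return [
        Example(style, template.format(title=title, body=body))
        for style, template in TEMPLATES.items()
    ]


if __name__ == "__main__":
    for example in augment("Sample title", "Sample abstract text."):
        print(f"--- {example.style} ---")
        print(example.text)
```

Each source passage yields one example per template, so the same content reaches the model in several surface forms instead of as a single plain-text copy.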