Large Language Models (LLMs) rely on fine-tuning to adapt to diverse goals by training on task-specific data. Task specificity should go hand in hand with domain orientation, that is, the specialization of an LLM to accurately address the tasks of a given realm of interest. However, models are usually fine-tuned over publicly available data or, at most, over ground data from databases, ignoring business-level definitions and domain experience. By contrast, Enterprise Knowledge Graphs (EKGs) can capture and augment such domain knowledge via ontological reasoning. To combine the flexibility of LLMs with the domain orientation of EKGs, we propose a novel neurosymbolic architecture that uses ontological reasoning to build task- and domain-specific corpora for LLM fine-tuning.