OwnThink is the most extensive Chinese open-domain knowledge graph introduced in recent years. Prior work on question answering over OwnThink (OQA) has been limited by the representation capacity of its models, which makes further gains in overall answer accuracy difficult. In this paper, we introduce UniOQA, a unified framework that integrates two complementary parallel workflows. Unlike conventional approaches, UniOQA uses large language models (LLMs) for precise question answering and adds a direct-answer-prediction process as a cost-effective complement. First, to strengthen representation capacity, we fine-tune an LLM to translate questions into the Cypher query language (CQL), addressing restricted semantic understanding and hallucinations. Second, we introduce the Entity and Relation Replacement algorithm to ensure that the generated CQL is executable. In parallel, to raise overall answer accuracy, we adapt the Retrieval-Augmented Generation (RAG) process to the knowledge graph. Finally, we optimize answer accuracy through a dynamic decision algorithm. Experiments show that UniOQA raises Logical Accuracy on SpCQL to 21.2% and Execution Accuracy to 54.9%, setting a new state of the art on this benchmark. Ablation experiments examine UniOQA's representation capacity and quantify its performance gains.
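The abstract does not specify how the Entity and Relation Replacement algorithm works. As a rough illustration of the general idea only, the sketch below aligns quoted literals in a generated CQL string with names assumed to exist in the graph, using fuzzy string matching from Python's standard library. The function name `replace_literals`, the vocabulary, the similarity cutoff, and the example query are all hypothetical and are not taken from UniOQA.

```python
import difflib
import re


def replace_literals(cypher: str, known_names: list[str], cutoff: float = 0.6) -> str:
    """Swap each single-quoted literal for its closest known name, if one exists.

    Hypothetical sketch: string similarity stands in for whatever matching
    the actual Entity and Relation Replacement algorithm uses.
    """

    def fix(match: re.Match) -> str:
        literal = match.group(1)
        if literal in known_names:
            return match.group(0)  # already a valid graph name, keep it
        candidates = difflib.get_close_matches(literal, known_names, n=1, cutoff=cutoff)
        # Keep the original literal when nothing is similar enough.
        return f"'{candidates[0]}'" if candidates else match.group(0)

    return re.sub(r"'([^']*)'", fix, cypher)


# Assumed vocabulary of entity and relation names present in the graph.
vocabulary = ["Yao Ming", "birthplace"]

# An LLM-generated query whose literals do not exactly match the graph.
generated = "MATCH (e {name: 'Yao Mingg'})-[r {name: 'birth place'}]->(t) RETURN t.name"

repaired = replace_literals(generated, vocabulary)
print(repaired)
```

Under these assumptions, a query that would return nothing because of a misspelled entity or relation name becomes one whose literals all exist in the graph, which is the sense in which replacement helps executability.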