Business Knowledge Graphs (KGs) are important to many enterprises today, providing the factual knowledge and structured data that steer many products and make them more intelligent. Despite these benefits, building a business KG requires addressing the difficult problems of deficient structure and multiple modalities. In this paper, we advance the understanding of the practical challenges of building KGs in non-trivial real-world systems. We introduce the process of building an open business knowledge graph (OpenBG) derived from a well-known enterprise, Alibaba Group. Specifically, we define a core ontology that covers various abstract products and consumption demands, with a fine-grained taxonomy and multimodal facts used in deployed applications. OpenBG is an open business KG of unprecedented scale: 2.6 billion triples with more than 88 million entities, covering over 1 million core classes/concepts and 2,681 types of relations. We release all the open resources derived from it (the OpenBG benchmarks) to the community and report experimental results on KG-centric tasks. We also ran an online competition based on the OpenBG benchmarks, which attracted thousands of teams. We further pre-train on OpenBG and apply it to many KG-enhanced downstream tasks in business scenarios, demonstrating the effectiveness of billion-scale multimodal knowledge for e-commerce. All resources and code have been released at \url{https://github.com/OpenBGBenchmark/OpenBG}.