In this paper, we introduce TigerVector, a system that integrates vector search and graph query within TigerGraph, a Massively Parallel Processing (MPP) native graph database. We extend the vertex attribute type with an embedding type. To support fast vector search, we devise an MPP index framework that interoperates efficiently with the graph engine. We enhance the graph query language GSQL to support vector-type expressions and to compose vector search results with graph query blocks. Together, these extensions increase the expressive power and analytical capabilities of graph databases, allowing unstructured and structured data to be combined within one system. Through extensive experiments, we demonstrate TigerVector's hybrid search capability, its scalability, and its superior performance compared to other graph databases (including Neo4j and Amazon Neptune) and to a highly optimized specialized vector database (Milvus). TigerVector was integrated into TigerGraph v4.2, the latest release of TigerGraph, in December 2024.