Modern data architectures are fragmented across graph databases, vector stores, analytics engines, and optimization solvers, resulting in complex ETL pipelines and synchronization overhead. We present \textbf{Samyama}, a high-performance graph-vector database written in Rust that unifies these workloads into a single engine. Samyama combines a RocksDB-backed persistent store with a versioned-arena MVCC model, a vectorized query executor with 35 physical operators, a cost-based query planner with plan enumeration and predicate pushdown, a dedicated CSR-based analytics engine, and native RDF/SPARQL support. The system integrates 22 metaheuristic optimization solvers directly into its query language, implements HNSW vector indexing~\citep{malkov2020hnsw} with Graph RAG capabilities, and introduces ``Agentic Enrichment'' for autonomous graph expansion via LLMs. The \textbf{Enterprise Edition} adds GPU acceleration via wgpu, production-grade observability, point-in-time recovery, and hardened high availability with HTTP/2 Raft transport. Our evaluation on commodity hardware (Mac Mini M4, 16\,GB RAM) shows: ingestion at 255K nodes/s (CPU) and 412K nodes/s (GPU-accelerated); 115K Cypher queries/s at 1M nodes; 4.0--4.7$\times$ latency reduction from late materialization on multi-hop traversals; 8.2$\times$ GPU PageRank speedup at 1M nodes; and 100\% LDBC Graphalytics validation (28/28 tests). These results indicate that a unified graph-vector-optimization engine can achieve competitive performance on commodity hardware while maintaining Rust's memory safety guarantees.