Existing disaggregated databases separate the execution and storage layers, enabling independent and elastic scaling of resources. In most cases, this design makes transaction concurrency control (CC) a critical bottleneck: CC demands significant computing resources to manage concurrent conflicts, and it struggles to scale because of the coordination overhead of resolving them. Coupling CC with either execution or storage limits performance and elasticity, because CC's resource needs align with neither the freely scaling transaction execution layer nor the storage-bound data layer. This paper proposes Concurrency Control as a Service (CCaaS), which decouples CC from the database to build a three-layer execution-CC-storage architecture in which each layer can be scaled and upgraded independently, improving elasticity, resource utilization, and development agility. However, adding a new layer increases latency, because communication that was previously local to a machine now crosses the network. To address this, we propose a Sharded Multi-Write optimistic concurrency control (SM-OCC) algorithm with an asynchronous log push-down mechanism that minimizes network communication overhead and transaction latency. Additionally, we implement a multi-write architecture with a deterministic conflict resolution method that reduces coordination overhead in the CC layer and thereby improves scalability. CCaaS is designed to connect to a variety of execution and storage engines, so existing disaggregated databases can adopt it to achieve high elasticity, scalability, and performance. Experimental results show that CCaaS achieves 1.02-3.11x higher throughput and 1.11-2.75x lower latency than state-of-the-art (SoTA) disaggregated databases.
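To illustrate why a deterministic conflict resolution rule can reduce coordination in a multi-write CC layer, the following is a minimal sketch, not the SM-OCC algorithm itself. It assumes that every CC node sees the same set of write sets for an epoch and applies the same rule: for each key, the transaction with the smallest `(ts, node_id, txn_id)` tuple wins, and a transaction commits only if it wins every key it writes. The names `Txn`, `resolve_epoch`, the epoch model, and the priority rule are all hypothetical and are not taken from the abstract.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, Set, Tuple


@dataclass(frozen=True)
class Txn:
    """A transaction's write set as seen by the CC layer (hypothetical model)."""
    txn_id: str
    node_id: int          # CC node that accepted the transaction
    ts: int               # timestamp assigned by that node
    write_set: FrozenSet[str]

    @property
    def priority(self) -> Tuple[int, int, str]:
        # Assumed tie-break order; a smaller tuple means higher priority.
        return (self.ts, self.node_id, self.txn_id)


def resolve_epoch(txns: Iterable[Txn]) -> Set[str]:
    """Return the ids of the transactions that commit in this epoch.

    The result depends only on the set of transactions, not on the order in
    which they arrive, so nodes that hold the same input reach the same
    decision without exchanging further messages.
    """
    batch = list(txns)
    winner: Dict[str, Txn] = {}
    for t in batch:
        for key in t.write_set:
            current = winner.get(key)
            if current is None or t.priority < current.priority:
                winner[key] = t
    # A transaction commits only if it won every key it writes.
    return {
        t.txn_id
        for t in batch
        if all(winner[k].txn_id == t.txn_id for k in t.write_set)
    }


if __name__ == "__main__":
    t1 = Txn("t1", node_id=1, ts=10, write_set=frozenset({"x", "y"}))
    t2 = Txn("t2", node_id=2, ts=11, write_set=frozenset({"y", "z"}))
    t3 = Txn("t3", node_id=2, ts=12, write_set=frozenset({"w"}))
    print(sorted(resolve_epoch([t1, t2, t3])))
    print(sorted(resolve_epoch([t3, t2, t1])))  # same decision, different arrival order
```

In this sketch `t2` aborts because it loses key `y` to `t1`, while `t1` and `t3` commit regardless of arrival order. The rule is deliberately conservative: an aborted transaction can still block a later one on a key it won, which a real design would likely refine.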