Zero-Knowledge Proofs (ZKPs) are critical for privacy-preserving techniques and verifiable computation. Many ZKP protocols rely on core kernels such as the SumCheck protocol and Merkle Tree commitments to provide their security properties. These kernels exhibit balanced binary tree computational patterns, which are amenable to efficient hardware acceleration. Although prior work has accelerated these kernels as part of an overarching ZKP protocol, exploiting the tree pattern they share remains relatively underexplored. We conduct a systematic evaluation of these tree-based workloads under different traversal strategies, analyzing performance on multi-threaded CPUs and on the Multifunction Tree Unit (MTU) hardware accelerator. We introduce a hardware-friendly Hybrid Traversal for binary trees that improves parallelism and scalability while significantly reducing memory traffic on hardware. Our results show that the MTU achieves up to $1478\times$ speedup over the CPU at DDR-level bandwidth and that our Hybrid Traversal outperforms breadth-first search by up to $3\times$. These findings offer practical guidance for designing efficient hardware accelerators for ZKP workloads with binary tree structures.
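To make the balanced binary tree pattern concrete, the following minimal sketch computes a Merkle root in two traversal orders: level by level (breadth-first) and recursively (depth-first). It is an illustration only, not the MTU design or the Hybrid Traversal itself. The function names, the use of SHA-256, and the power-of-two leaf count are assumptions made for this example.

```python
import hashlib


def hash_leaf(data: bytes) -> bytes:
    # Assumption: SHA-256 stands in for whatever hash a real protocol uses.
    return hashlib.sha256(data).digest()


def hash_pair(left: bytes, right: bytes) -> bytes:
    # Parent node = hash of the concatenated children.
    return hashlib.sha256(left + right).digest()


def check_power_of_two(n: int) -> None:
    # A balanced binary tree needs a power-of-two number of leaves.
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("leaf count must be a power of two")


def merkle_root_bfs(leaves: list[bytes]) -> bytes:
    # Breadth-first: reduce one full level at a time.
    # Every level is materialized before the next one starts.
    check_power_of_two(len(leaves))
    level = [hash_leaf(x) for x in leaves]
    while len(level) > 1:
        level = [hash_pair(level[i], level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


def merkle_root_dfs(leaves: list[bytes]) -> bytes:
    # Depth-first: finish the left subtree, then the right, then combine.
    # Only one partial result per tree depth is live at any time.
    check_power_of_two(len(leaves))
    if len(leaves) == 1:
        return hash_leaf(leaves[0])
    mid = len(leaves) // 2
    return hash_pair(merkle_root_dfs(leaves[:mid]),
                     merkle_root_dfs(leaves[mid:]))


if __name__ == "__main__":
    data = [bytes([i]) for i in range(8)]
    # Both orders visit the same tree, so the roots must agree.
    print(merkle_root_bfs(data) == merkle_root_dfs(data))
```

Both functions evaluate the same tree and return the same root; they differ only in the order in which nodes are produced. The breadth-first version exposes a whole level of independent hashes at once but stores each full level, while the depth-first version keeps few intermediate values but serializes the work. A traversal that mixes the two orders is one way to trade off between these extremes.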