Hardware-Aware Static Optimization of Hyperdimensional Computations

Binary spatter code (BSC)-based hyperdimensional computing (HDC) is a highly error-resilient approximate computational paradigm suited to error-prone, emerging hardware platforms. In BSC HDC, the basic datatype is the hypervector, a typically large binary vector whose size has a significant impact on the fidelity and resource usage of the computation. Typically, the hypervector size is dynamically tuned to deliver the desired accuracy; this process is time-consuming and often produces hypervector sizes that lack accuracy guarantees and yield poor results when reused for very similar workloads. We present Heim, a hardware-aware static analysis and optimization framework for BSC HD computations. Heim analytically derives the minimum hypervector size that meets the target accuracy requirement, thereby minimizing resource usage. Heim guarantees that the optimized computation converges to the user-provided accuracy target in expectation, even in the presence of hardware error. Heim deploys a novel static analysis procedure that unifies theoretical results from the neuroscience community to systematically optimize HD computations. We evaluate Heim against dynamic tuning-based optimization on 25 benchmark data structures. Given a 99% accuracy requirement, Heim-optimized computations achieve 99.2%-100.0% median accuracy, up to 49.5% higher than dynamic tuning-based optimization. They also achieve 1.15x-7.14x reductions in hypervector size compared to HD computations that achieve comparable query accuracy, and Heim finds parametrizations 30.0x-100167.4x faster than dynamic tuning-based approaches. We also use Heim to systematically evaluate the performance benefits of using analog CAMs and multiple-bit-per-cell ReRAM over conventional hardware while maintaining iso-accuracy; for both emerging technologies, we find use cases where the emerging hardware imparts significant benefits.
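
To make the link between hypervector size and query accuracy concrete, the following is a minimal Python sketch, not Heim's analysis or implementation. It assumes a toy item memory of random binary hypervectors, models hardware error as independent bit flips, and answers queries by nearest Hamming distance. The function names (`random_hv`, `flip_bits`, `hamming`, `query_accuracy`) and all parameter values (50 items, 30% flip rate, 200 trials) are illustrative assumptions rather than values from the paper.

```python
import random


def random_hv(n, rng):
    """Draw a random n-bit binary hypervector, stored as a Python int."""
    return rng.getrandbits(n)


def flip_bits(hv, n, p, rng):
    """Flip each of the n bits independently with probability p.

    This is an assumed, simplified stand-in for hardware error.
    """
    mask = 0
    for i in range(n):
        if rng.random() < p:
            mask |= 1 << i
    return hv ^ mask


def hamming(a, b):
    """Hamming distance between two hypervectors stored as ints."""
    return bin(a ^ b).count("1")


def query_accuracy(n, items=50, p=0.3, trials=200, seed=0):
    """Empirical fraction of noisy queries resolved to the correct item."""
    rng = random.Random(seed)
    memory = [random_hv(n, rng) for _ in range(items)]
    correct = 0
    for _ in range(trials):
        target = rng.randrange(items)
        noisy = flip_bits(memory[target], n, p, rng)
        # Nearest-neighbour lookup over the item memory.
        nearest = min(range(items), key=lambda i: hamming(noisy, memory[i]))
        correct += nearest == target
    return correct / trials


if __name__ == "__main__":
    # Sweeping n by hand like this is the kind of empirical tuning loop
    # that a static analysis aims to replace.
    for n in (16, 64, 256, 2048):
        print(n, query_accuracy(n))
```

In this toy setting, small hypervectors confuse the noisy query with unrelated items, while large ones separate them reliably at the cost of more storage and compute. Choosing the size by sweeping, as the `__main__` block does, corresponds loosely to the dynamic tuning that the abstract contrasts with analytically derived sizes.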
