Hyperdimensional Computing (HDC) is a bio-inspired computing framework that has gained increasing attention, especially as a more efficient approach to machine learning (ML). This work introduces the HDCC compiler, the first open-source compiler that translates high-level descriptions of HDC classification methods into optimized C code. The generated code has three main features targeting embedded systems and high-performance computing: (1) it is self-contained, with no library or platform dependencies; (2) it supports multithreading and single instruction, multiple data (SIMD) instructions through C intrinsics; (3) it is optimized for maximum performance and minimal memory usage. HDCC is designed like a modern compiler, featuring an intuitive and descriptive input language, an intermediate representation (IR), and a retargetable backend. This makes HDCC a valuable tool for research and applications exploring HDC for classification tasks on embedded systems and high-performance computing. To substantiate these claims, we conducted experiments with HDCC on several of the most popular datasets in the HDC literature. The experiments were run on four different machines with varying hyperparameter configurations, and the results were compared to a popular prototyping library built on PyTorch. The results show a training and inference speedup of up to 132x, averaging 25x across all datasets and machines. Regarding memory usage, with 10240-dimensional hypervectors the average reduction was 5x, reaching up to 14x. With 64-dimensional hypervectors, the average reduction was 85x, with a maximum of 158x.