Offloading computation to hardware accelerators is gaining traction as computational demands grow and efficiency becomes harder to achieve. Programmable hardware, such as FPGAs, offers a promising platform for rapidly evolving application areas, combining the benefits of hardware acceleration with software programmability. Unfortunately, such systems are composed of multiple hardware components and must therefore preserve integrity even when some components are malicious. In this work, we propose Samsara, the first secure and resilient platform that derives protocols from Byzantine Fault Tolerance (BFT) to enhance the computing resilience of programmable hardware. Samsara uses a novel lightweight hardware-based BFT protocol for Systems-on-Chip, called H-Quorum, that achieves the theoretical minimum latency between applications and replicated compute nodes. To withstand malicious behavior, Samsara supports hardware rejuvenation, which replaces, relocates, or diversifies faulty compute nodes. Samsara's architecture ensures the security of the entire workflow while keeping the latency overhead of both computation and rejuvenation close to that of the non-replicated counterpart.