Hardware-Aware Neural Dropout Search for Reliable Uncertainty Prediction on FPGA

The increasing deployment of artificial intelligence (AI) for critical decision-making heightens the need for trustworthy AI, in which uncertainty estimation plays a pivotal role. Dropout-based Bayesian Neural Networks (BayesNNs) are prominent in this field because they offer reliable uncertainty estimates. Despite their effectiveness, existing dropout-based BayesNNs typically apply a uniform dropout design across all layers, which leads to suboptimal performance. Moreover, because different applications require tailored dropout strategies, manually optimizing dropout configurations for each application is both error-prone and labor-intensive. To address these challenges, this paper proposes a novel neural dropout search framework that automatically optimizes both dropout-based BayesNNs and their hardware implementations on FPGA. We leverage one-shot supernet training with an evolutionary algorithm for efficient dropout optimization. A layer-wise dropout search space is introduced to enable the automatic design of dropout-based BayesNNs with heterogeneous dropout configurations. Extensive experiments demonstrate that the proposed framework can effectively find design configurations on the Pareto frontier. Compared to manually designed dropout-based BayesNNs on GPU, our search approach produces FPGA designs that achieve up to 33× higher energy efficiency. Compared to state-of-the-art FPGA designs of BayesNNs, the solutions from our approach achieve higher algorithmic performance and energy efficiency.
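
The idea of an evolutionary search over a layer-wise dropout space that returns Pareto-optimal designs can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the number of layers, the candidate dropout rates, the mutation scheme, and especially the `evaluate` function, which is a toy stand-in for what would in practice be an evaluation on the trained supernet plus a hardware cost model.

```python
import random

NUM_LAYERS = 4                       # assumed network depth (illustrative)
CHOICES = [0.0, 0.1, 0.25, 0.5]      # assumed per-layer dropout rates (illustrative)


def evaluate(config):
    """Toy stand-in returning (error, cost); both objectives are minimized.

    A real flow would query a trained supernet for algorithmic metrics and
    a hardware model for cost. These formulas are placeholders only.
    """
    mean_rate = sum(config) / len(config)
    error = (mean_rate - 0.2) ** 2 + 0.01 * len(set(config))
    cost = sum(1.0 + rate for rate in config)
    return error, cost


def dominates(a, b):
    """True if a is no worse than b in every objective and better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(scored):
    """Keep (config, score) pairs whose score no other score dominates."""
    return [(c, s) for c, s in scored
            if not any(dominates(t, s) for _, t in scored)]


def mutate(config, rng, prob=0.3):
    """Resample each layer's dropout rate independently with probability prob."""
    return tuple(rng.choice(CHOICES) if rng.random() < prob else rate
                 for rate in config)


def search(generations=10, population=16, seed=0):
    """Evolve layer-wise dropout configurations, keeping non-dominated parents."""
    rng = random.Random(seed)
    pop = {tuple(rng.choice(CHOICES) for _ in range(NUM_LAYERS))
           for _ in range(population)}
    for _ in range(generations):
        scored = [(c, evaluate(c)) for c in pop]
        parents = [c for c, _ in pareto_front(scored)]
        children = {mutate(rng.choice(parents), rng) for _ in range(population)}
        pop = set(parents) | children
    return pareto_front([(c, evaluate(c)) for c in pop])


if __name__ == "__main__":
    for config, (error, cost) in search():
        print(config, round(error, 4), round(cost, 2))
```

Each configuration is a tuple with one dropout choice per layer, so heterogeneous designs are represented directly. Selecting parents by non-dominance rather than by a single weighted score keeps the trade-off between the two objectives visible in the final result.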
