HIDA: A Hierarchical Dataflow Compiler for High-Level Synthesis

Dataflow architectures are growing in popularity due to their potential to mitigate the challenges posed by the memory wall inherent to the von Neumann architecture. At the same time, high-level synthesis (HLS) has demonstrated its efficacy as a design methodology for generating efficient dataflow architectures within a short development cycle. However, existing HLS tools rely on developers to explore the vast dataflow design space manually, which often leads to suboptimal designs. This problem becomes more severe as the size of the HLS design grows. To tackle these challenges, we introduce HIDA, a new scalable and hierarchical HLS framework that can systematically convert an algorithmic description into a dataflow implementation on hardware. We first propose a collection of efficient and versatile dataflow representations for modeling the hierarchical dataflow structure. Capitalizing on these representations, we develop an automated optimizer that decomposes the dataflow optimization problem into multiple levels based on the inherent dataflow hierarchy. Using FPGAs as the evaluation platform and a set of neural networks modeled in PyTorch as workloads, HIDA achieves up to 8.54$\times$ higher throughput than the state-of-the-art (SOTA) HLS optimization tool. Furthermore, despite being fully automated and able to handle various applications, HIDA achieves 1.29$\times$ higher throughput than the SOTA RTL-based neural network accelerators on an FPGA.

