In many experiment-driven scientific domains, such as high-energy physics, materials science, and cosmology, high-data-rate experiments impose hard constraints on data acquisition systems: collected data must either be stored indiscriminately for post-processing and analysis, which requires large storage capacity, or be filtered accurately in real time, which requires low-latency processing. Deep neural networks, although effective in other filtering tasks, have not been widely employed in such data acquisition systems because they are difficult to design and deploy. We present OpenHLS, an open-source, lightweight compiler framework with no proprietary dependencies. OpenHLS is based on high-level synthesis techniques and translates high-level representations of deep neural networks into low-level representations suitable for deployment to near-sensor devices such as field-programmable gate arrays. We evaluate OpenHLS on various workloads and present a case-study implementation of a deep neural network for Bragg peak detection in the context of high-energy diffraction microscopy. We show that OpenHLS is able to produce an implementation of the network with a throughput of 4.8 $\mu$s/sample, which is approximately a 4$\times$ improvement over the existing implementation.