In many experiment-driven scientific domains, such as high-energy physics, materials science, and cosmology, high-data-rate experiments impose hard constraints on data acquisition systems: collected data must either be stored indiscriminately for post-processing and analysis, which requires large storage capacity, or be filtered accurately in real time, which requires low-latency processing. Deep neural networks, though effective in other filtering tasks, have not been widely employed in such data acquisition systems because they are difficult to design and deploy. We present OpenHLS, an open-source, lightweight compiler framework with no proprietary dependencies. Based on high-level synthesis techniques, OpenHLS translates high-level representations of deep neural networks into low-level representations suitable for deployment to near-sensor devices such as field-programmable gate arrays. We evaluate OpenHLS on various workloads and present a case-study implementation of a deep neural network for Bragg peak detection in the context of high-energy diffraction microscopy. We show that OpenHLS is able to produce an implementation of the network with a throughput of 4.8 $\mu$s/sample, which is approximately a 4$\times$ improvement over the existing implementation.