In many experiment-driven scientific domains, such as high-energy physics, materials science, and cosmology, high-data-rate experiments impose hard constraints on data acquisition systems: collected data must either be indiscriminately stored for post-processing and analysis, thereby necessitating large storage capacity, or accurately filtered in real time, thereby necessitating low-latency processing. Deep neural networks, effective in other filtering tasks, have not been widely employed in such data acquisition systems because of design and deployment difficulties. We present OpenHLS, an open-source, lightweight compiler framework with no proprietary dependencies, based on high-level synthesis techniques, for translating high-level representations of deep neural networks to low-level representations suitable for deployment to near-sensor devices such as field-programmable gate arrays. We evaluate OpenHLS on various workloads and present a case-study implementation of a deep neural network for Bragg peak detection in the context of high-energy diffraction microscopy. We show that OpenHLS is able to produce an implementation of the network with a throughput of 4.8 $\mu$s/sample, which is approximately a 4$\times$ improvement over the existing implementation.