This paper examines the use of Quantized Neural Networks (QNNs) for two resource-constrained scientific applications: automated calibration of semiconductor quantum bits (qubits) and scientific particle detectors. We evaluate the trade-offs between Post-Training Quantization (PTQ), Quantization-Aware Training (QAT), and ultra-low-bit Binary Neural Networks (BNNs) with respect to latency and resource usage. Our results show that PTQ achieves a four-fold reduction in memory usage for U-shaped CNN (U-Net) architectures while maintaining or slightly improving segmentation accuracy (e.g., from 89% to 90% for a small U-Net with 447 parameters). For training non-differentiable custom BNNs, we propose a novel, hardware-constrained learning approach based on Genetic Algorithms (GAs). We showcase a LUT-based BNN architecture suitable for direct conversion to VHDL via the HCL4BNN framework. This method achieves nanosecond-scale inference latencies (10-15 ns) without requiring specialized DSP or BRAM resources.
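To illustrate where a four-fold memory reduction of the kind reported for PTQ can come from, the following minimal sketch stores the same weights once as 32-bit floats and once as 8-bit integers. It assumes a symmetric per-tensor int8 scheme, which is a common PTQ choice but is not confirmed here as the exact scheme used in this work; the weight values and the helper names `quantize_int8` and `dequantize` are hypothetical, and the single stored scale factor is ignored in the byte count.

```python
from array import array


def quantize_int8(weights):
    """Symmetric per-tensor quantization of float weights to int8 (assumed scheme)."""
    max_abs = max(abs(w) for w in weights)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the symmetric int8 range.
    q = array("b", (max(-127, min(127, round(w / scale))) for w in weights))
    return q, scale


def dequantize(q, scale):
    """Map int8 codes back to approximate float values."""
    return [v * scale for v in q]


# Hypothetical weights, stored as 32-bit floats (typecode "f").
weights = array("f", [0.52, -1.27, 0.03, 0.88, -0.41, 1.10, -0.76, 0.19])
q_weights, scale = quantize_int8(weights)

fp32_bytes = len(weights) * weights.itemsize        # 4 bytes per weight
int8_bytes = len(q_weights) * q_weights.itemsize    # 1 byte per weight
max_error = max(
    abs(w - r) for w, r in zip(weights, dequantize(q_weights, scale))
)

print(f"float32: {fp32_bytes} B, int8: {int8_bytes} B")
print(f"worst-case reconstruction error: {max_error:.5f} (scale = {scale:.5f})")
```

The rounding error per weight is bounded by half a quantization step, which is one plausible reason accuracy can stay roughly unchanged after PTQ; whether it does for a given network has to be measured, as the segmentation results above do.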