Recent progress in neural network design for computer vision has been driven largely by capturing high-order neural interactions among inputs and features. A variety of approaches have emerged to accomplish this, such as Transformers and their variants. However, these interactions generate a large amount of intermediate state and/or strong data dependencies, leading to considerable memory consumption and computing cost, and therefore compromising overall runtime performance. To address this challenge, we rethink high-order interactive neural network design with a quadratic computing approach. Specifically, we propose QuadraNet, a comprehensive model design methodology that spans neuron reconstruction, structural block design, and the overall neural network implementation. Leveraging quadratic neurons' intrinsic high-order advantages and dedicated computation optimization schemes, QuadraNet can effectively achieve optimal cognition and computation performance. Incorporating state-of-the-art hardware-aware neural architecture search and system integration techniques, QuadraNet also generalizes well across different hardware constraint settings and deployment scenarios. Experiments show that QuadraNet achieves up to 1.5$\times$ throughput and 30% less memory footprint with similar cognition performance, compared with state-of-the-art high-order approaches.
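To make the notion of a quadratic neuron concrete, the following is a minimal sketch in plain Python. The abstract does not state the neuron's formulation, so the factorized form used here, $(w_a \cdot x)(w_b \cdot x) + w_c \cdot x + b$, is an assumption chosen for illustration, and the names `dot`, `quadratic_neuron`, `w_a`, `w_b`, and `w_c` are hypothetical rather than taken from QuadraNet.

```python
def dot(w, x):
    """Inner product of two equal-length sequences."""
    if len(w) != len(x):
        raise ValueError("weight and input lengths differ")
    return sum(wi * xi for wi, xi in zip(w, x))


def quadratic_neuron(x, w_a, w_b, w_c, bias=0.0):
    """Assumed factorized quadratic neuron (illustrative, not the paper's definition).

    The product of two linear projections contributes second-order terms
    x_i * x_j without storing a full d-by-d weight matrix; the third
    projection keeps the ordinary first-order (linear) response.
    """
    return dot(w_a, x) * dot(w_b, x) + dot(w_c, x) + bias


# Example: w_a picks x[0] and w_b picks x[1], so the product term is x[0] * x[1].
x = [1.0, 2.0]
y = quadratic_neuron(x, w_a=[1.0, 0.0], w_b=[0.0, 1.0], w_c=[0.5, 0.5])
print(y)  # (1.0 * 2.0) + (0.5 + 1.0) = 3.5
```

Under this assumed form, the neuron needs three weight vectors of length $d$ rather than a $d \times d$ interaction matrix, which is one way a second-order interaction can be obtained without a large intermediate state.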