Modern vision networks are dominated by additive local transformations, whereas explicit multiplicative local interactions remain underexplored. Product units offer a direct approach to modeling such interactions, but their use in deep architectures has been limited by optimization instability. In this work, we propose PURe, a Product-Unit Residual Module for deep vision networks. PURe is built around a 2D Product Unit with a real-valued log-domain formulation that makes multiplicative local aggregation practical within deep residual hierarchies. The resulting module serves as a drop-in replacement for native residual units. We instantiate PURe in residual CNNs for image classification and in 2D residual encoder-decoder networks for slice-based segmentation on volumetric CT data. Across Galaxy10 DECaLS, ImageNet, and CIFAR-10, PURe consistently improves residual CNNs and yields a more favorable accuracy-parameter trade-off, allowing moderately deep models to match or surpass substantially deeper ResNet baselines with much smaller parameter budgets. On the AMOS benchmark, PURe also improves slice-based CT segmentation under 3D case-level evaluation. These results show that explicit multiplicative local interaction is a practical and effective design primitive for deep residual vision networks.
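The abstract does not give the exact PURe formulation, so the following is only a minimal sketch of the general idea behind a log-domain product unit: a weighted product over a local window, prod_i |x_i|^(w_i), computed as exp(sum_i w_i * log(|x_i| + eps)). The function name `product_unit_2d`, the use of absolute values to keep the computation real-valued, the `eps` stabilizer, and the single-channel, no-padding layout are all assumptions made for illustration; they are not taken from the paper.

```python
import math


def product_unit_2d(x, w, eps=1e-6):
    """Hypothetical single-channel 2D product unit (illustrative only).

    x: 2D list of floats (H x W input).
    w: 2D list of floats (k x k exponents, the learnable weights).
    Returns an (H-k+1) x (W-k+1) list where each entry is
    prod |x_patch|^w, evaluated in the log domain.
    """
    k = len(w)
    height, width = len(x), len(x[0])

    # Assumption: take magnitudes and add eps so the logarithm is
    # real-valued and finite even for zero or negative inputs.
    log_x = [[math.log(abs(v) + eps) for v in row] for row in x]

    out = []
    for i in range(height - k + 1):
        row = []
        for j in range(width - k + 1):
            # A weighted sum of logs equals the log of the weighted product.
            s = sum(
                w[a][b] * log_x[i + a][j + b]
                for a in range(k)
                for b in range(k)
            )
            row.append(math.exp(s))
        out.append(row)
    return out


def residual_product_unit(x, w, eps=1e-6):
    """Assumed residual wrapper: crop the input to the output size and add it."""
    y = product_unit_2d(x, w, eps)
    return [
        [x[i][j] + y[i][j] for j in range(len(y[0]))]
        for i in range(len(y))
    ]
```

With all exponents set to 1 the unit reduces to a plain product over the window, and negative exponents express ratios, which is the kind of multiplicative local interaction an additive convolution cannot represent directly. A practical module would presumably operate on multi-channel tensors with padding and normalization; none of that is modeled here.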