Model compression is essential for deploying large Computer Vision models on embedded devices. However, static optimization techniques (e.g., pruning or quantization) neglect the fact that different inputs have different complexities and thus require different amounts of computation. Dynamic Neural Networks make it possible to condition the amount of computation on the specific input. The current literature on the topic is extensive but fragmented. We present a comprehensive survey that synthesizes and unifies existing Dynamic Neural Networks research in the context of Computer Vision. Additionally, we provide a logical taxonomy based on which component of the network is adaptive: the output, the computation graph, or the input. Furthermore, we argue that Dynamic Neural Networks are particularly beneficial in the context of Sensor Fusion, enabling better adaptivity, noise reduction, and information prioritization, and we present preliminary works in this direction. We complement this survey with a curated repository listing all the surveyed papers, each with a brief summary of the solution and a link to the code base when available: https://github.com/DTU-PAS/awesome-dynn-for-cv .
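To make the core idea concrete, the following is a minimal, illustrative sketch of input-conditional computation via early exiting, one common form of Dynamic Neural Network. All function names (`cheap_model`, `expensive_model`, `dynamic_predict`) and the confidence heuristic are hypothetical stand-ins, not taken from any surveyed work: a cheap stage runs first, and the expensive stage is invoked only when the cheap stage's confidence falls below a threshold.

```python
# Illustrative early-exit inference: skip expensive computation on "easy" inputs.
# The models below are hypothetical placeholders for a shallow and a deep network.

def cheap_model(x):
    """Fast, approximate classifier; confidence grows with input magnitude."""
    confidence = min(1.0, abs(x))
    return ("positive" if x > 0 else "negative", confidence)

def expensive_model(x):
    """Slow, accurate classifier (always fully confident in this toy example)."""
    return ("positive" if x > 0 else "negative", 1.0)

def dynamic_predict(x, threshold=0.8):
    """Run the cheap model first; fall back to the expensive one if unsure."""
    label, confidence = cheap_model(x)
    if confidence >= threshold:
        return label, "early-exit"   # easy input: expensive model never runs
    return expensive_model(x)[0], "full"
```

In a real early-exit network the two stages share a backbone and the exit decision is made at intermediate classifiers, but the control flow is the same: the amount of computation depends on the input.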