DyRA: Portable Dynamic Resolution Adjustment Network for Existing Detectors

Achieving consistent accuracy in object detection is challenging because object sizes vary widely. One effective approach is to optimize the input resolution, known as a multi-resolution strategy. Previous approaches to resolution optimization have mostly relied on pre-defined resolutions selected manually, and run-time resolution optimization for existing architectures remains understudied. This paper introduces DyRA, a dynamic resolution adjustment network that provides an image-specific scale factor for existing detectors. The network is co-trained with the detector using two specially designed loss functions, ParetoScaleLoss and BalanceLoss. ParetoScaleLoss determines an adaptive scale factor for robustness, while BalanceLoss optimizes the overall scale factors according to the localization performance of the detector. The loss functions are devised to minimize the accuracy drop across the contrasting scaling objectives of different-sized objects. The proposed network can improve accuracy across various models, including RetinaNet, Faster-RCNN, FCOS, DINO, and H-Deformable-DETR. The code is available at https://github.com/DaEunFullGrace/DyRA.git.
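
To make the idea of an image-specific scale factor concrete, the following is a minimal sketch of how such a factor might be applied around an existing detector: the input is resized by the factor, and the predicted boxes are mapped back to the original coordinates. It is not taken from the DyRA repository. The helper names `scaled_size` and `boxes_to_original`, the box format, and the example value `scale = 1.5` are all assumptions made for illustration; in DyRA the factor would be predicted by the network.

```python
from typing import List, Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


def scaled_size(height: int, width: int, scale: float) -> Tuple[int, int]:
    """Return the (height, width) of an image after applying `scale`."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    # Clamp to at least one pixel per side so tiny factors stay valid.
    return max(1, round(height * scale)), max(1, round(width * scale))


def boxes_to_original(boxes: List[Box], scale: float) -> List[Box]:
    """Map boxes predicted on the resized image back to original coordinates."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    return [tuple(coord / scale for coord in box) for box in boxes]


# Hypothetical scale factor for one image (a stand-in for a predicted value).
scale = 1.5
new_h, new_w = scaled_size(480, 640, scale)

# Boxes as a detector might return them on the resized image (made-up values).
restored = boxes_to_original([(150.0, 300.0, 450.0, 600.0)], scale)
```

Because the factor is applied only to the input and undone on the outputs, a wrapper of this kind leaves the detector itself unchanged, which matches the abstract's framing of DyRA as an add-on for existing detectors.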

