MSVCOD: A Large-Scale Multi-Scene Dataset for Video Camouflage Object Detection

Video Camouflaged Object Detection (VCOD) is a challenging task that aims to identify objects that are seamlessly concealed within the background in videos. The dynamic properties of video enable the detection of camouflaged objects through motion cues or varied perspectives. Previous VCOD datasets primarily contain animal objects, limiting the scope of research to wildlife scenarios. However, the applications of VCOD extend beyond wildlife and have significant implications in the security, art, and medical fields. To address this problem, we construct a new large-scale multi-domain VCOD dataset, MSVCOD. To achieve high-quality annotations, we design a semi-automatic iterative annotation pipeline that reduces costs while maintaining annotation accuracy. MSVCOD is the largest VCOD dataset to date. For the first time, it introduces multiple object categories, including human, animal, medical, and vehicle objects, while also expanding background diversity across various environments. This expanded scope increases the practical applicability of the VCOD task. Alongside this dataset, we introduce a one-stream video camouflaged object detection model that performs both feature extraction and information fusion without additional motion feature fusion modules. Our framework achieves state-of-the-art results on the existing VCOD animal dataset and on the proposed MSVCOD. The dataset and code will be made publicly available.

