LVOS: A Benchmark for Long-term Video Object Segmentation

Existing video object segmentation (VOS) benchmarks focus on short-term videos that last only about 3-5 seconds and in which objects are visible most of the time. These videos are poorly representative of practical applications, and the absence of long-term datasets restricts further investigation of VOS in realistic scenarios. In this paper, we present a new benchmark dataset named LVOS, which consists of 220 videos with a total duration of 421 minutes. To the best of our knowledge, LVOS is the first densely annotated long-term VOS dataset. The videos in LVOS last 1.59 minutes on average, which is 20 times longer than videos in existing VOS datasets. Each video includes various attributes, especially challenges that arise in the wild, such as long-term reappearing and cross-temporal similar objects. Based on LVOS, we assess existing video object segmentation algorithms and propose a Diverse Dynamic Memory network (DDMemory) that consists of three complementary memory banks to exploit temporal information adequately. The experimental results demonstrate the strengths and weaknesses of prior methods, pointing to promising directions for further study. Data and code are available at https://lingyihongfd.github.io/lvos.github.io/.


