We introduce a dataset of annotations of temporal repetitions in videos. The dataset, OVR (pronounced as over), contains annotations for over 72K videos, with each annotation specifying the number of repetitions, their start and end times, and a free-form description of what is repeating. The annotations are provided for videos sourced from Kinetics and Ego4D, and consequently cover both Exo and Ego viewing conditions, with a wide variety of actions and activities. Moreover, OVR is almost an order of magnitude larger than previous datasets for video repetition. We also propose a baseline transformer-based counting model, OVRCounter, that can localise and count repetitions in videos up to 320 frames long. The model is trained and evaluated on the OVR dataset, and its performance is assessed with and without using text to specify the target class to count. The performance is also compared to a prior repetition counting model. The dataset is available for download at: https://sites.google.com/view/openvocabreps/