In this paper, we address the task of text-driven saliency detection in 360-degree videos. To this end, we introduce the TSV360 dataset, which contains 16,000 triplets of ERP frames, textual descriptions of salient objects/events in these frames, and the associated ground-truth saliency maps. Then, we extend and adapt a SOTA visual-based approach for 360-degree video saliency detection and develop the TSalV360 method, which takes into account a user-provided text description of the desired objects and/or events. This method leverages a SOTA vision-language model for data representation and integrates a similarity estimation module and a viewport spatio-temporal cross-attention mechanism to discover dependencies between the different data modalities. Quantitative and qualitative evaluations on the TSV360 dataset showed that TSalV360 is competitive with a SOTA visual-based approach and demonstrated its ability to perform customized text-driven saliency detection in 360-degree videos.
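To give a concrete sense of the similarity-estimation idea described above, the following is a minimal sketch, not the paper's actual module: it assumes a vision-language model has already produced one text embedding and one visual embedding per viewport, and scores each viewport by cosine similarity with the text query. The function name `cosine_similarity_map` and the toy embeddings are illustrative placeholders.

```python
import numpy as np

def cosine_similarity_map(text_emb, viewport_embs):
    """Score each viewport embedding against a single text embedding
    via cosine similarity (both inputs assumed to come from a VLM)."""
    t = text_emb / np.linalg.norm(text_emb)
    v = viewport_embs / np.linalg.norm(viewport_embs, axis=-1, keepdims=True)
    return v @ t  # one similarity score per viewport

# Toy example: 3 viewports with 4-dim embeddings (placeholders for VLM outputs).
text = np.array([1.0, 0.0, 0.0, 0.0])
viewports = np.array([
    [1.0, 0.0, 0.0, 0.0],  # aligned with the text query
    [0.0, 1.0, 0.0, 0.0],  # orthogonal to the text query
    [0.5, 0.5, 0.0, 0.0],  # partially aligned
])
sims = cosine_similarity_map(text, viewports)
```

In a full pipeline, such per-viewport scores would serve as a text-conditioned prior that downstream components (e.g. a cross-attention mechanism) refine into a saliency map.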