In Composed Video Retrieval (CoVR), the model receives a video and a textual description that modifies the video content as inputs. The aim is to retrieve, from a database of videos, the relevant video reflecting the modified content. A first step towards this challenging task is to acquire large-scale training datasets and high-quality evaluation benchmarks. In this work, we introduce EgoCVR, a new evaluation benchmark for fine-grained Composed Video Retrieval built from large-scale egocentric video datasets. EgoCVR consists of 2,295 queries that specifically focus on high-quality temporal video understanding. We find that existing Composed Video Retrieval frameworks do not achieve the high-quality temporal video understanding this task requires. To address this shortcoming, we adapt a simple training-free method, propose a generic re-ranking framework for Composed Video Retrieval, and demonstrate that this combination achieves strong results on EgoCVR. Our code and benchmark are freely available at https://github.com/ExplainableML/EgoCVR.
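To make the task setup concrete, the retrieve-then-rerank pipeline can be sketched in embedding space: compose the query video and modification text into a single query vector, shortlist gallery videos by similarity, then optionally re-order the shortlist with a second scoring function. This is a minimal illustration only; the fusion by vector addition, the function names, and the generic re-ranking hook are assumptions for exposition, not the paper's actual models.

```python
import numpy as np

def cosine_sim(query, gallery):
    # Cosine similarity between one query vector and each gallery row.
    q = query / np.linalg.norm(query)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    return g @ q

def compose_query(video_emb, text_emb):
    # Hypothetical fusion: sum the video and text embeddings.
    # Real CoVR systems typically learn this composition.
    q = video_emb + text_emb
    return q / np.linalg.norm(q)

def retrieve_then_rerank(video_emb, text_emb, gallery,
                         rerank_scores=None, top_k=5):
    # Stage 1: shortlist candidates with the composed query.
    q = compose_query(video_emb, text_emb)
    sims = cosine_sim(q, gallery)
    candidates = np.argsort(-sims)[:top_k]
    # Stage 2: optionally re-order the shortlist with a second,
    # e.g. temporally aware, scoring function (placeholder here).
    if rerank_scores is not None:
        candidates = candidates[np.argsort(-rerank_scores[candidates])]
    return candidates
```

The two-stage split keeps the expensive, fine-grained scorer confined to a small shortlist, which is what makes a generic re-ranking stage practical on top of any first-stage retriever.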