The TREC Video Retrieval Evaluation (TRECVID) is a TREC-style video analysis and retrieval evaluation whose goal is to promote progress in research and development of content-based exploitation and retrieval of information from digital video via open, task-based evaluation supported by metrology. Over the last twenty-one years, this effort has yielded a better understanding of how systems can effectively accomplish such processing and how their performance can be reliably benchmarked. TRECVID has been funded by the National Institute of Standards and Technology (NIST) and other US government agencies. In addition, many organizations and individuals worldwide contribute significant time and effort. TRECVID 2022 planned the following six tasks: ad-hoc video search, video-to-text captioning, disaster scene description and indexing, activity in extended videos, deep video understanding, and movie summarization. In total, 35 teams from research organizations worldwide signed up to join the evaluation campaign this year. This paper introduces the tasks, the datasets used, the evaluation frameworks and metrics, and a high-level overview of the results.