Over the years, many subjective and objective quality assessment datasets have been created and made available to the research community. However, there is no standard process for documenting key aspects of a dataset, such as details about the source sequences, the number of test subjects, the test methodology, and the encoding settings. Such information is often of great importance to users of a dataset, as it helps them quickly understand its motivation and scope. Without a template, each reader must collate the information from the relevant publication or website, which is tedious and time-consuming. In some cases, the absence of a template to guide the documentation process can also result in the unintentional omission of important information. This paper addresses this simple but significant gap by proposing a datasheet template for documenting various aspects of subjective and objective quality assessment datasets for multimedia data. The contributions presented in this work aim to simplify the documentation process for existing and new datasets and to improve their reproducibility. The proposed datasheet template is available on GitHub, along with sample datasheets for a few open-source audiovisual subjective and objective datasets.