Automated evaluation of cognitive status using multimedia technologies is a promising frontier in early dementia diagnosis. However, the development of robust machine learning models for cognitive impairment detection is frequently hindered by the scarcity of large-scale, strictly synchronized, and clinically validated multimodal datasets. To bridge this gap, we introduce the CogPic database, a multimodal benchmark designed for fine-grained cognitive impairment detection. The dataset comprises strictly synchronized audio, visual, and linguistic data collected from 574 participants during a naturalistic picture description task. To establish reliable diagnostic ground truth, expert clinical neuropsychologists evaluated each participant and stratified participants into distinct cognitive groups by clinical consensus. CogPic is thus the largest, most modality-rich, and most thoroughly evaluated dataset of its kind to date. Through extensive benchmark experiments on CogPic, we establish a robust, unbiased, and clinically generalizable foundation for future multimedia research in automated cognitive health assessment. Detailed information and access application procedures for the CogPic database are available at https://cogpic.github.io/.