DanmakuTPPBench: A Multi-modal Benchmark for Temporal Point Process Modeling and Understanding

We introduce DanmakuTPPBench, a comprehensive benchmark designed to advance multi-modal Temporal Point Process (TPP) modeling in the era of Large Language Models (LLMs). While TPPs have been widely studied for modeling temporal event sequences, existing datasets are predominantly unimodal, hindering progress in models that require joint reasoning over temporal, textual, and visual information. To address this gap, DanmakuTPPBench comprises two complementary components: (1) DanmakuTPP-Events, a novel dataset derived from the Bilibili video platform, where user-generated bullet comments (Danmaku) naturally form multi-modal events annotated with precise timestamps, rich textual content, and corresponding video frames; (2) DanmakuTPP-QA, a challenging question-answering dataset constructed via a novel multi-agent pipeline powered by state-of-the-art LLMs and multi-modal LLMs (MLLMs), targeting complex temporal-textual-visual reasoning. We conduct extensive evaluations using both classical TPP models and recent MLLMs, revealing significant performance gaps and limitations in current methods' ability to model multi-modal event dynamics. Our benchmark establishes strong baselines and calls for further integration of TPP modeling into the multi-modal language modeling landscape. Project page: https://github.com/FRENKIE-CHIANG/DanmakuTPPBench
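To make the notion of a multi-modal event concrete, the sketch below shows one way such an event (timestamp, comment text, associated video frame) might be represented, along with the inter-event times that TPP models typically consume. The `DanmakuEvent` class, its field names, and the frame paths are hypothetical and chosen only for illustration; they are not the schema of DanmakuTPP-Events.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DanmakuEvent:
    """One multi-modal event (hypothetical schema, not the dataset's actual format)."""
    timestamp: float            # seconds from the start of the video
    text: str                   # the bullet-comment text
    frame_path: Optional[str]   # assumed path to the frame nearest the timestamp


def inter_event_times(events: List[DanmakuEvent]) -> List[float]:
    """Return the gaps between consecutive events after sorting by timestamp."""
    ordered = sorted(events, key=lambda e: e.timestamp)
    return [b.timestamp - a.timestamp for a, b in zip(ordered, ordered[1:])]


# Made-up example events, for illustration only.
events = [
    DanmakuEvent(12.5, "first comment", "frames/000012.jpg"),
    DanmakuEvent(3.0, "opening scene", "frames/000003.jpg"),
    DanmakuEvent(14.0, "nice shot", None),
]
print(inter_event_times(events))  # [9.5, 1.5]
```

A unimodal TPP dataset would keep only the timestamps (and perhaps a categorical mark); the text and frame fields are what make joint temporal-textual-visual reasoning possible.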

