M3Act: Learning from Synthetic Human Group Activities

The study of complex human interactions and group activities has become a focal point in human-centric computer vision. However, progress in related tasks is often hindered by the difficulty of obtaining large-scale labeled datasets from real-world scenarios. To address this limitation, we introduce M3Act, a synthetic data generator for multi-view, multi-group, multi-person human atomic actions and group activities. Powered by the Unity Engine, M3Act features multiple semantic groups, highly diverse and photorealistic images, and a comprehensive set of annotations, which facilitates the learning of human-centered tasks across single-person, multi-person, and multi-group conditions. We demonstrate the advantages of M3Act across three core experiments. The results suggest that our synthetic dataset can significantly improve the performance of several downstream methods and can replace real-world datasets to reduce cost. Notably, M3Act improves the state-of-the-art MOTRv2 on the DanceTrack dataset, moving it from 10th to 2nd place on the leaderboard. Moreover, M3Act opens a new research direction: controllable 3D group activity generation. We define multiple metrics and propose a competitive baseline for this novel task. Our code and data are available at our project page: http://cjerry1243.github.io/M3Act.

