Co-manipulation requires multiple humans to synchronize their motions with a shared object while keeping interactions plausible, poses natural, and states stable. However, most existing motion generation approaches are designed for single-character scenarios or fail to account for payload-induced dynamics. In this work, we propose a flow-matching framework that aligns the generated co-manipulation motions with the intended goals while maintaining naturalness and effectiveness. Specifically, we first introduce a generative model that derives explicit manipulation strategies from the object's affordance and spatial configuration; these strategies guide the motion flow toward successful manipulation. To improve motion quality, we then design an adversarial interaction prior that promotes natural individual poses and realistic inter-person interactions during co-manipulation. In addition, we incorporate a stability-driven simulation into the flow-matching process, which refines unstable interaction states through sampling-based optimization and directly adjusts the vector field regression to promote more effective manipulation. Experimental results demonstrate that our method achieves higher contact accuracy, lower penetration, and better distributional fidelity than state-of-the-art human-object interaction baselines. The code is available at https://github.com/boycehbz/StaCOM.
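For readers unfamiliar with flow matching, the following is a minimal sketch of the generic technique, assuming a straight-line probability path between a noise sample and a data sample. It is not the released StaCOM implementation: the function names, the array shapes, and the oracle vector field are hypothetical stand-ins, and the manipulation strategies, adversarial prior, and stability-driven refinement described above are omitted.

```python
import numpy as np


def interpolate(x0, x1, t):
    """Point on the straight-line path from noise x0 to data x1 at time t in [0, 1]."""
    return (1.0 - t) * x0 + t * x1


def target_velocity(x0, x1):
    """Velocity of the straight-line path; it is constant in t."""
    return x1 - x0


def flow_matching_loss(v_pred, x0, x1):
    """Mean squared error between a predicted velocity and the path velocity."""
    return float(np.mean((v_pred - target_velocity(x0, x1)) ** 2))


def euler_sample(v_fn, x0, steps=10):
    """Integrate dx/dt = v_fn(x, t) from t = 0 to t = 1 with fixed-step Euler."""
    x = np.array(x0, dtype=float)
    dt = 1.0 / steps
    for i in range(steps):
        x = x + dt * v_fn(x, i * dt)
    return x


rng = np.random.default_rng(0)
x1 = rng.normal(size=(4, 6))  # stand-in for a motion sample (hypothetical shape)
x0 = rng.normal(size=(4, 6))  # Gaussian noise sample


def oracle(x, t):
    # Assumption: a perfect vector field for this single (x0, x1) pair.
    # In practice a trained network would take this role.
    return target_velocity(x0, x1)


x_end = euler_sample(oracle, x0, steps=10)
```

In a trained model, a network conditioned on the object and on time would replace `oracle`, and `flow_matching_loss` would be evaluated at points returned by `interpolate` for randomly drawn `t`.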