Generating semantically aligned human motion from textual descriptions has made rapid progress, but ensuring both semantic and physical realism in the generated motion remains a challenge. In this paper, we introduce the Distortion-aware Motion Calibrator (DMC), a post-hoc module that refines physically implausible motions (e.g., foot floating) while preserving semantic consistency with the original textual description. Rather than relying on complex physical modeling, we propose a self-supervised, data-driven approach in which DMC learns to recover a physically plausible motion when given an intentionally distorted motion and the original textual description as inputs. We evaluate DMC as a post-hoc module applied to motions obtained from various text-to-motion generation models and demonstrate that it improves physical plausibility while also enhancing semantic consistency. The experimental results show that DMC reduces the FID score by 42.74% on T2M and 13.20% on T2M-GPT, while also achieving the highest R-Precision. When applied to high-quality models like MoMask, DMC improves the physical plausibility of motions by reducing penetration by 33.0% and by bringing floating artifacts closer to the ground-truth reference. These results highlight that DMC can serve as a promising post-hoc motion refinement framework for any text-to-motion model by incorporating both textual semantics and physical plausibility.
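The self-supervised setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian-noise distortion, the `(frames, joints, 3)` array layout, the y-up axis, the flat floor at height 0, and the function names `distort_motion` and `penetration_and_floating` are all assumptions made for illustration. The first function builds the distorted input of a (distorted, clean) training pair; the second shows one simple way that penetration and floating might be measured from foot-joint heights.

```python
import numpy as np


def distort_motion(motion, noise_std=0.05, rng=None):
    """Return a noisy copy of a motion array of shape (frames, joints, 3).

    Assumption: distortion is additive Gaussian noise on joint positions.
    The paper's actual distortion scheme may differ.
    """
    rng = np.random.default_rng() if rng is None else rng
    return motion + rng.normal(0.0, noise_std, size=motion.shape)


def penetration_and_floating(motion, foot_joints, floor_height=0.0, up_axis=1):
    """Measure mean floor penetration and mean floating height.

    Assumption: a flat floor at `floor_height` and a hypothetical list of
    foot joint indices. For each frame, the lowest foot joint is compared
    with the floor: depth below it counts as penetration, height above it
    counts as floating.
    """
    foot_heights = motion[:, foot_joints, up_axis] - floor_height
    lowest = foot_heights.min(axis=1)               # lowest foot joint per frame
    penetration = np.clip(-lowest, 0.0, None).mean()
    floating = np.clip(lowest, 0.0, None).mean()
    return penetration, floating


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = np.zeros((8, 22, 3))                    # toy motion: every joint on the floor
    distorted = distort_motion(clean, noise_std=0.05, rng=rng)
    # (distorted, clean) would form one self-supervised training pair,
    # with the text description supplied as an additional condition.
    print(penetration_and_floating(distorted, foot_joints=[10, 11]))
```

A calibrator trained on such pairs would take `distorted` together with the text as input and be supervised to output `clean`; at inference time, the output of a text-to-motion model would take the place of `distorted`.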