Vector-quantized motion tokenizers provide a compact discrete interface for text-to-motion generation, but most motion-code priors treat code indices as unordered categorical labels. This view overlooks a key property of motion codes: they are decoder-bound prototypes of physical movement, and their learned codebooks can carry meaningful local kinematic geometry. We verify this property through codebook diagnostics. Distances between learned PartVQ group-specific codes align with local motion-prototype distances, shuffled controls remove this alignment, and replacing codes with progressively farther neighbors induces monotonically larger decoded motion changes. These results show that motion codebooks exhibit measurable, non-random, and decoder-causal geometry. Based on this observation, we propose \textbf{MoGeFlow}, a text-to-motion model that generates through motion codebook geometry. MoGeFlow represents each motion-code frame as a structured set of PartVQ group-specific code embeddings, learns a text-conditioned continuous flow over these frame states, and projects terminal states back to valid motion codes for frozen decoding. This preserves the compactness and validity of discrete tokenization while replacing categorical code prediction with geometry-aware codebook-space generation. Experiments set a new state of the art in R-Precision on HumanML3D and KIT-ML, achieve the best HumanML3D MultiModal Distance and KIT-ML FID among generative methods, and obtain the best MotionMillion R@1, R@2, R@3, and FID under the benchmark protocol.
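
The final step, projecting terminal flow states back to valid motion codes, can be illustrated with a minimal sketch. The abstract does not specify the projection rule, so the sketch assumes a per-group Euclidean nearest-neighbor lookup; the function name `project_to_codes`, the array shapes, and the toy codebooks are hypothetical and are not taken from the MoGeFlow implementation.

```python
import numpy as np


def project_to_codes(states, codebooks):
    """Snap continuous per-group states to their nearest codebook entries.

    Assumption: nearest neighbor under squared Euclidean distance.

    states:    array of shape (T, G, D), one D-dim vector per frame and group
    codebooks: array of shape (G, K, D), K codes for each of the G groups
    returns:   (indices, quantized) with shapes (T, G) and (T, G, D)
    """
    # Distance from every state to every code of its own group.
    diff = states[:, :, None, :] - codebooks[None, :, :, :]  # (T, G, K, D)
    dist = np.sum(diff ** 2, axis=-1)                         # (T, G, K)
    indices = np.argmin(dist, axis=-1)                        # (T, G)
    # Gather the selected code vectors group by group.
    group_ids = np.arange(codebooks.shape[0])[None, :]        # (1, G)
    quantized = codebooks[group_ids, indices]                 # (T, G, D)
    return indices, quantized


# Toy data: 4 groups, 16 codes per group, 8-dim embeddings, 5 frames.
rng = np.random.default_rng(0)
codebooks = rng.normal(size=(4, 16, 8))
true_idx = rng.integers(0, 16, size=(5, 4))
# Stand-in for terminal flow states: true codes plus small noise.
clean = codebooks[np.arange(4)[None, :], true_idx]
states = clean + 0.01 * rng.normal(size=clean.shape)

idx, quantized = project_to_codes(states, codebooks)
```

Because each group is looked up only in its own codebook, every projected frame is a valid tuple of code indices, which a frozen decoder could consume directly.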