DisCoRD: Discrete Tokens to Continuous Motion via Rectified Flow Decoding

Human motion, inherently continuous and dynamic, presents significant challenges for generative models. Despite their dominance, discrete quantization methods such as VQ-VAEs suffer from inherent limitations, including restricted expressiveness and frame-wise noise artifacts. Continuous approaches, while producing smoother and more natural motions, often falter due to high-dimensional complexity and limited training data. To resolve this "discord" between discrete and continuous representations, we introduce DisCoRD: Discrete Tokens to Continuous Motion via Rectified Flow Decoding, a novel method that decodes discrete motion tokens into continuous motion through rectified flow. By employing an iterative refinement process in the continuous space, DisCoRD captures fine-grained dynamics and ensures smoother and more natural motions. Compatible with any discrete-based framework, our method enhances naturalness without compromising faithfulness to the conditioning signals. Extensive evaluations demonstrate that DisCoRD achieves state-of-the-art performance, with an FID of 0.032 on HumanML3D and 0.169 on KIT-ML. These results solidify DisCoRD as a robust solution for bridging the divide between discrete efficiency and continuous realism. Our project page is available at: https://whwjdqls.github.io/discord.github.io/.
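
The decoding idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it only shows the generic rectified-flow recipe (a straight-line path from noise to data, a velocity regression target, and Euler integration at sampling time) conditioned on per-frame features looked up from discrete tokens. The names `decode_tokens`, `toy_velocity`, and `codebook` are hypothetical, and `toy_velocity` is an analytic stand-in for a trained network, assuming the target motion equals the conditioning features.

```python
import numpy as np

def interpolate(x0, x1, t):
    """Rectified-flow path: straight line from noise x0 (t=0) to data x1 (t=1)."""
    return (1.0 - t) * x0 + t * x1

def velocity_target(x0, x1):
    """Regression target for the velocity field along the straight path."""
    return x1 - x0

def toy_velocity(x, t, cond):
    """Hypothetical stand-in for a trained velocity network.

    Assumes the target motion equals the conditioning features, in which
    case the exact straight-path velocity is (cond - x) / (1 - t).
    """
    return (cond - x) / (1.0 - t)

def decode_tokens(tokens, codebook, velocity_fn, num_steps=16, seed=0):
    """Euler-integrate dx/dt = v(x, t, cond) from Gaussian noise to a sample."""
    rng = np.random.default_rng(seed)
    cond = codebook[tokens]              # (T, D) per-frame conditioning features
    x = rng.standard_normal(cond.shape)  # start from noise in continuous space
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt                       # t stays below 1, so 1 - t is never zero
        x = x + dt * velocity_fn(x, t, cond)  # one iterative refinement step
    return x

# Toy usage: 4 codebook entries of dimension 3, a sequence of 5 tokens.
codebook = np.arange(12, dtype=float).reshape(4, 3)
tokens = np.array([0, 2, 2, 1, 3])
motion = decode_tokens(tokens, codebook, toy_velocity)
```

In a real system the velocity function would be a learned network trained to regress `velocity_target` at points produced by `interpolate`, and the number of Euler steps would trade decoding speed against sample quality.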

