Diffusion-based generative models have recently emerged as powerful solutions for high-quality synthesis in multiple domains. Leveraging bidirectional Markov chains, diffusion probabilistic models generate samples by inferring the reverse Markov chain from the distribution mapping learned in the forward diffusion process. In this work, we propose Modiff, a conditional paradigm that builds on the denoising diffusion probabilistic model (DDPM) to tackle realistic and diverse action-conditioned 3D skeleton-based motion generation. This work is a pioneering attempt to use DDPM to synthesize a variable number of motion sequences conditioned on a categorical action. We evaluate our approach on the large-scale NTU RGB+D dataset and show improvements over state-of-the-art motion generation methods.
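As a rough illustration of the forward diffusion process mentioned above, the sketch below draws a noised sample x_t from the standard DDPM closed form q(x_t | x_0) = N(sqrt(alpha_bar_t) x_0, (1 - alpha_bar_t) I). It is not Modiff's implementation: the linear noise schedule, the 1000 steps, the clip size of 60 frames by 25 joints by 3 coordinates, and all function names are assumptions made for illustration, and the action conditioning and the learned reverse chain are omitted.

```python
import math
import random


def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (an assumed schedule, not Modiff's)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alpha_bar(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bar, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar


def forward_diffuse(x0, t, alpha_bar, rng):
    """Sample x_t ~ q(x_t | x_0) in one shot; also return the noise used."""
    signal = math.sqrt(alpha_bar[t])
    sigma = math.sqrt(1.0 - alpha_bar[t])
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [signal * v + sigma * e for v, e in zip(x0, noise)]
    return x_t, noise


betas = linear_beta_schedule()
alpha_bar = cumulative_alpha_bar(betas)

rng = random.Random(0)
# Hypothetical flattened skeleton clip: 60 frames x 25 joints x 3 coordinates.
x0 = [rng.uniform(-1.0, 1.0) for _ in range(60 * 25 * 3)]
x_t, noise = forward_diffuse(x0, t=len(betas) - 1, alpha_bar=alpha_bar, rng=rng)
```

At the last step alpha_bar is close to zero, so x_t is almost pure Gaussian noise. A denoising network trained to predict `noise` from `x_t`, the step index, and an action label would then be run backwards through the chain to generate motion.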