Many density estimation techniques for 3D human motion prediction require substantial inference time, often exceeding the duration of the predicted time horizon. To address this need for faster density estimation, we introduce a novel flow-based method for human motion prediction called CacheFlow. Unlike previous conditional generative models that suffer from poor time efficiency, CacheFlow leverages an unconditional flow-based generative model that transforms a Gaussian mixture into the density of future motions. The computation of this flow-based generative model can be performed once, and its results precomputed and cached. For conditional prediction, we then seek a mapping from historical trajectories to samples in the Gaussian mixture. This mapping is handled by a much more lightweight model, saving significant computational overhead compared to a typical conditional flow model. Through this two-stage design, caching the results of the slow flow computation, we build CacheFlow without loss of prediction accuracy or model expressiveness. Inference completes in approximately one millisecond, making it 4 times faster than previous VAE-based methods and 30 times faster than previous diffusion-based methods on standard benchmarks such as the Human3.6M and AMASS datasets. Furthermore, our method demonstrates improved density estimation accuracy and prediction accuracy comparable to a state-of-the-art method on Human3.6M. Our code and models are available at https://github.com/meaten/CacheFlow.
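The two-stage idea above can be illustrated with a minimal NumPy sketch. All names, shapes, and the placeholder flow below are hypothetical (the actual CacheFlow model is a trained normalizing flow and a learned conditioning network); the sketch only shows the caching pattern: the expensive unconditional transform runs once offline, while online prediction merely re-weights the cached samples.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Offline stage (illustrative shapes): draw samples from a Gaussian
# mixture and push each through a placeholder "expensive" unconditional flow.
K, D = 8, 48                      # mixture components, future-motion dim (assumed)
means = rng.normal(size=(K, D))   # mixture means (assumed fixed here)
N = 1024                          # cached sample budget
comp = rng.integers(0, K, size=N)
z = means[comp] + 0.1 * rng.normal(size=(N, D))  # mixture samples

def unconditional_flow(z):
    # Placeholder for the expensive flow; a real model would be a trained
    # normalizing flow mapping latent samples to future-motion space.
    return np.tanh(z)

cache = unconditional_flow(z)     # computed once, reused for every query

# --- Online stage: a lightweight conditional model maps the observed history
# to weights over mixture components, re-weighting the cached samples. Here a
# toy linear scoring stands in for the learned lightweight network.
def predict(history_feat):
    logits = means @ history_feat
    w = np.exp(logits - logits.max())
    w /= w.sum()                              # component weights
    sample_w = w[comp] / np.bincount(comp, minlength=K)[comp]
    sample_w /= sample_w.sum()                # per-cached-sample weights
    mean_pred = sample_w @ cache              # weighted mean future motion
    return mean_pred, sample_w

mean_pred, sample_w = predict(rng.normal(size=D))
print(mean_pred.shape)  # (48,)
```

Because the flow is never evaluated at query time, the online cost is dominated by the lightweight conditioning step, which is what yields the millisecond-scale inference reported in the abstract.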