Compressing 3D dynamic point clouds (DPCs) relies on mining their temporal context, which is challenging because DPCs are sparse and non-uniformly structured. Existing methods are limited in capturing sufficient temporal dependencies. This paper therefore proposes a learning-based DPC compression framework that uses a hierarchical block-matching-based inter-prediction module to compensate and compress the DPC geometry in latent space. Specifically, we propose a hierarchical motion estimation and motion compensation (Hie-ME/MC) framework for flexible inter-prediction, which dynamically selects the granularity of the optical flow to capture motion information accurately. To improve the motion estimation efficiency of the proposed inter-prediction module, we further design a KNN-attention block matching (KABM) network that weights potential corresponding points according to their geometry and feature correlation. Finally, we compress the residual and the multi-scale optical flow with a fully-factorized deep entropy model. Experimental results on the MPEG-specified Owlii Dynamic Human Dynamic Point Cloud (Owlii) dataset show that our framework outperforms previous state-of-the-art methods and the MPEG standard V-PCC v18 in inter-frame low-delay mode.
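The idea behind KNN-attention block matching can be illustrated with a small, non-learned sketch. It is not the KABM network itself: the function name `knn_attention_flow`, the parameters `k` and `geo_scale`, the scaled dot-product correlation, and the convention that flow points from the current point toward the reference frame are all assumptions made for illustration. For each point in the current frame, the sketch gathers its k nearest candidates in the reference frame, scores them by feature correlation minus a geometric distance penalty, and takes the softmax-weighted average offset as a soft motion vector.

```python
import numpy as np

def knn_attention_flow(cur_xyz, cur_feat, ref_xyz, ref_feat, k=4, geo_scale=1.0):
    """Soft motion vectors from k nearest reference candidates (illustrative only).

    cur_xyz: (N, 3), cur_feat: (N, C), ref_xyz: (M, 3), ref_feat: (M, C).
    Returns flow (N, 3), attention weights (N, k), candidate indices (N, k).
    """
    # Pairwise squared distances between current and reference points: (N, M).
    diff = cur_xyz[:, None, :] - ref_xyz[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)

    # Indices and squared distances of the k nearest reference points: (N, k).
    knn_idx = np.argsort(d2, axis=1)[:, :k]
    knn_d2 = np.take_along_axis(d2, knn_idx, axis=1)

    # Feature correlation as a scaled dot product (an assumed choice).
    nbr_feat = ref_feat[knn_idx]  # (N, k, C)
    corr = np.einsum("nc,nkc->nk", cur_feat, nbr_feat) / np.sqrt(cur_feat.shape[1])

    # Combine feature correlation with a geometric penalty, then softmax over k.
    logits = corr - geo_scale * knn_d2
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)

    # Weighted average of offsets from each current point to its candidates.
    offsets = ref_xyz[knn_idx] - cur_xyz[:, None, :]  # (N, k, 3)
    flow = np.einsum("nk,nkc->nc", weights, offsets)
    return flow, weights, knn_idx
```

In a learned version, the features would come from an encoder and the scoring would use trained projections. The brute-force distance matrix here costs O(NM) memory, so a spatial index would be needed for point clouds of realistic size.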