Compressing 3D dynamic point clouds (DPCs) relies on mining their temporal context, which is challenging because DPCs are sparse and non-uniformly structured. Existing methods are limited in their ability to capture sufficient temporal dependencies. This paper therefore proposes a learning-based DPC compression framework with a hierarchical block-matching-based inter-prediction module that compensates and compresses the DPC geometry in latent space. Specifically, we propose a hierarchical motion estimation and motion compensation (Hie-ME/MC) framework for flexible inter-prediction, which dynamically selects the granularity of the optical flow to capture the motion information accurately. To improve the motion estimation efficiency of the proposed inter-prediction module, we further design a KNN-attention block matching (KABM) network that determines the impact of potential corresponding points based on geometry and feature correlation. Finally, we compress the residual and the multi-scale optical flow with a fully-factorized deep entropy model. Experimental results on the MPEG-specified Owlii Dynamic Human Dynamic Point Cloud (Owlii) dataset show that our framework outperforms previous state-of-the-art methods and the MPEG standard V-PCC v18 in inter-frame low-delay mode.
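To make the idea of weighting candidate correspondences by geometry and feature correlation concrete, the following is a minimal NumPy sketch, not the paper's KABM network. It assumes each frame is given as point coordinates plus per-point feature vectors, and it replaces any learned layers with a fixed score: scaled feature dot product minus squared distance. The function name `knn_attention_match`, its parameters, and the scoring rule are all hypothetical choices made for illustration.

```python
import numpy as np


def knn_attention_match(cur_xyz, cur_feat, ref_xyz, ref_feat, k=3):
    """Illustrative KNN-attention matching (assumed scoring, no learned weights).

    cur_xyz: (N, 3) current-frame coordinates, cur_feat: (N, C) features.
    ref_xyz: (M, 3) reference-frame coordinates, ref_feat: (M, C) features.
    Returns attention weights (N, k), a soft flow estimate (N, 3),
    and attention-pooled reference features (N, C).
    """
    # Brute-force squared distances between every current and reference point.
    diff = cur_xyz[:, None, :] - ref_xyz[None, :, :]
    dist2 = np.sum(diff ** 2, axis=-1)                      # (N, M)

    # Indices of the k nearest reference points for each current point.
    idx = np.argsort(dist2, axis=1)[:, :k]                  # (N, k)
    nbr_xyz = ref_xyz[idx]                                  # (N, k, 3)
    nbr_feat = ref_feat[idx]                                # (N, k, C)
    nbr_dist2 = np.take_along_axis(dist2, idx, axis=1)      # (N, k)

    # Assumed score: feature correlation (scaled dot product) minus a
    # geometric penalty, so close and similar candidates get more weight.
    corr = np.einsum("nc,nkc->nk", cur_feat, nbr_feat)
    scores = corr / np.sqrt(cur_feat.shape[1]) - nbr_dist2

    # Numerically stable softmax over the k candidates.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)

    # Soft flow: weighted average offset to the candidate points.
    offsets = nbr_xyz - cur_xyz[:, None, :]
    flow = np.einsum("nk,nkd->nd", weights, offsets)

    # Compensated features: weighted average of the candidate features.
    pred_feat = np.einsum("nk,nkc->nc", weights, nbr_feat)
    return weights, flow, pred_feat


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ref_xyz = rng.normal(size=(50, 3))
    ref_feat = rng.normal(size=(50, 8))
    cur_xyz = ref_xyz[:40] + 0.01          # toy motion between frames
    cur_feat = ref_feat[:40]
    w, flow, pred = knn_attention_match(cur_xyz, cur_feat, ref_xyz, ref_feat)
    print(w.shape, flow.shape, pred.shape)
```

The brute-force distance matrix keeps the sketch short but costs O(NM) memory; a spatial index would be the usual substitute for real point clouds. A hierarchical variant in the spirit of the abstract would presumably run such a matching step at several scales, which this sketch does not attempt.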