Reliable 3D mesh saliency ground truth (GT) is essential for human-centric visual modeling in virtual reality (VR). However, current methods for acquiring 3D mesh saliency GT are largely inherited from 2D image pipelines, ignoring the fundamental differences between 3D geometric topology and the 2D image array. In particular, existing VR eye-tracking pipelines rely on single-ray sampling and Euclidean smoothing, which trigger spurious texture-driven attention responses and signal leakage across surface gaps. This paper proposes a robust framework to address these limitations. We first introduce a view cone sampling (VCS) strategy that simulates the human foveal receptive field with Gaussian-distributed ray bundles, improving sampling robustness on complex topologies. We then develop a hybrid Manifold-Euclidean constrained diffusion (HCD) algorithm that fuses manifold geodesic constraints with Euclidean kernel scales to ensure topologically consistent saliency propagation. By mitigating "topological short-circuits" and aliasing, our framework provides a high-fidelity 3D attention acquisition paradigm aligned with natural human perception, offering a more accurate and robust baseline for 3D mesh saliency research.
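To make the two components concrete, below is a minimal sketch of one plausible reading of VCS (a Gaussian-distributed ray bundle around the gaze direction) and HCD (Gaussian smoothing at a Euclidean scale, with distances measured geodesically along mesh edges so saliency cannot leak across gaps). The mesh is assumed to be given as NumPy arrays `V` (vertices) and `F` (triangle faces); all function names, parameter defaults (`sigma_deg`, `sigma_e`), and the NumPy/SciPy implementation are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import dijkstra

def view_cone_rays(origin, gaze_dir, n_rays=32, sigma_deg=1.0, rng=None):
    """VCS sketch: jitter the central gaze ray into a Gaussian-distributed
    bundle approximating the foveal receptive field (angular std ~1 deg)."""
    rng = np.random.default_rng() if rng is None else rng
    d = np.asarray(gaze_dir, float)
    d /= np.linalg.norm(d)
    # Orthonormal basis (u, v) spanning the plane perpendicular to the gaze.
    a = np.array([1.0, 0.0, 0.0]) if abs(d[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    u = np.cross(d, a)
    u /= np.linalg.norm(u)
    v = np.cross(d, u)
    # Small-angle Gaussian offsets (radians) around the central direction.
    ang = rng.normal(0.0, np.deg2rad(sigma_deg), size=(n_rays, 2))
    dirs = d + ang[:, :1] * u + ang[:, 1:] * v
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    # Weight each ray by its Gaussian angular deviation, so hits near the
    # cone center contribute most to the accumulated saliency.
    w = np.exp(-0.5 * (ang ** 2).sum(axis=1) / np.deg2rad(sigma_deg) ** 2)
    return np.tile(origin, (n_rays, 1)), dirs, w / w.sum()

def hybrid_constrained_diffusion(V, F, saliency, sigma_e=0.05):
    """HCD sketch: smooth per-vertex saliency with a Gaussian kernel whose
    scale sigma_e is Euclidean, but whose distances are measured geodesically
    along mesh edges, so saliency cannot leak across disconnected gaps."""
    # Undirected edge graph weighted by Euclidean edge length
    # (a standard graph approximation of geodesic distance).
    e = np.vstack([F[:, [0, 1]], F[:, [1, 2]], F[:, [2, 0]]])
    w = np.linalg.norm(V[e[:, 0]] - V[e[:, 1]], axis=1)
    n = len(V)
    g = coo_matrix((np.concatenate([w, w]),
                    (np.concatenate([e[:, 0], e[:, 1]]),
                     np.concatenate([e[:, 1], e[:, 0]]))), shape=(n, n)).tocsr()
    # Dense all-pairs geodesic distances, truncated at 3 sigma for speed;
    # unreachable vertices (across a gap) come back as +inf. Fine for small
    # meshes; a truncated per-fixation solve would be used at scale.
    D = dijkstra(g, directed=False, limit=3.0 * sigma_e)
    K = np.exp(-0.5 * (D / sigma_e) ** 2)
    K[~np.isfinite(D)] = 0.0  # no propagation across topological gaps
    K /= K.sum(axis=1, keepdims=True)  # diagonal entries are 1, so sums >= 1
    return K @ saliency
```

In a full pipeline, the ray bundle returned by `view_cone_rays` would be intersected with the mesh (e.g. via a BVH raycaster), the Gaussian ray weights accumulated onto the hit vertices as raw fixation saliency, and that signal then smoothed with `hybrid_constrained_diffusion`.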