Rendering realistic human-object interactions (HOIs) from sparse-view inputs is a challenging yet crucial task for various real-world applications. Existing methods often struggle to simultaneously achieve high rendering quality, physical plausibility, and computational efficiency. To address these limitations, we propose HOGS (Human-Object Rendering via 3D Gaussian Splatting), a novel framework for efficient HOI rendering with physically plausible geometric constraints from sparse views. HOGS represents both humans and objects as dynamic 3D Gaussians. Central to HOGS is a novel optimization process that operates directly on these Gaussians to enforce geometric consistency (i.e., preventing inter-penetration or floating contacts) to achieve physical plausibility. To support this core optimization under sparse-view ambiguity, our framework incorporates two pre-trained modules: an optimization-guided Human Pose Refiner for robust estimation under sparse-view occlusions, and a Human-Object Contact Predictor that efficiently identifies interaction regions to guide our novel contact and separation losses. Extensive experiments on both human-object and hand-object interaction datasets demonstrate that HOGS achieves state-of-the-art rendering quality and maintains high computational efficiency.
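As a rough illustration of what contact and separation losses over Gaussians could look like, the NumPy sketch below operates only on Gaussian centers. It is not the HOGS formulation, which the abstract does not specify. The function names, the `margin` value, the boolean `contact_mask` (standing in for the output of a contact predictor), and the nearest-neighbor distance used as a proxy for inter-penetration are all assumptions made for this example.

```python
import numpy as np


def pairwise_distances(a, b):
    """Euclidean distances between every row of a (N, 3) and every row of b (M, 3)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def contact_loss(human_centers, object_centers, contact_mask):
    """Hypothetical contact term: mean distance from each human Gaussian
    flagged as 'in contact' to its nearest object Gaussian.
    Minimizing it pulls flagged regions onto the object (no floating contacts)."""
    if not contact_mask.any():
        return 0.0
    d = pairwise_distances(human_centers[contact_mask], object_centers)
    return float(d.min(axis=1).mean())


def separation_loss(human_centers, object_centers, contact_mask, margin=0.05):
    """Hypothetical separation term: hinge penalty on non-contact human
    Gaussians that come closer than `margin` to any object Gaussian.
    This is only a distance-based proxy for inter-penetration."""
    free = ~contact_mask
    if not free.any():
        return 0.0
    d = pairwise_distances(human_centers[free], object_centers)
    return float(np.maximum(margin - d.min(axis=1), 0.0).mean())


# Toy example: two human Gaussians, two object Gaussians.
human = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
obj = np.array([[0.0, 0.0, 0.1], [1.0, 0.0, 0.02]])
mask = np.array([True, False])  # assumed contact prediction per human Gaussian

l_contact = contact_loss(human, obj, mask)
l_separation = separation_loss(human, obj, mask)
```

In a training setting these terms would be written in a differentiable framework and added, with weights, to the rendering loss. NumPy is used here only to keep the sketch self-contained.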