Our work aims to reconstruct hand-object interactions from a single-view image, which is a fundamental but ill-posed task. Unlike methods that reconstruct from videos, multi-view images, or predefined 3D templates, single-view reconstruction faces significant challenges due to inherent ambiguities and occlusions. These challenges are further amplified by the diverse nature of hand poses and the vast variety of object shapes and sizes. Our key insight is that current foundation models for segmentation, inpainting, and 3D reconstruction generalize robustly to in-the-wild images, and can therefore provide strong visual and geometric priors for reconstructing hand-object interactions. Specifically, given a single image, we first design a novel pipeline that estimates the underlying hand pose and object shape using off-the-shelf large models. Then, starting from this initial reconstruction, we apply a prior-guided optimization scheme that refines the hand pose to comply with 3D physical constraints and the 2D input image content. We perform experiments across several datasets and show that our method consistently outperforms baselines and faithfully reconstructs a diverse set of hand-object interactions. Project page: https://lym29.github.io/EasyHOI-page/
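The prior-guided optimization can be illustrated with a toy sketch. The abstract does not specify the actual losses or hand model, so everything below is a hypothetical stand-in: the hand pose is abstracted to a single fingertip position, the object to a sphere, and the 2D image evidence to a target point under a trivial orthographic projection. The three terms mirror the idea in the text: a penetration penalty and a contact term (3D physical constraints) plus a reprojection term (2D image content).

```python
import numpy as np

# Hypothetical scene: the "object" is a unit sphere at the origin, and the
# "hand pose" is reduced to one fingertip position in 3D.
OBJ_CENTER = np.array([0.0, 0.0, 0.0])
OBJ_RADIUS = 1.0
TARGET_2D = np.array([0.9, 0.2])  # assumed 2D image evidence (toy orthographic projection)

def loss(p):
    d = np.linalg.norm(p - OBJ_CENTER)
    penetration = max(0.0, OBJ_RADIUS - d) ** 2   # 3D constraint: do not sink into the object
    contact = (d - OBJ_RADIUS) ** 2               # 3D prior: encourage touching the surface
    reproj = np.sum((p[:2] - TARGET_2D) ** 2)     # 2D term: match the observed image location
    return 10.0 * penetration + contact + 0.5 * reproj

def optimize(p, steps=200, lr=0.05, eps=1e-5):
    # Plain gradient descent with finite-difference gradients, for clarity.
    for _ in range(steps):
        g = np.zeros_like(p)
        for i in range(len(p)):
            dp = np.zeros_like(p)
            dp[i] = eps
            g[i] = (loss(p + dp) - loss(p - dp)) / (2 * eps)
        p = p - lr * g
    return p

p0 = np.array([2.0, 2.0, 2.0])   # initial reconstruction, far from the object
p_opt = optimize(p0)
```

After optimization the fingertip settles near the sphere surface while its image-plane position moves toward the 2D target, illustrating how 3D and 2D terms are balanced in a single objective; the real system would optimize full articulated hand parameters with differentiable rendering rather than finite differences.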