Robotic grasping from single-view observations remains a critical challenge in manipulation: with incomplete geometric information, existing methods still struggle to generate reliable grasp candidates and to evaluate grasp feasibility stably. To address these limitations, we present SuperGrasp, a two-stage framework for single-view parallel-jaw grasping. In the first stage, a Similarity Matching Module efficiently retrieves valid and diverse grasp candidates by matching the input single-view point cloud against a precomputed primitive dataset based on superquadric coefficients. In the second stage, we propose E-RNet, an end-to-end network that expands the grasp-aware region and treats the initial grasp closure region as a local anchor, capturing the contextual relationship between that region and its surrounding spatial neighborhood. This enables more accurate and reliable grasp evaluation, and E-RNet additionally applies small-range local refinement to improve grasp adaptability. To enhance generalization, we construct a primitive dataset of 1.2k standard geometric primitives for similarity matching and collect a point cloud dataset of 100k samples from 124 objects, annotated with stable grasp labels, for network training. Extensive experiments in both simulation and real-world environments demonstrate that our method achieves stable grasping performance and generalizes well to novel objects and cluttered scenes.
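As an illustration of the first-stage idea, the retrieval step can be sketched as a nearest-neighbor lookup in superquadric coefficient space. This is a minimal sketch and not the authors' implementation: it assumes each primitive is summarized by a five-element vector (three scale parameters `a1, a2, a3` and two shape exponents `e1, e2`, a common superquadric parameterization), that coefficients have already been fitted to the input point cloud, and that a weighted Euclidean distance is an acceptable similarity measure. The function name `retrieve_candidates`, the `weights` argument, and the per-primitive grasp lists are all hypothetical.

```python
import numpy as np


def retrieve_candidates(query, primitives, grasps_per_primitive, k=3, weights=None):
    """Return the k primitives nearest to `query` and their stored grasps.

    query: (5,) superquadric coefficients [a1, a2, a3, e1, e2] assumed to be
        fitted to the single-view point cloud (fitting is out of scope here).
    primitives: (N, 5) coefficients of the precomputed primitives.
    grasps_per_primitive: list of N lists of grasps (encoding is hypothetical).
    weights: optional (5,) per-coefficient weights, e.g. to balance scale
        parameters (metres) against dimensionless shape exponents.
    """
    query = np.asarray(query, dtype=float)
    primitives = np.asarray(primitives, dtype=float)
    if weights is None:
        weights = np.ones(primitives.shape[1])

    # Weighted Euclidean distance between the query and every primitive.
    dist = np.linalg.norm((primitives - query) * weights, axis=1)

    # Indices of the k closest primitives, nearest first.
    k = min(k, len(primitives))
    nearest = np.argsort(dist)[:k]

    # Pool the grasps annotated on the matched primitives as candidates.
    candidates = [g for i in nearest for g in grasps_per_primitive[i]]
    return nearest, candidates
```

In a full pipeline, the pooled candidates would still need to be transformed into the scene frame and passed to a second-stage evaluator such as E-RNet; neither step is shown here.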