Keypoint detection, integral to modern machine perception, struggles in few-shot settings, particularly when no source data from the same distribution as the query is available. We address this gap by using sketches, a popular form of human expression, as a source-free alternative. Doing so raises two challenges: learning cross-modal embeddings and handling user-specific sketch styles. Our framework addresses both with a prototypical setup combined with a grid-based locator and prototypical domain adaptation. Extensive experiments demonstrate few-shot convergence across novel keypoints and classes.
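To make the prototypical setup and grid-based locator more concrete, the following is a minimal sketch of the general idea, not the framework's actual implementation. It assumes that some encoder (not shown) has already mapped the annotated sketch keypoints and the query image into a shared feature space; the function names `mean_prototype`, `cosine`, and `locate_on_grid` are hypothetical, and the cross-modal training and prototypical domain adaptation steps are omitted entirely.

```python
import math

def mean_prototype(support_features):
    """Average equal-length support feature vectors into one keypoint prototype."""
    n = len(support_features)
    dim = len(support_features[0])
    return [sum(f[d] for f in support_features) / n for d in range(dim)]

def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0

def locate_on_grid(prototype, query_grid):
    """Return ((row, col), score) for the grid cell most similar to the prototype.

    query_grid is a 2D list of feature vectors, one per spatial cell
    (an assumed layout, chosen only for illustration).
    """
    best_cell, best_score = None, -math.inf
    for r, row in enumerate(query_grid):
        for c, feat in enumerate(row):
            score = cosine(prototype, feat)
            if score > best_score:
                best_cell, best_score = (r, c), score
    return best_cell, best_score

# Toy data: two support features for one keypoint, and a 2x2 query feature grid.
support = [[1.0, 0.0], [0.8, 0.2]]
grid = [
    [[0.0, 1.0], [1.0, 1.0]],
    [[1.0, 0.1], [-1.0, 0.0]],
]
prototype = mean_prototype(support)
cell, score = locate_on_grid(prototype, grid)  # cell == (1, 0)
```

Here the keypoint is assigned to the grid cell whose feature is closest in direction to the averaged support feature. A finer localization within the chosen cell, and any adaptation to user-specific sketch styles, would need additional components beyond this sketch.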