People grasp flexible visual concepts from a few examples. We explore a neurosymbolic system that learns how to infer programs that capture visual concepts in a domain-general fashion. We introduce Template Programs: programmatic expressions from a domain-specific language that specify structural and parametric patterns common to an input concept. Our framework supports multiple concept-related tasks, including few-shot generation and co-segmentation through parsing. We develop a learning paradigm that allows us to train networks that infer Template Programs directly from visual datasets that contain concept groupings. We run experiments across multiple visual domains: 2D layouts, Omniglot characters, and 3D shapes. We find that our method outperforms task-specific alternatives, and performs competitively against domain-specific approaches for the limited domains where they exist.