People grasp flexible visual concepts from a few examples. We explore a neurosymbolic system that learns to infer programs that capture visual concepts in a domain-general fashion. We introduce Template Programs: programmatic expressions from a domain-specific language that specify structural and parametric patterns common to an input concept. Our framework supports multiple concept-related tasks, including few-shot generation and co-segmentation through parsing. We develop a learning paradigm that lets us train networks to infer Template Programs directly from visual datasets that contain concept groupings. We run experiments across multiple visual domains: 2D layouts, Omniglot characters, and 3D shapes. We find that our method outperforms task-specific alternatives and performs competitively against domain-specific approaches in the limited domains where such approaches exist.
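To make the idea of a Template Program concrete, the following is a minimal toy sketch, not the DSL or inference network used in this work. It assumes a hypothetical one-primitive layout language (`Rect`) in which a template fixes the structure shared by a concept and leaves some parameters as named holes (`Hole`) that each instance fills in. All names here (`Hole`, `Rect`, `instantiate`) are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Hole:
    """A free parameter that each instance of the concept fills in (hypothetical)."""
    name: str


Param = Union[float, Hole]


@dataclass(frozen=True)
class Rect:
    """One primitive call in the toy DSL: a rectangle at (x, y) with size (w, h)."""
    x: Param
    y: Param
    w: Param
    h: Param


def instantiate(template, values):
    """Fill every Hole in the template with a concrete value for one instance."""
    def fill(p):
        return values[p.name] if isinstance(p, Hole) else p
    return [Rect(fill(r.x), fill(r.y), fill(r.w), fill(r.h)) for r in template]


# Structural pattern shared by the concept: a base rectangle with a second
# rectangle stacked on top. Parametric pattern: both share one width, and the
# top rectangle starts where the base ends. Heights and width vary per instance.
template = [
    Rect(0.0, 0.0, Hole("width"), Hole("base_h")),
    Rect(0.0, Hole("base_h"), Hole("width"), Hole("top_h")),
]

# Two instances of the same concept (few-shot generation would sample more).
a = instantiate(template, {"width": 2.0, "base_h": 1.0, "top_h": 0.5})
b = instantiate(template, {"width": 3.0, "base_h": 0.5, "top_h": 2.0})

# Parts at the same template index correspond across instances, which is the
# intuition behind co-segmentation through parsing.
for index, (part_a, part_b) in enumerate(zip(a, b)):
    print(index, part_a, part_b)
```

In this sketch, generating a new member of the concept amounts to choosing new values for the holes, while the template index of each primitive gives a consistent part label across all instances.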