Endowing machines with abstract reasoning ability has been a long-standing research topic in artificial intelligence. Raven's Progressive Matrices (RPM) are widely used to probe abstract visual reasoning in machine intelligence: a model must understand the underlying rules and select the missing bottom-right image from a candidate set to complete an image matrix. Human participants can display powerful reasoning ability by inferring the underlying attribute-changing rules and imagining the missing images at arbitrary positions. However, existing solvers can hardly exhibit such an ability on realistic RPM problems. In this paper, we propose a conditional generative model that solves answer generation problems through Rule AbstractIon and SElection (RAISE) in the latent space. RAISE encodes image attributes as latent concepts and decomposes the underlying rules into atomic rules over these concepts; the atomic rules are abstracted as global learnable parameters. When generating an answer, RAISE selects the appropriate atomic rule for each concept from the global knowledge set and composes the selected rules into the integrated rule of the RPM. In most configurations, RAISE outperforms the compared generative solvers on the tasks of generating bottom-right and arbitrary-position answers. We also test RAISE on the odd-one-out task and on two held-out configurations to demonstrate how learning decoupled latent concepts and atomic rules helps find the image that breaks the underlying rules and handle RPMs with unseen combinations of rules and attributes.
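The idea of selecting one atomic rule per concept and composing the selections into a full rule can be illustrated with a toy sketch. This is not the RAISE model: the concepts here are hand-written scalars rather than learned latent variables, the three atomic rules in `ATOMIC_RULES` are hypothetical fixed functions rather than learnable parameters, and selection is a simple best-fit search over the two complete rows.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical atomic rules (an assumption for illustration only):
# each maps the first two values of a row to the third.
ATOMIC_RULES: Dict[str, Callable[[float, float], float]] = {
    "constant": lambda a, b: b,
    "progression": lambda a, b: b + (b - a),
    "arithmetic_sum": lambda a, b: a + b,
}

Row = List[Optional[float]]


def select_rule(context_rows: List[Row]) -> str:
    """Return the atomic rule with the smallest squared error on complete rows.

    Ties go to the rule listed first in ATOMIC_RULES.
    """
    def error(name: str) -> float:
        rule = ATOMIC_RULES[name]
        return sum((rule(a, b) - c) ** 2 for a, b, c in context_rows)

    return min(ATOMIC_RULES, key=error)


def generate_answer(matrix: Dict[str, List[Row]]) -> Dict[str, Tuple[str, float]]:
    """Fill the missing bottom-right value of each concept independently.

    matrix maps a concept name to a 3x3 grid whose entry [2][2] is None.
    """
    answer = {}
    for concept, rows in matrix.items():
        name = select_rule(rows[:2])          # choose a rule from the two complete rows
        a, b = rows[2][0], rows[2][1]         # context values in the incomplete row
        answer[concept] = (name, ATOMIC_RULES[name](a, b))
    return answer


toy_rpm = {
    "size": [[1, 2, 3], [2, 3, 4], [3, 4, None]],
    "color": [[4, 4, 4], [7, 7, 7], [2, 2, None]],
    "number": [[1, 2, 3], [2, 2, 4], [1, 3, None]],
}
result = generate_answer(toy_rpm)
```

Because each concept picks its rule independently, a combination of rules and concepts that never appeared together before can still be handled, which is the intuition behind the held-out configurations mentioned in the abstract.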