Abstract reasoning poses significant challenges to artificial intelligence algorithms, demanding cognitive abilities beyond those required for perceptual tasks. In this study, we introduce the Cross-Feature Network (CFN), a novel framework designed to extract concepts and features from images separately. The framework uses the responses of features to concepts as representations for reasoning, particularly for the Bongard-Logo problem. By integrating an Expectation-Maximization process between the extracted concepts and features within the CFN, we achieve notable results, albeit with certain limitations. To overcome these limitations, we propose the Triple-CFN, an efficient model that maximizes feature extraction from images and is effective on both the Bongard-Logo and Raven's Progressive Matrices (RPM) problems. Furthermore, we introduce Meta Triple-CFN, an advanced version of Triple-CFN that explicitly constructs a concept space tailored to RPM problems, ensuring both high reasoning accuracy and interpretability of the concepts involved. Overall, this work explores innovative network designs for abstract reasoning, thereby advancing the frontiers of machine intelligence.